Closed: hhamud closed this issue 1 year ago.
This will be solved by #49
I specifically want a lightweight CLI version with a sliding window for context building. Maybe he could rip that part out of his PR, if he's built that portion, and integrate it into the library? @karelnagel
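To be concrete, the kind of windowing I have in mind looks roughly like the sketch below. This is not llama-rs code; the `u32` token IDs, the 2048 limit, and the function names are just assumptions for illustration.

```rust
/// A minimal sketch of a sliding context window, assuming tokens are plain
/// `u32` IDs and the model's context limit is 2048 tokens. Names here are
/// illustrative only, not llama-rs API.
const CONTEXT_LIMIT: usize = 2048;

/// Keep a fixed prefix (e.g. the initial prompt) and fill the rest of the
/// window with the most recent tokens, dropping the oldest ones in between.
fn sliding_window(prefix: &[u32], history: &[u32]) -> Vec<u32> {
    let budget = CONTEXT_LIMIT.saturating_sub(prefix.len());
    let start = history.len().saturating_sub(budget);
    let mut window = Vec::with_capacity(CONTEXT_LIMIT);
    window.extend_from_slice(prefix);
    window.extend_from_slice(&history[start..]);
    window
}

fn main() {
    // Toy example: a 4-token prefix plus a long history gets truncated
    // down to the most recent tokens that still fit in the window.
    let prefix = vec![1, 2, 3, 4];
    let history: Vec<u32> = (0..5000).collect();
    let window = sliding_window(&prefix, &history);
    assert_eq!(window.len(), CONTEXT_LIMIT);
    println!("window holds {} tokens", window.len());
}
```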
In my app I just keep the model running, so it has all the previous context for that session. I think it should be quite easy to implement the same logic in the CLI as well with some kind of flag. I basically just stole that logic from this PR https://github.com/setzer22/llama-rs/pull/37. 😀
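Roughly, the loop I have in mind looks like this. It's only a sketch of the context-keeping part: the inference call is stubbed out, and nothing here is real llama-rs API.

```rust
use std::io::{self, BufRead, Write};

// A rough sketch of a session loop for the CLI: the conversation so far is
// kept in memory and prepended to every new turn, so the model always sees
// the full session context. The actual inference call is stubbed out.
fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let mut session_context = String::new();

    loop {
        print!("> ");
        io::stdout().flush()?;

        let mut line = String::new();
        if stdin.lock().read_line(&mut line)? == 0 {
            break; // EOF ends the session
        }

        // Accumulate the user's turn into the running context.
        session_context.push_str(&line);

        // Here the full `session_context` would be tokenized and fed to the
        // model; the reply would then be appended to the context as well.
        let reply = format!("(model reply to {} chars of context)", session_context.len());
        println!("{reply}");
        session_context.push_str(&reply);
        session_context.push('\n');
    }
    Ok(())
}
```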
As mentioned in the Discord conversation, the real challenge here is extending the context window beyond the current cap of 2048 tokens. But in the meantime, a chat application with a hard cap of 2048 tokens would be interesting to have (see the sketch at the end of this comment).
I've tried several experiments to achieve a sliding context window, but none of them have given good results so far. So for now, I'm trying to build a better understanding of the underlying transformer architecture before I attempt this again.
Other ways of achieving a larger context window would be training a larger model from scratch (infeasible) or fine-tuning an existing model to work with a larger context window. I'm not entirely sure about the effectiveness of the latter strategy.
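For the stop-gap chat application with a hard cap, the bookkeeping itself is simple; below is a rough sketch. The whitespace-based token count and the function names are placeholders, not llama-rs API; a real version would use the model's tokenizer.

```rust
// A sketch of the "hard cap" chat behaviour mentioned above: once the
// conversation would exceed the model's 2048-token context, stop accepting
// new turns instead of silently truncating. Token counting is stubbed with
// a whitespace split purely for illustration.
const CONTEXT_LIMIT: usize = 2048;

fn approx_token_count(text: &str) -> usize {
    text.split_whitespace().count()
}

fn try_add_turn(conversation: &mut String, turn: &str) -> Result<(), String> {
    let projected = approx_token_count(conversation) + approx_token_count(turn);
    if projected > CONTEXT_LIMIT {
        return Err(format!(
            "context limit of {CONTEXT_LIMIT} tokens reached ({projected} projected)"
        ));
    }
    conversation.push_str(turn);
    conversation.push('\n');
    Ok(())
}

fn main() {
    let mut conversation = String::new();
    assert!(try_add_turn(&mut conversation, "hello there").is_ok());
    // Keep adding turns until the cap is hit, then stop.
    while try_add_turn(&mut conversation, "another fairly short message").is_ok() {}
    println!("stopped at ~{} tokens", approx_token_count(&conversation));
}
```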
This is related to #53 and #76.
This now exists in the CLI, but it doesn't have a sliding window. Closing this issue and moving discussion to #77.
Creating an issue here following the discussion in the Discord chat.