Closed by Anto79-ops 7 months ago
Feel free to open a pull request and I'll take a look.
I was looking into supporting techniques like RoPE scaling and "self extend" to extend the context size. Both are supported by llama.cpp, along with simple sliding-window truncation, and they just need to be wired up.
If you're willing to debug a little: the commits on my fork are not working 100%, and there were no logs, so it could be the config entry directly that is causing the issue below.
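For anyone curious, the RoPE-scaling idea mentioned above can be sketched in a few lines. This is a minimal, hypothetical Python illustration of linear RoPE scaling ("position interpolation"), not the llama.cpp implementation: positions are divided by a scale factor so that longer sequences are compressed into the position range the model was trained on.

```python
import math

def rope_angles(position, dim=8, base=10000.0, scale=1.0):
    """Rotary embedding angles for a single token position.

    Linear RoPE scaling divides the position by `scale`, so a
    model trained on n_ctx tokens can attend over scale * n_ctx
    positions while seeing "familiar" angle values.
    """
    pos = position / scale
    return [pos / (base ** (2 * i / dim)) for i in range(dim // 2)]

# With scale=2, position 4096 produces the same angles as
# position 2048 unscaled.
assert rope_angles(4096, scale=2.0) == rope_angles(2048, scale=1.0)
```

The "self extend" technique works differently (grouped attention over distant positions), but the common thread is reinterpreting positions so the pretrained context window stretches further.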
@Anto79-ops that's my janky code, I know what's wrong and will try to fix it
hey,
We think the issue of getting blank messages stems from the fact that old conversations are not purged, so the context limit gets used up rather quickly. This could explain why my requests work the first 1 or 2 times, but then things become flaky after asking a few questions.
Perhaps add an option to delete old messages on every new message request, or only keep 1, for example?
thanks
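One way to implement that suggestion, sketched in Python (the function name and message format are hypothetical illustrations, not part of home-llm): keep the system prompt and only the most recent N conversation messages when building the prompt, so old turns can't exhaust the context window.

```python
def prune_history(messages, keep_last=2):
    """Keep the system prompt plus the most recent `keep_last`
    user/assistant messages, dropping older turns so the context
    limit isn't consumed by stale conversation."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]

# Example: a short conversation pruned to the last 2 turns.
history = [
    {"role": "system", "content": "You are a home assistant."},
    {"role": "user", "content": "turn on the lights"},
    {"role": "assistant", "content": "done"},
    {"role": "user", "content": "status?"},
]
pruned = prune_history(history, keep_last=2)
assert pruned[0]["role"] == "system"
assert len(pruned) == 3
```

A fancier version would count tokens instead of messages, but even a fixed message cap would avoid the blank-response failure mode described above.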
Check out the fork from @lunamidori5 here: https://github.com/Anto79-ops/home-llm