edgenai / llama_cpp-rs

High-level, optionally asynchronous Rust bindings to llama.cpp
Apache License 2.0

Implement context overflow mitigation methods #36

Open pedro-devv opened 8 months ago

pedro-devv commented 8 months ago

At the moment, when the max context size is reached, the program panics. Mitigation methods such as a sliding context window should be implemented.
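A sliding context window could look something like the sketch below: once the token buffer would exceed the context size, evict the oldest tokens while preserving an initial prefix (e.g. the system prompt). This is a hypothetical illustration of the idea, not llama_cpp-rs API; `slide_window`, the `u32` token type, and the sizes are all placeholders.

```rust
/// Hypothetical sliding-window eviction: keep the first `keep_prefix`
/// tokens (e.g. a system prompt) and drop the oldest tokens after them
/// until the buffer fits within `ctx_size`.
fn slide_window(tokens: &mut Vec<u32>, ctx_size: usize, keep_prefix: usize) {
    if tokens.len() <= ctx_size {
        return; // still fits, nothing to evict
    }
    // Evict exactly enough tokens, starting just after the kept prefix.
    let overflow = tokens.len() - ctx_size;
    tokens.drain(keep_prefix..keep_prefix + overflow);
}

fn main() {
    let mut tokens: Vec<u32> = (0..10).collect();
    // Context of 8 tokens, keep the first 2 as an immutable prefix.
    slide_window(&mut tokens, 8, 2);
    assert_eq!(tokens, vec![0, 1, 4, 5, 6, 7, 8, 9]);
    println!("{:?}", tokens);
}
```

In a real integration the evicted tokens would also need to be removed from llama.cpp's KV cache so the remaining cache entries stay consistent with the trimmed buffer.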

ElhamAryanpur commented 7 months ago

I didn't notice the panic on my end, using both streaming and non-streaming methods.

pedro-devv commented 7 months ago

I said panic, but that's not exactly right: what actually happens is an error on llama.cpp's side, though I forget whether it's an exception or something else. Either way, it depends on how large you make your context/session. If you make it large enough and never use the same session for too long, you probably won't ever notice it, but it definitely still happens.

ElhamAryanpur commented 7 months ago

I see, I was using a context length of ~2048, so I didn't notice it at all.