turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

After updating exllamav2 to 0.1.0, text generated by exui is no longer pushed word by word #490

Closed: xldistance closed this issue 2 months ago

xldistance commented 3 months ago

Rolling back to version 0.0.21, exui generates text normally: it streams the response as it is produced instead of waiting until the entire response is generated before pushing it out.
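
For reference, the expected behavior is the token-by-token streaming of exllamav2's streaming generator. Below is a minimal sketch of that loop, following the library's streaming example; the model path is a placeholder, and exact signatures may vary slightly between versions:

```python
# Minimal sketch of the streaming loop exui relies on, based on
# exllamav2's streaming example. Model directory is a placeholder.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2StreamingGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/path/to/model"  # placeholder
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2StreamingGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()

input_ids = tokenizer.encode("Once upon a time")
generator.begin_stream(input_ids, settings)

# Each chunk should appear (and, in exui, be pushed to the browser) as soon
# as it is decoded, not after the whole response is finished.
for _ in range(200):  # cap on new tokens
    chunk, eos, _ = generator.stream()
    print(chunk, end="", flush=True)
    if eos:
        break
```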

turboderp commented 3 months ago

Have you tried later versions? The current version is 0.1.4. Not that I expect any changes since 0.1.0 would fix an issue like this, because the code path used by ExUI has been largely untouched since 0.0.21.

Is it possible this is due to some other library being updated along with ExLlama? Maybe flask or waitress?
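
For context, exui streams tokens over HTTP roughly like the hypothetical Flask endpoint sketched below (illustrative only, not exui's actual code). If the WSGI server or an intermediate layer buffers the response body, every chunk arrives at once, which matches the reported symptom:

```python
# Hypothetical Flask endpoint showing the chunked-streaming pattern;
# this is not exui's actual code, just the mechanism it depends on.
from flask import Flask, Response, stream_with_context

app = Flask(__name__)

@app.route("/generate")
def generate():
    def token_stream():
        # Stand-in for the exllamav2 streaming loop: yield text as produced.
        for chunk in ["Hello", ", ", "world", "!"]:
            yield chunk  # each yield should reach the client immediately

    # If the WSGI server (e.g. waitress) or a proxy buffers the response,
    # the client sees nothing until generation finishes -- the behavior
    # described in this issue.
    return Response(stream_with_context(token_stream()),
                    mimetype="text/plain")

if __name__ == "__main__":
    # In production this might be served via waitress.serve(app, ...);
    # buffering behavior can differ between server versions.
    app.run(threaded=True)
```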

turboderp commented 2 months ago

Closing this as stale. Feel free to reopen if the problem persists.