LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

When the backend crashes halfway through generation, the WebUI deletes the text that has already been generated and streamed. #499

Open WingFoxie opened 1 year ago

WingFoxie commented 1 year ago

Currently, if I generate more text and the backend (I mean that cmd window) somehow crashes during generation, an error like this one appears:

Error Encountered
Error while submitting prompt: TypeError: NetworkError when attempting to fetch resource.

But I had "token streaming" on! And the text that's already been streamed to the WebUI gets DELETED as well?? Please don't delete it! At least provide an option to preserve it!

Currently I just have to act fast and try to copy the text somewhere else before it's removed. Edit: Or I can just turn "token streaming" off for now, turn down the "Amount to Gen." from 100 to just 50, and try to carry on the conversation that way. I can live with it.

LostRuins commented 11 months ago

Does this issue still happen?

WingFoxie commented 11 months ago

It does not happen anymore, but only because I don't experience crashes anymore. The behavior when it does crash stays the same.

Details:

If I intentionally close the console window while a response is being generated, simulating a "crash", the WebUI still drops whatever text has already been streamed into it. **By the time the error message pops up, the text is already deleted.** (Not sure why it keeps streaming for a few seconds after the "Windows Terminal" window is already gone, though.) (Tried v1.51.1 and v1.52, with Token Streaming set to both "On" and "SSE". Same result.)

It's just that I no longer experience crashes in the first place, so it's a "No" for me. The backend only crashed multiple times a day during those few unfortunate days around the time I opened this issue. I don't know why it crashed so often for a few days, and I don't know how it got fixed. (I didn't update Windows, the display driver, or KoboldCPP during that time. I didn't even reboot, yet one day it simply stopped crashing.) (It can't be that a Windows Terminal update fixed it, right?)

LostRuins commented 11 months ago

Yeah, the reason the text disappears is that it's only saved to file once streaming is completed or aborted.
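The behavior described above (partial text is discarded when the stream dies) could be avoided by accumulating tokens as they arrive and keeping whatever was received when the connection fails. This is a minimal sketch of that idea in Python, not KoboldCpp's actual client code; `consume_stream` and `crashy_stream` are hypothetical names:

```python
def consume_stream(token_iter):
    """Collect tokens from a stream; on failure, return what arrived so far."""
    parts = []
    try:
        for token in token_iter:
            parts.append(token)
    except (ConnectionError, OSError):
        # Backend crashed mid-generation: preserve the partial text
        # instead of raising and losing it.
        pass
    return "".join(parts)

def crashy_stream():
    # Simulates a backend that dies halfway through generation.
    yield "Hello, "
    yield "wor"
    raise ConnectionError("backend process died")

print(consume_stream(crashy_stream()))  # prints "Hello, wor" — partial text survives
```

The key design choice is that the error is swallowed after the partial text is secured, so the UI can both show an error message and keep the already-streamed output.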

e576082c commented 9 months ago

I also experienced this issue, in the following way:

  1. Run a Mistral-7B model, with 8K context size.
  2. Set up some scenario, in story mode, roughly using 512 tokens.
  3. Set max generation length to 7K tokens. (Edit the field in the settings manually, since the slider only goes up to 512.)
  4. Hit generate and go have a coffee.
  5. Return back and observe the phenomenon:

Looking at the terminal output, notice that the generation got interrupted halfway and then restarted automatically to continue.

Looking at the Kobold Lite UI in the browser, notice that the first part of the generated text (visible in the terminal) was completely deleted and replaced with the second part of the generated text.
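The repro above boils down to a single oversized generate request. As a rough sketch, this is what such a request payload might look like against the KoboldAI-compatible HTTP API that koboldcpp exposes (the endpoint path and field names are assumptions and may differ between versions):

```python
import json

# Hypothetical payload mirroring the repro steps: 8K context, ~512-token
# scenario prompt, and a max generation length far past the 512 slider cap.
payload = {
    "prompt": "<<your ~512-token story-mode scenario here>>",
    "max_context_length": 8192,  # step 1: 8K context size
    "max_length": 7168,          # step 3: 7K-token generation
}

print(json.dumps(payload, indent=2))
# POST this to the generate endpoint (e.g. http://localhost:5001/api/v1/generate)
# and watch the terminal: if generation is interrupted and restarted partway,
# the UI ends up showing only the second part of the output.
```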

LostRuins commented 9 months ago

Yeah, maybe the connection timed out. Try using SSE streaming.
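For reference, SSE streaming delivers tokens as `data:` events over a long-lived HTTP response, which is more robust against idle timeouts than waiting for one large reply. Below is a small sketch of parsing such a stream on the client side; the `{"token": "..."}` event shape and the `parse_sse_tokens` helper are assumptions for illustration, not koboldcpp's verified wire format:

```python
import json

def parse_sse_tokens(lines):
    """Yield token strings from raw SSE lines of the form data: {"token": ...}."""
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            event = json.loads(line[len("data:"):].strip())
            yield event.get("token", "")

# Example raw SSE lines as they might arrive over the wire.
sample = [
    'event: message',
    'data: {"token": "Hel"}',
    '',
    'data: {"token": "lo"}',
]

print("".join(parse_sse_tokens(sample)))  # prints "Hello"
```

Because each token arrives as its own event, a client written this way can keep everything parsed so far even if the connection drops mid-stream.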