nomic-ai / gpt4all

GPT4All: Chat with Local LLMs on Any Device
https://gpt4all.io
MIT License

Chat gets stuck processing markdown with certain prompts in 3.0.0 #2519

Closed: brankoradovanovic-mcom closed this 2 days ago

brankoradovanovic-mcom commented 1 week ago

Bug Report

When one enters certain prompts (for instance, the text from https://users.ece.cmu.edu/~gamvrosi/thelastq.html), the chat gets stuck: there is apparently some processing going on, but it never produces a response.

--- UPDATE:

I've confirmed this is caused by an issue processing markdown in the ChatViewTextProcessor class. When we attempt to insert this text as markdown into the QTextDocument, it causes a never-ending loop of QTextDocument::contentsChanged signals to be emitted.
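
To illustrate the failure mode, here's a minimal Qt/C++ sketch (not the actual ChatViewTextProcessor code; the `rewriting` guard is an assumed mitigation, added only so the example terminates). A handler connected to QTextDocument::contentsChanged that rewrites the document as markdown re-triggers the same signal:

```cpp
#include <QGuiApplication>
#include <QTextDocument>

int main(int argc, char *argv[])
{
    QGuiApplication app(argc, argv);

    QTextDocument doc;
    bool rewriting = false; // reentrancy guard; without it the handler recurses forever

    QObject::connect(&doc, &QTextDocument::contentsChanged, [&] {
        if (rewriting)
            return;      // break the feedback loop
        rewriting = true;
        // Rewriting the document from inside the handler mutates it,
        // which emits contentsChanged again -- the never-ending loop.
        doc.setMarkdown(doc.toMarkdown());
        rewriting = false;
    });

    doc.setMarkdown(QStringLiteral("some **markdown** text"));
    return 0;
}
```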

cosmic-snow commented 1 week ago

There's a setting for Context Length, and the default is 2048 tokens. If you have enough RAM/VRAM, you can increase that.

Note: whether a model can handle more than that also depends on how it was trained. For some, capabilities deteriorate quickly at higher context lengths. You may need to inform yourself first (model card, description, release announcement, or similar).

I don't exactly know why you'd say it worked before version 3.0.0. My understanding is that it would crash if the prompt exceeded the context (I might be wrong there, however; I've always avoided that).

Edit: Did a test with a reduced context length (note: not 4k as in the name I gave it), and this is what you should be seeing as a response: [screenshot]

I'd say you need to provide more information about what you're doing to run into that problem.

manyoso commented 1 week ago

Any chance you can upload/attach your actual 3000+ prompt? Any chance it is JSON? @cebtenzzre this could be the hang we saw with that particular JSON as a prompt...

brankoradovanovic-mcom commented 1 week ago

Here's a way to reproduce it:

  1. Pick any model that supports at least 8k context (e.g. openchat-3.6-8b-20240522-Q5_K_M.gguf, or any of the L3-8B derivatives)
  2. Make sure config says context length is set to 8192 (it is)
  3. Go to https://users.ece.cmu.edu/~gamvrosi/thelastq.html and C/P the entire text
  4. Notepad++ says it's 4803 words, i.e. definitely below the 8k token limit, so it's expected to work (see the rough token estimate after these steps). All prose, no markup.
  5. C/P the content into chat and press enter
  6. ...and it's stuck
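
Aside on step 4: words and tokens aren't the same unit, so "4803 words < 8k tokens" deserves a sanity check. Assuming roughly 1.3 tokens per English word (a common heuristic, not an exact tokenizer count), the prompt should indeed fit:

```cpp
#include <cstdio>

int main()
{
    const int words = 4803;           // word count reported by Notepad++
    const double tokensPerWord = 1.3; // assumed average for English prose; varies by model/tokenizer
    const int contextLimit = 8192;

    const int estimatedTokens = static_cast<int>(words * tokensPerWord); // ~6243
    std::printf("~%d tokens vs. a %d-token context -> %s\n",
                estimatedTokens, contextLimit,
                estimatedTokens < contextLimit ? "fits" : "too long");
    return 0;
}
```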

Tested on two machines with multiple models; it does not work for me, whereas I had no trouble with such prompts before.

(One slight correction: the CPU does not go to zero eventually, it drops to 100% of 1 core.)

brankoradovanovic-mcom commented 1 week ago

> Edit: Did a test with a reduced context length (note: not 4k as in the name I gave it), and this is what you should be seeing as a response: [screenshot]

Now this is interesting... By chance, I had Phi-3 Mini Instruct installed. I keep its context length set at 2048 (because it actually doesn't work right with a 4k context). However, when I C/P the text from the URL I provided above, I don't get an error message like yours, although I should have. Again, the chat just gets stuck.

cosmic-snow commented 1 week ago

Can you try that again, but in the settings, change the device to CPU beforehand (I'm assuming it isn't at the moment)?

Although it's odd, because for me, even with the default of 2048 and picking any device setting, I'm still getting that error message immediately when pasting your text.

brankoradovanovic-mcom commented 1 week ago

The device was set to Auto (which defaulted to CPU). After setting it to CPU, I get the same behavior as before.

Here is what it looks like after copy-pasting and submitting the large prompt. Note the prompt is not displayed for some reason.

[Screenshot 2024-07-03 094711]

cosmic-snow commented 1 week ago

That's odd. I downloaded the model which is shown in that screenshot, or at least one with the same name.

I've run it with 2048 and I get the error message, as expected.

With a context length of 8192 it starts processing. Takes a while and then responds: [screenshot]

manyoso commented 1 week ago

@brankoradovanovic-mcom The bug you're running into has nothing to do with context length. It is a bug in how we handle markdown for certain prompt inputs, which causes the app to go into a recursive loop processing the markdown.

Changing the title of the bug to reflect this.
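
For context, one generic way to break this kind of signal recursion in Qt is to suppress the document's signals while rewriting it, so the rewrite can't re-trigger the handler. This is only a sketch (setMarkdownQuietly is a hypothetical helper, not GPT4All code, and not necessarily the fix the team has planned):

```cpp
#include <QSignalBlocker>
#include <QString>
#include <QTextDocument>

// Hypothetical helper, not from the GPT4All codebase.
void setMarkdownQuietly(QTextDocument *doc, const QString &markdown)
{
    const QSignalBlocker blocker(doc); // contentsChanged is suppressed in this scope
    doc->setMarkdown(markdown);
}   // blocker's destructor restores signal delivery
```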

manyoso commented 1 week ago

@brankoradovanovic-mcom thank you for reporting this. We've seen this on one other prompt, but now we have a different one as well. We don't think it is actually the length of the prompt, but something about the shape of the prompt that causes this.

Regardless, we have a plan to fix it, but it didn't make it into this latest release. We'll be working to fix this in an upcoming release.