simonw / llm-gpt4all

Plugin for LLM adding support for the GPT4All collection of models

Model output stops before the answer is complete #26

Open · boldandbusted opened this issue 6 months ago

boldandbusted commented 6 months ago

Howdy. I am running `llm chat -m mistral-7b-instruct-v0` on my underpowered laptop, so I completely expect slow responses from most models. The model works, but it abruptly stops before completing long answers. I can continue the answer by sending `Continue.` as the next prompt, but it often skips characters. Is there a setting, like a timeout, that I need to change to allow the model to finish its answer? And if so, can I set it to never time out?

Apologies in advance if this is a question that is out of scope for llm itself. Thanks.

bibz commented 5 months ago

I think @simonw has created a duplicate of this (0:-)) in #31

Try increasing the maximum number of output tokens with `-o max_tokens SOME_NUMBER`. According to that issue, the default is 200, so try something bigger. I don't know what the actual upper limit is, but 2000 worked for me with Llama 3 8B Instruct.
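
For example, a minimal sketch based on the commands in this thread, assuming `llm`'s standard `-o KEY VALUE` option syntax (2000 is just the value that worked for me above, not a documented limit):

```bash
# Raise the generation cap from the plugin's default of 200 tokens.
llm chat -m mistral-7b-instruct-v0 -o max_tokens 2000

# The same option works for one-shot prompts:
llm -m mistral-7b-instruct-v0 -o max_tokens 2000 'Write a long answer here'
```

You can check which options a model accepts with `llm models --options`.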