boldandbusted opened this issue 7 months ago
Howdy. I am running

`llm chat -m mistral-7b-instruct-v0`

on my underpowered laptop, so I completely expect slow responses from most models. The model functions, but it abruptly stops before completing long answers. I can continue the answer with `Continue.` as the prompt, but it often skips characters. Is there a setting, like a timeout, that I need to change to allow the model to complete? And, if so, can I set it to never time out? Apologies in advance if this is a question that is out of scope for `llm` itself. Thanks.
I think @simonw has created a duplicate (0:-)) for this with #31.

Try to increase the number of max tokens with `-o max_tokens SOME_NUMBER`. According to the aforementioned bug, the default is 200, so try something bigger. I don't really know what the actual limit is, but I could use 2000 with Llama 3 8B Instruct.
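For example, combining that option with the command from the original post (2000 is just an illustrative value, not a documented limit):

```sh
# Raise the response length cap so long answers are not cut off mid-reply.
llm chat -m mistral-7b-instruct-v0 -o max_tokens 2000
```

If I remember right, `llm models list --options` will show which options (and their defaults) each installed model accepts, so you can confirm the exact option name for your model there.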