alondmnt / joplin-plugin-jarvis

Joplin (note-taking) assistant running a very intelligent system (OpenAI/GPT, Hugging Face, Gemini, Llama, Universal Sentence Encoder, etc.)
GNU Affero General Public License v3.0

The maximum context length is exceeded, but I only asked Jarvis a few short questions. The input is obviously short, so the error seems wrong. See the screenshots below. #3

Closed · kawuwa closed this issue 1 year ago

kawuwa commented 1 year ago

[screenshots of the error]

I'm not sure how to put it in English. The error (~~~) appears after I click OK: [screenshot]

alondmnt commented 1 year ago

I'll try to explain the error, and suggest a workaround.

  1. When sending a request to GPT, the app has to estimate the token length of the input in advance and add it to the desired length of the output (so that input length + output length = max tokens).

  2. This can be done either with model-specific tools (tokenizers) or roughly estimated with heuristics. I use the latter at the moment, because I ran into trouble using a tokenizer. Jarvis' estimate is therefore not perfect and sometimes underestimates the true length of the input, especially for shorter prompts, where the estimation error tends to be larger (see the sketch after this list). In your example, Jarvis estimated the input length at 2 tokens, but in practice, on the OpenAI server, it turned out to be 23.

  3. The best solution I can offer for now is to set max_tokens to something close to 4097 but a little lower, such as 4000. When I do that, I rarely get these error messages.

  4. In any case, Jarvis resolves the error automatically by decreasing its token estimate and re-sending the request.

  5. I think I will lower the maximal value below 4096 (at least for short prompts) to try to avoid these problems in the future.
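For illustration, here is a minimal TypeScript sketch of the estimate-and-retry idea from points 2 and 4. This is not Jarvis' actual code: the `estimateTokens` heuristic, the `completeChat` helper, the characters-per-token ratio, and the retry policy are all assumptions for the sake of the example; only the OpenAI chat-completions endpoint and its context-length error message are taken from the real API.

```typescript
// Rough heuristic: about 4 characters per token for English text (an assumed
// ratio, not necessarily the one Jarvis uses). Short prompts deviate the most.
function estimateTokens(text: string, charsPerToken = 4): number {
  return Math.ceil(text.length / charsPerToken);
}

// Hypothetical helper: request (maxTokens - input estimate) output tokens and,
// if the server reports a context-length overflow, raise the estimate and retry.
async function completeChat(prompt: string, maxTokens = 4000): Promise<string> {
  let inputEstimate = estimateTokens(prompt);
  for (let attempt = 0; attempt < 3; attempt++) {
    const response = await fetch('https://api.openai.com/v1/chat/completions', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      },
      body: JSON.stringify({
        model: 'gpt-3.5-turbo',
        messages: [{ role: 'user', content: prompt }],
        // Output budget = total budget minus the estimated input length.
        max_tokens: Math.max(16, maxTokens - inputEstimate),
      }),
    });
    const data = await response.json();
    if (!data.error) {
      return data.choices[0].message.content;
    }
    // OpenAI reports overflows as e.g. "This model's maximum context length is
    // 4097 tokens. However, you requested 4118 tokens ...".
    if (data.error.message.includes('maximum context length')) {
      // The input was larger than estimated: raise the estimate, which shrinks
      // the requested output so that input + output fits again.
      inputEstimate = Math.ceil(inputEstimate * 1.5) + 16;
      continue;
    }
    throw new Error(data.error.message);
  }
  throw new Error('Could not fit the request into the model context window.');
}
```

Points 3 and 5 are the complementary fix: keeping max_tokens a bit below the 4097 limit leaves headroom for the estimation error, so the first attempt usually fits without any retry.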

alondmnt commented 1 year ago

BTW, this is why the default value of max_tokens has been set to 4000 since v0.3.0.

kawuwa commented 1 year ago

That worked! Thank you very much! 😃