Sometimes small LLMs struggle to produce proper JSON, especially when you just ask "give me a JSON" without further instructions. Try providing some examples in the request:
What color is the sky at different times of the day? Respond using a JSON.
Response example:
{
"morning": color_a,
"day": color_b,
"evening": color_c
}
Read about some prompting techniques here: https://www.promptingguide.ai/techniques/fewshot (just a random link from the internet).
For single-request tasks, try using ollama.generate and setting the system property to instruct the model to behave in a more restricted way, with a temperature of 0, as sketched below.
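For example, a minimal sketch of that suggestion (assuming the ollama npm package; the model name and system text are just placeholders):

import ollama from 'ollama'

// Constrain the model with a system prompt and remove sampling randomness
// with temperature 0 so the output format stays as stable as possible.
const response = await ollama.generate({
  model: 'llama3-gradient',
  system: 'You are a JSON generator. Reply with a single valid JSON object and nothing else.',
  prompt: 'What color is the sky at different times of the day? Respond using JSON.',
  options: { temperature: 0 },
})

console.log(response.response)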
@Pixelycia Thanks for responding. In this case, the issue doesn't seem to be with the model itself. Using the Ollama API directly, the model works as expected. For example:
curl http://localhost:11434/api/generate -d '{
"model": "llama3-gradient",
"prompt": "What color is the sky at different times of the day? Respond using JSON",
"format": "json",
"stream": false
}'
Result:
{
"model": "llama3-gradient",
"created_at": "2024-05-07T10:36:49.427165Z",
"response": "{ \"morning\": \"Light blue\",\n\"daytime\": \"Bright blue\",\n\"afternoon\": \"Golden yellow\",\n\"evening\": \"Orange\",\n\"night\": \"Dark purple\"} ",
"done": false
}
Only when I use ollama-js do I get the error I mentioned above: "Expected a completed response."
Yes, the problem is in how an LLM "learns" things. At the training stage there are no rules telling the model "this JSON is valid" and "this one is not"; it learns unsupervised, just from the examples it sees. The more often the model sees examples like the previous one during training, the better it learns the pattern, but that does not mean it learns it perfectly, especially for a model with 8 billion parameters compared to ChatGPT's 175 billion: it can hold roughly 21x less information in its weights (very roughly speaking). If you do not set the temperature (i.e. randomness) to a lower value and do not write a request that restricts the output, then because of how probability works you can get a different answer each time you run the model. It can look like it works perfectly fine in some cases and not in others; in other words, you were "unlucky" with ollama-js and got an invalid JSON that could not be parsed (say, one with an extra curly bracket; nobody taught the model that this is incorrect, it learned by itself as best it could). So help the model give the right answer: the easiest solution is to add extra examples to your request. Alternatively, you can try specialized models trained for tasks like this; they can generate more reliable answers without extra tweaking.
It's a hard topic, and it's worth knowing (at least the basics of) how LLMs work, what they can do, what they can't, and how to turn that "can't" into what you want. But don't expect miracles; they are not perfect yet.
Just double checked, with a few-shot prompt like this:
What color is the sky at different times of the day? Respond using JSON format.
Response example:
{
date_time: color_a,
}
I got a valid JSON response with { "format": "json" }:
{
"sunrise": "pinkish-orange",
"morning": "blue",
"afternoon": "slightly hazy blue",
"evening": "reddish-purple",
"night": "black"
}
You probably got "Expected a completed response" because the model generated (or, conversely, failed to generate) some "special tokens" inside the response, so Ollama could not handle the request properly. As you can see, prompt engineering can work around this issue.
What else you can do: use the default format and parse the JSON manually after you get the response.
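A rough sketch of that approach (assuming the ollama npm package; the model name and the regex fallback are only for illustration):

import ollama from 'ollama'

// Recover a JSON object from plain model output, with a crude fallback
// in case the model wrapped the object in extra text.
function extractJson(text: string): Record<string, string> {
  try {
    return JSON.parse(text)
  } catch {
    const match = text.match(/\{[\s\S]*\}/)
    if (!match) throw new Error('No JSON object found in the model output')
    return JSON.parse(match[0])
  }
}

// Ask without format: 'json' and parse the returned text yourself.
const { response } = await ollama.generate({
  model: 'llama3-gradient',
  prompt: 'What color is the sky at different times of the day? Respond using JSON.',
  options: { temperature: 0 },
})

console.log(extractJson(response))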
Another suggestion: try to avoid the word JSON in the prompt, because LLMs are usually good at Python. Try asking: "What color is the sky at different times of the day? Respond using Python dictionary format." It usually performs MUCH better and the output still works with the JSON parse function.
@Pixelycia
It's indeed possible to solve this with a bit of extra code, but I'm trying to take advantage of Ollama's format: 'json' option.
Every time I set format: 'json' in my ollama-js project I get an error, but when I use the model directly through the CLI or the API, I get the response successfully.
It's the exact same model instance (running on my local machine) with the exact same prompt.
I tried it many times and got the same results. I don't think it's related to the LLM being inconsistent; it looks like an issue with ollama-js.
I cloned ollama-js and deleted line 92 in the browser.ts file:
throw new Error('Expected a completed response.')
And now it works perfectly.
It seems there is an issue with that condition when using format: 'json'.
@samwillis I would appreciate it if you could look at this part of the code, in src/browser.ts, in the processStreamableRequest method at line 92:
if (!message.value.done && (message.value as any).status !== 'success') {
throw new Error('Expected a completed response.')
}
I see that message.value.done always equals false when the format is json.
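One way to inspect that flag is to stream the same request and log each chunk's done value; the final chunk should arrive with done: true, which is what the check expects (a debugging sketch, assuming the ollama npm package):

import ollama from 'ollama'

// Stream the request and watch the done flag on every chunk.
const stream = await ollama.generate({
  model: 'llama3-gradient',
  prompt: 'What color is the sky at different times of the day? Respond using JSON',
  format: 'json',
  stream: true,
})

for await (const part of stream) {
  console.log(part.done, JSON.stringify(part.response))
}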
Hey @meni3a @BruceMacD, I have checked the case mentioned above and it works! Also, line 92 no longer contains the code mentioned above.
Tried to reproduce with the original code above - I don't think we are checking for this condition anymore - going to close this issue out unless someone is still running into it. Thanks @ManikantaMandala!
Code:
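A minimal call of this shape reproduces the report (a sketch only, assuming ollama.generate with the same model and prompt as the curl example above):

import ollama from 'ollama'

// Same request as the curl example, but through ollama-js with format: 'json'.
const response = await ollama.generate({
  model: 'llama3-gradient',
  prompt: 'What color is the sky at different times of the day? Respond using JSON',
  format: 'json',
  stream: false,
})

console.log(response.response)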
Error: "Expected a completed response."
When I remove format: 'json', everything works as expected.