Closed · zedmango closed this 5 months ago
The model is llama3-70b-instructQ4KM.
There seems to be an issue with Ollama's /generate REST endpoint and Llama3 models. I've tried it in different environments (Linux, Windows, GPU, no GPU), and it fails in all of them.
Since even plain curl requests fail, I'd advise you to report this on Ollama's issue tracker; I'm closing it here for now.
but people seem to be getting llama3 to work using other front ends... how could that be?
A lot of other frontends use the /chat endpoint, which works fine for me (e.g. with Open WebUI).
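For reference, a /chat request looks roughly like this (same model tag as below; adjust it to whatever you have pulled locally):

```
$ curl http://localhost:11434/api/chat -d '{
  "model": "llama3-70b-instructQ4KM:latest",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ]
}'
```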
Generate works for me with curl:
```
$ curl http://localhost:11434/api/generate -d '{
  "model": "llama3-70b-instructQ4KM:latest",
  "prompt": "Why is the sky blue?"
}'
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:17:23.493146Z","response":"Here","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:17:26.1218179Z","response":" is","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:17:28.7380225Z","response":" the","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:17:31.5457876Z","response":" unc","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:17:34.298514Z","response":"ensored","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:17:36.828654Z","response":" and","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:17:39.4424131Z","response":" complete","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:17:42.0443599Z","response":" answer","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:17:44.6436538Z","response":":\r\n\r\n","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:17:47.3470744Z","response":"The","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:17:49.9974092Z","response":" sky","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:17:52.7423936Z","response":" appears","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:17:55.2277731Z","response":" blue","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:17:57.7483048Z","response":" because","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:18:00.4720651Z","response":" of","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:18:03.0125387Z","response":" a","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:18:05.9710912Z","response":" phenomenon","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:18:08.6087881Z","response":" called","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:18:11.1774928Z","response":" Ray","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:18:13.8879123Z","response":"leigh","done":false}
{"model":"llama3-70b-instructQ4KM:latest","created_at":"2024-04-19T18:18:16.3995868Z","response":" scattering","done":false}```
server.log shows the following:

{"function":"validate_model_chat_template","level":"ERR","line":437,"msg":"The chat template comes with this model is not yet supported, falling back to chatml. This may cause the model to output suboptimal responses","tid":"25744","timestamp":1713547085}
This is with a template I created in the Modelfile. Why is it telling me this?
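If I understand it correctly, that ERR line comes from the llama.cpp server checking the chat template embedded in the GGUF itself, not the TEMPLATE in your Modelfile, so it shouldn't override what you defined. For comparison, here's a rough sketch of what a Llama 3 instruct TEMPLATE block can look like in a Modelfile (the FROM path is just a placeholder, and the template follows Meta's Llama 3 instruct format; double-check it against whatever you put in yours):

```
# Placeholder path -- point FROM at your actual GGUF or base model
FROM ./llama3-70b-instruct.Q4_K_M.gguf

# Llama 3 instruct prompt format (Go template syntax used by Ollama Modelfiles)
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}"""

# Stop tokens so generation ends at the Llama 3 turn delimiters
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
```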