sammcj opened 1 month ago
Consider using Ollama's new unload / stop commands for unloading models.

From their docs:
List which models are currently loaded:

```
ollama ps
```

Stop a model which is currently running:

```
ollama stop llama3.1
```
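For tooling that can't shell out to the CLI, the same listing is available over HTTP; a minimal sketch, assuming the default localhost:11434 address and the GET /api/ps endpoint from Ollama's API docs:

```python
import requests

# List models currently loaded into memory (the API equivalent of `ollama ps`).
# Assumes Ollama is listening on its default address.
resp = requests.get("http://localhost:11434/api/ps", timeout=10)
resp.raise_for_status()
for m in resp.json().get("models", []):
    print(m.get("name"))
```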
Unload a model (generate endpoint)

If an empty prompt is provided and the keep_alive parameter is set to 0, a model will be unloaded from memory.

Request:

```
curl http://localhost:11434/api/generate -d '{ "model": "llama3.1", "keep_alive": 0 }'
```
Response:

A single JSON object is returned:

```json
{
  "model": "llama3.1",
  "created_at": "2024-09-12T03:54:03.516566Z",
  "response": "",
  "done": true,
  "done_reason": "unload"
}
```
Unload a model (chat endpoint)

If the messages array is empty and the keep_alive parameter is set to 0, a model will be unloaded from memory.

Request:

```
curl http://localhost:11434/api/chat -d '{ "model": "llama3.1", "messages": [], "keep_alive": 0 }'
```

Response:

A single JSON object is returned:

```json
{
  "model": "llama3.1",
  "created_at": "2024-09-12T21:33:17.547535Z",
  "message": {
    "role": "assistant",
    "content": ""
  },
  "done_reason": "unload",
  "done": true
}
```