sammcj opened 1 month ago
Consider using Ollama's new unload / stop commands for unloading models.

From their docs:
List which models are currently loaded:

```
ollama ps
```

Stop a model which is currently running:

```
ollama stop llama3.1
```
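For tooling that can't shell out to the CLI, the same listing is available over HTTP; a minimal sketch, assuming the default localhost:11434 address and the GET /api/ps endpoint from Ollama's API docs:

```python
import requests

# List models currently loaded into memory (the API equivalent of `ollama ps`).
# Assumes Ollama is listening on its default address.
resp = requests.get("http://localhost:11434/api/ps", timeout=10)
resp.raise_for_status()
for m in resp.json().get("models", []):
    print(m.get("name"))
```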
Unload a model (generate endpoint)

If an empty prompt is provided and the keep_alive parameter is set to 0, a model will be unloaded from memory.

Request:

```
curl http://localhost:11434/api/generate -d '{ "model": "llama3.1", "keep_alive": 0 }'
```
Response:

A single JSON object is returned:

```json
{
  "model": "llama3.1",
  "created_at": "2024-09-12T03:54:03.516566Z",
  "response": "",
  "done": true,
  "done_reason": "unload"
}
```
Unload a model (chat endpoint)

If the messages array is empty and the keep_alive parameter is set to 0, a model will be unloaded from memory.

Request:

```
curl http://localhost:11434/api/chat -d '{ "model": "llama3.1", "messages": [], "keep_alive": 0 }'
```

Response:

A single JSON object is returned:

```json
{
  "model": "llama3.1",
  "created_at": "2024-09-12T21:33:17.547535Z",
  "message": {
    "role": "assistant",
    "content": ""
  },
  "done_reason": "unload",
  "done": true
}
```