firstbatchxyz / dkn-compute-node

Compute Node of Dria Knowledge Network.
Apache License 2.0

Ollama-rs `KeepAlive` #14

Closed erhant closed 2 weeks ago

erhant commented 1 month ago

Nodes should be able to keep the LLM loaded in memory; otherwise, Ollama-rs frees the memory after 5 minutes of inactivity.

erhant commented 3 weeks ago

Turns out `ollama serve` itself accepts an `OLLAMA_KEEP_ALIVE` environment variable, which defaults to `5m` (5 minutes). We should also read this value from the environment.
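For illustration, here is a minimal sketch of how the node could read `OLLAMA_KEEP_ALIVE` itself and fall back to the 5-minute default. The `KeepAlive` enum, `parse_keep_alive`, and `keep_alive_from_env` are hypothetical names for this example, not the node's code or the Ollama-rs API. It assumes the value is either a bare number of seconds or a single-unit duration such as `5m` or `1h`, with negative values meaning "keep loaded indefinitely"; compound durations like `1h30m` are deliberately rejected to keep the sketch short.

```rust
use std::env;
use std::time::Duration;

/// Hypothetical type: how long a model stays loaded after its last use.
#[derive(Debug, PartialEq)]
enum KeepAlive {
    /// Never unload (any negative value).
    Forever,
    /// Unload after this much inactivity; zero means unload immediately.
    For(Duration),
}

/// Default mentioned in this issue: 5 minutes.
const DEFAULT_KEEP_ALIVE: KeepAlive = KeepAlive::For(Duration::from_secs(5 * 60));

/// Parses an unsigned magnitude such as "300", "30s", "5m" or "1h" into seconds.
fn parse_secs(raw: &str) -> Option<u64> {
    // Split at the first non-digit: digits on the left, unit suffix on the right.
    let split = raw.find(|c: char| !c.is_ascii_digit()).unwrap_or(raw.len());
    let (digits, unit) = raw.split_at(split);
    let value: u64 = digits.parse().ok()?;
    match unit {
        "" | "s" => Some(value),
        "m" => value.checked_mul(60),
        "h" => value.checked_mul(3600),
        _ => None, // unknown or compound unit: not handled in this sketch
    }
}

/// Parses a keep-alive string; returns None if it is not understood.
fn parse_keep_alive(raw: &str) -> Option<KeepAlive> {
    let raw = raw.trim();
    if let Some(rest) = raw.strip_prefix('-') {
        // Assumption: any valid negative value means "keep loaded indefinitely".
        return parse_secs(rest).map(|_| KeepAlive::Forever);
    }
    parse_secs(raw).map(|secs| KeepAlive::For(Duration::from_secs(secs)))
}

/// Reads OLLAMA_KEEP_ALIVE, falling back to the default if unset or invalid.
fn keep_alive_from_env() -> KeepAlive {
    env::var("OLLAMA_KEEP_ALIVE")
        .ok()
        .and_then(|raw| parse_keep_alive(&raw))
        .unwrap_or(DEFAULT_KEEP_ALIVE)
}
```

Calling `keep_alive_from_env()` once at startup would give the node a single value to pass along with its requests. Falling back silently on an invalid value is a design choice of this sketch; logging a warning in that case may be preferable.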

erhant commented 2 weeks ago

Closing: if the node is active enough, this will not be a problem, and if the node is active with multiple models, we have to unload the model anyway.