Open andrewnguonly opened 6 months ago
Ollama server environment variables:
OLLAMA_NUM_PARALLEL
OLLAMA_MAX_LOADED_MODELS
OLLAMA_MAX_QUEUE
Ollama release notes: https://github.com/ollama/ollama/releases/tag/v0.1.33
OLLAMA_MAX_LOADED_MODELS meaningfully improves the user experience. I can't tell any difference when setting OLLAMA_NUM_PARALLEL or OLLAMA_MAX_QUEUE.
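For anyone who wants to try these, here is a minimal sketch of starting the server with all three variables set. The helper names (`build_ollama_env`, `start_ollama_server`) and the values in `EXAMPLE_SETTINGS` are assumptions for illustration, not recommended defaults; it also assumes the `ollama` binary is on your PATH.

```python
import os
import subprocess

# Hypothetical values chosen for illustration only; tune them for your hardware.
EXAMPLE_SETTINGS = {
    "OLLAMA_NUM_PARALLEL": 4,       # parallel requests per loaded model
    "OLLAMA_MAX_LOADED_MODELS": 2,  # models kept loaded at the same time
    "OLLAMA_MAX_QUEUE": 512,        # requests allowed to wait in the queue
}


def build_ollama_env(settings, base_env=None):
    """Return a copy of the environment with the given settings added as strings."""
    env = dict(os.environ if base_env is None else base_env)
    for name, value in settings.items():
        env[name] = str(value)
    return env


def start_ollama_server(settings):
    """Launch `ollama serve` as a child process with the settings applied."""
    return subprocess.Popen(["ollama", "serve"], env=build_ollama_env(settings))


if __name__ == "__main__":
    server = start_ollama_server(EXAMPLE_SETTINGS)
    server.wait()
```

The variables are read by the server process, so they need to be set before `ollama serve` starts; setting them only in the client has no effect.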
Thanks for sharing! I've added these variables now.