valiantlynx / ollama-docker

Welcome to the Ollama Docker Compose Setup! This project simplifies the deployment of Ollama using Docker Compose, making it easy to run Ollama with all its dependencies in a containerized environment.
https://ollama-docker.azurewebsites.net/

Expose the `keep_alive` property as an environment variable #9

Closed by muliyul 3 months ago

muliyul commented 4 months ago

Is your feature request related to a problem? Please describe. Long loading times occur when the keep-alive interval has elapsed (the default is 5 min).

Describe the solution you'd like An environment variable to control the keep_alive property.

Describe alternatives you've considered Passing the variable as part of the request directly. Not all libraries support this; langchain4j doesn't (or I have been unable to find it).
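For context, the Ollama REST API does accept a per-request `keep_alive` field (documented in the Ollama API docs); the model name and prompt below are just placeholders. A request body for `POST /api/generate` might look like:

```json
{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "keep_alive": "30m"
}
```

The value accepts a duration string such as "5m" or "24h"; this is the per-request mechanism that client libraries like langchain4j may not expose, which is what motivates the server-side environment variable instead.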

Additional context Running llama3 on RTX 3060TI 8GB

valiantlynx commented 4 months ago

I like that idea. I'll try doing that. I think Ollama has something like that I can expose, but I'll have to read more about it.

muliyul commented 4 months ago

From the docs:

Alternatively, you can change the amount of time all models are loaded into memory by setting the OLLAMA_KEEP_ALIVE environment variable when starting the Ollama server. The OLLAMA_KEEP_ALIVE variable uses the same parameter types as the keep_alive parameter mentioned above. Refer to the section explaining how to configure the Ollama server to correctly set the environment variable.
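In a Docker Compose setup, that environment variable can be passed to the container via the `environment` key. A minimal sketch, assuming a service named `ollama` using the official `ollama/ollama` image (the service name, image, and port mapping here are typical defaults, not necessarily this repo's exact compose file):

```yaml
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    environment:
      # Same duration format as the per-request keep_alive parameter,
      # e.g. "5m", "24h"; per the Ollama FAQ, -1 keeps models loaded
      # indefinitely and 0 unloads them immediately.
      - OLLAMA_KEEP_ALIVE=24h
```

Exposing it this way lets users override the default with a `.env` file or shell variable without editing the compose file.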

valiantlynx commented 3 months ago

I put the default at 24h. I don't know if that's too long, but it's simple to change: https://github.com/ollama/ollama/blob/main/docs/faq.md#how-do-i-keep-a-model-loaded-in-memory-or-make-it-unload-immediately