srid closed this 1 month ago
We don't need to assign `DEVICE_TYPE = "cpu"`, as it is `"cpu"` by default unless explicitly specified.
Also, this variable doesn't affect whether `ollama` uses the CPU, which is still managed as documented; it only affects how the embedding models used to run RAG pipelines are invoked.
`ENABLE_OLLAMA_API = "True";` is also redundant, as it is true by default. It could be left as a comment so that users know how to disable it if they want to.
`OLLAMA_BASE_URL = "http://${host}:${toString port}";` is also redundant, as it is derived from `OLLAMA_API_BASE_URL` by default.
{
RAG_EMBEDDING_ENGINE = "ollama";
RAG_EMBEDDING_MODEL = "mxbai-embed-large:latest";
}
should be fine, since otherwise Open WebUI will use sentence-transformers to fetch the embedding models, which would require `DEVICE_TYPE` to choose where the embedding happens. If we rely on `ollama` instead, we can make use of the already documented configuration to use GPU acceleration.
This, verbatim, sounds like a good candidate for a comment on top of these env vars.
The rest can be either commented out or removed.
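As a rough sketch of what that could look like, the fragment below keeps only the two RAG settings, places the explanation above them as a comment, and leaves the redundant variables commented out. It covers only the variables discussed here, not the rest of the linked example. It assumes `host` and `port` are bound by the surrounding `let` in the example, and the `"False"` value shown for disabling the Ollama API is an assumption, not something confirmed in this thread.

```nix
{
  # Use ollama for embeddings. Otherwise Open WebUI will use
  # sentence-transformers to fetch the embedding models, which would require
  # DEVICE_TYPE to choose where the embedding happens. If we rely on ollama
  # instead, we can make use of the already documented configuration to use
  # GPU acceleration.
  RAG_EMBEDDING_ENGINE = "ollama";
  RAG_EMBEDDING_MODEL = "mxbai-embed-large:latest";

  # The settings below repeat the defaults described in this thread and are
  # kept only as documentation.

  # "cpu" is the default. Only affects how the embedding models used for RAG
  # pipelines are invoked, not whether ollama itself uses the CPU.
  # DEVICE_TYPE = "cpu";

  # True by default. Assumption: setting it to "False" disables it.
  # ENABLE_OLLAMA_API = "True";

  # Derived from OLLAMA_API_BASE_URL by default.
  # (`host` and `port` are assumed to come from the surrounding `let`.)
  # OLLAMA_BASE_URL = "http://${host}:${toString port}";
}
```

Because the redundant lines are comments, the attribute set evaluates without `host` or `port` in scope, and a reader can still see which knobs exist and what their defaults are.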
https://github.com/juspay/services-flake/blob/f360c8f2bc7e23c858dcb4eb9f597e8a91bba6d2/example/llm/flake.nix#L48-L58
Keep only environment variables (which were introduced in #227) that are strictly necessary, while leaving the rest commented out.
Consider the implications of `DEVICE_TYPE = "cpu";`, especially when GPU is enabled. Our examples should a) "just work", b) be simple and minimal, and c) be well-documented (liberal use of comments, for example).