ntimo opened 10 months ago
Support for all open source LLMs
Support works for local LLM environments. The issue I'm seeing now is that the Auto-Gen function in Grafana appears to time out before the local LLM agent has a chance to respond.
My configuration: use the OpenAI endpoint, with the URL pointing to your local LLM agent (an Ollama server or similar) and the API key set to NULL; it then passes the checks and responds OK. You can see this on your server end: it sends a "Hello", then waits for a reply. I'm chasing down the idea of enabling vGPU to speed up the VM running Ollama.
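To make the setup above concrete, here is a minimal sketch of an OpenAI-style chat-completions request pointed at a local Ollama server. The base URL, model name, and placeholder API key are assumptions from my setup; adjust them to match yours.

```python
# Sketch: an OpenAI-compatible chat request aimed at a local Ollama
# server. Base URL and model name are assumptions -- adjust as needed.
import json
import urllib.request

OLLAMA_BASE_URL = "http://localhost:11434/v1"  # Ollama's OpenAI-compatible endpoint

def build_chat_request(prompt: str, model: str = "mistral") -> urllib.request.Request:
    """Build a chat-completion request for an OpenAI-compatible server."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{OLLAMA_BASE_URL}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            # Local servers typically ignore the key, but the header is expected.
            "Authorization": "Bearer NULL",
        },
    )

# Actually sending this needs a running Ollama server. Note the generous
# timeout -- local models can take far longer to respond than hosted APIs:
# with urllib.request.urlopen(build_chat_request("Hello"), timeout=120) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The commented-out call at the end shows the "send Hello, then wait for a reply" flow described above, with the timeout raised to accommodate a slow local model.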
Hey, thanks for the feature request. Enabling open source LLMs is something we're interested in.
We're toying with supporting a service like vLLM that exposes an OpenAI-like API, which Grafana could then use. There are subtle incompatibilities between GPT-3/4 and something like Mistral-7B-Instruct, however: the role names in prompts often differ, and the prompts may need to be engineered differently for best results. We're not sure how best to proceed yet. Thoughts welcome!
I'm thinking we should at least increase the timeout: Grafana errors out before a local LLM has time to respond. Of course we could increase performance on the local LLM host, but that may not be an option for everyone.
For the different models, maybe have editable prompt fields, so a user can define the instructions to be sent along with the other parameters: tokens, temperature, etc.
I think the current LLM plugin, which supports OpenAI and Azure AI, misses one thing needed to work with OpenAI-compatible LLMs: if you add an option to fill in your model name, it would become compatible with Ollama as well.
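As a sketch of what those configurable fields might look like (model name, system prompt, temperature, max tokens), here is a small helper that merges them into an OpenAI-style chat-completions payload. The field names follow the OpenAI API; the defaults are assumptions for illustration.

```python
# Sketch: merging user-configurable fields into a chat-completions
# payload. Defaults ("mistral", 0.7, 512) are illustrative assumptions.
def make_payload(user_prompt, model="mistral", system_prompt=None,
                 temperature=0.7, max_tokens=512):
    messages = []
    if system_prompt:
        # Some open models use different role conventions; the
        # OpenAI-style "system" role is assumed here.
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return {
        "model": model,          # the configurable model name discussed above
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
```

With a configurable `model` field like this, the same payload works against OpenAI, Azure, or an Ollama server, since they all accept the same request shape.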
By supporting Ollama, it would be possible to use locally hosted LLMs, which would be quite privacy-friendly. I think this would pair nicely with Grafana's mission.
https://github.com/jmorganca/ollama