rilldata / rill

Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.
https://www.rilldata.com
Apache License 2.0
1.71k stars 115 forks source link

Ability to use local LLMs for privacy. #4235

Open dstruck opened 8 months ago

dstruck commented 8 months ago

If the use of OpenAI is not possible due to privacy concerns, it would be beneficial to have the option to utilize a locally installed language model.

One possible solution is to allow for customization of the OpenAI URL and chosen model, making it feasible to utilize alternatives such as Ollama that can mimic the OpenAI API.

Alternatively, employing the Ollama API directly could offer access to its complete capabilities.

mindspank commented 8 months ago

Agree, that could absolutely be useful.

One thing to note in regards to privacy is that no actual data is being sent over the wire to either rill or OpenAI, only metadata such as data types and column names (realising column names can be sensitive).

dstruck commented 8 months ago

I guessed that Rill would only send the metadata, e.g. the schema to OpenAI ;-) However also metadata leaks sensitive information, like what kind of systems are deployed and in the case of custom systems what kind of data is stored.

Moreover, it could be interesting to explore the possibility of utilizing various language models like Mixtral, Gemini or Llama to compare and identify which one excels in creating an initial dashboard. There exists for example a LLM model specifically tuned to generate SQL: https://huggingface.co/defog/sqlcoder (available in Ollama).

mindspank commented 4 months ago

@nishantmonu51 Something to discuss for us when you are back. I would guess we need a hard switch on project level to ensure runtimes start with correct model url and api keys.