kaixxx / QualCoder

Qualitative data analysis for text, images, audio, video. Cross platform. Python 3.8 or newer and PyQt6.
https://qualcoder.wordpress.com/
MIT License

Use with local LLMs for better privacy and differentiated AI assistance #3

Open menelic opened 8 months ago

menelic commented 8 months ago

Is your feature request related to a problem? Please describe.
QualCoder's AI is a great coding assistant: in my testing it works like an indefatigable, curious assistant that needs detailed instructions.

However, OpenAI is not a trustworthy arbiter of the confidential or personal data that respondents have entrusted to a researcher. Local AI models are fast catching up to commercial, closed-source options, and their obvious privacy benefit (data never leaves the computer on which it is analysed) is particularly relevant for qualitative research.

Describe the solution you'd like
Please enable and document use with local AI models.

There are several options for running models locally and exposing a drop-in API replacement for OpenAI. Small tweaks in QualCoder's AI integration could make it easy for researchers to use tools such as llamafile, jan.ai (both free and open source) or LM Studio (free).

Additional context
Another benefit of a localised solution is that researchers would have a greater choice of models, including some with domain knowledge, and could enhance even basic models, either by fine-tuning them or through retrieval-augmented generation, where a local LM could be pointed to previously coded text in order to code according to a schema, or to methodological literature to follow, etc.
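The retrieval-augmented idea above can be sketched in a few lines: rank previously coded (text, code) pairs by similarity to the new passage and prepend the best matches to the prompt as few-shot examples. This is a minimal illustration using bag-of-words cosine similarity, not anything QualCoder currently implements; all names and data here are hypothetical.

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve_examples(query: str, coded_segments: list[tuple[str, str]], k: int = 2):
    """Return the k previously coded (text, code) pairs most similar to the
    query passage, to be inserted into the LLM prompt as examples."""
    qvec = Counter(query.lower().split())
    ranked = sorted(coded_segments,
                    key=lambda seg: cosine(qvec, Counter(seg[0].lower().split())),
                    reverse=True)
    return ranked[:k]
```

A real implementation would use embeddings from the local model instead of word counts, but the retrieval step has the same shape.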

kaixxx commented 1 week ago

I totally agree that becoming more independent of OpenAI would be very desirable, for all the reasons you mentioned. As a first step, I have now added the ability to use other models, especially the open-source models hosted on the "Blablador" platform provided by the Helmholtz Society, a German academic research agency. See https://github.com/kaixxx/QualCoder/blob/ai_integration_rework/README.md for more info. My experience was, however, that qualitative analysis really pushes these small open-source models to their limits (Mixtral 8x7b is the largest on Blablador at the moment).

If you still want to try out local models, my mechanism should also work with them as long as they provide an OpenAI-compatible API (which has become a de facto standard over the last year). Ollama can expose such an OpenAI-compatible API endpoint: https://ollama.com/blog/openai-compatibility

To use it, edit the config.ini in ~user/.qualcoder_ai. At the end of this file you will find several AI models listed. Just copy one of the entries and adjust it as needed. Make sure QualCoder is closed while editing this file.

Example config:

[ai_model_YOUR_MODEL_NAME]
desc = YOUR DESCRIPTION
access_info_url = [CAN BE EMPTY]
large_model = THE NAME OF THE MODEL TO USE
large_model_context_window = 32768
fast_model = CAN BE THE SAME NAME AS ABOVE (NOT USED ATM)
fast_model_context_window = 32768
api_base = YOUR LOCAL ENDPOINT, E.G. "http://localhost:11434"
api_key = None
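Since a typo in config.ini only surfaces as a failure at runtime, it can help to sanity-check an entry before restarting QualCoder. This is a small sketch using Python's standard configparser; the section name and values are a hypothetical entry following the template above, not defaults shipped with QualCoder.

```python
import configparser

# A filled-in entry mirroring the template above; in practice you would
# read the real file from the .qualcoder_ai directory instead.
SAMPLE = """
[ai_model_my_local_llm]
desc = Llama 3 served by Ollama on this machine
access_info_url =
large_model = llama3
large_model_context_window = 32768
fast_model = llama3
fast_model_context_window = 32768
api_base = http://localhost:11434
api_key = None
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

section = config["ai_model_my_local_llm"]
assert section["api_base"].startswith("http")          # endpoint looks sane
assert int(section["large_model_context_window"]) > 0  # window parses as an int
```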

After restarting QualCoder, the newly added model should be available in the settings dialog.
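For anyone curious what "OpenAI-compatible" means concretely: the local server accepts the same /chat/completions request shape as OpenAI's API, so a client only needs to swap the base URL. This is a rough, hypothetical illustration of assembling such a request (model name, prompts, and the helper itself are assumptions, not QualCoder code); the default Ollama port 11434 matches the example config above.

```python
import json

def build_chat_request(model: str, system_prompt: str, user_text: str,
                       api_base: str = "http://localhost:11434/v1"):
    """Assemble the URL and JSON body for an OpenAI-compatible
    /chat/completions call, the format Ollama and similar tools expose."""
    url = f"{api_base}/chat/completions"
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
        "temperature": 0.2,  # low temperature for more reproducible coding
    }
    return url, json.dumps(body)

# Sending it is a single POST, e.g.:
# requests.post(url, data=body, headers={"Content-Type": "application/json"})
```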