TsinghuaDatabaseGroup / DB-GPT

An LLM Based Diagnosis System (https://arxiv.org/pdf/2312.01454.pdf)
http://dbgpt.dbmind.cn/
Apache License 2.0

Support for local llm #57

Closed JINO-ROHIT closed 11 months ago

JINO-ROHIT commented 11 months ago

Does this work only with GPT-4? Is there support for local models such as Mistral?

Please help.

Thanks.

zhouxh19 commented 11 months ago

Yes, we support local LLMs :)

For example, you can use the Llama 2 model by renaming ./multiagents/agent_conf/config_diag_llama.yaml to ./multiagents/agent_conf/config.yaml. See the quickstart: https://github.com/TsinghuaDatabaseGroup/DB-GPT/tree/main/diagllama#-quickstart
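
The config switch can also be scripted. The following is a minimal sketch, not code from the repository: the helper name `activate_config` and the `.bak` backup file are assumptions made for illustration. It copies the Llama config over the active one instead of renaming it, so the original file stays in place and any previous config.yaml is preserved.

```python
import shutil
from pathlib import Path


def activate_config(conf_dir, source_name="config_diag_llama.yaml",
                    target_name="config.yaml"):
    """Copy source_name over target_name inside conf_dir.

    Hypothetical helper: any existing target is first saved as
    '<target_name>.bak' so the switch can be undone.
    """
    conf_dir = Path(conf_dir)
    source = conf_dir / source_name
    target = conf_dir / target_name
    if not source.is_file():
        raise FileNotFoundError(f"missing config: {source}")
    if target.exists():
        # Keep the previously active config next to the new one.
        shutil.copyfile(target, target.with_name(target.name + ".bak"))
    shutil.copyfile(source, target)
    return target
```

Run from the repository root, the call would be `activate_config("./multiagents/agent_conf")`.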

JINO-ROHIT commented 11 months ago

That is amazing! Is there also a sample of the data used for fine-tuning the LLM, and a description of how it was done? I was not able to find it anywhere. Thanks so much.

zhouxh19 commented 11 months ago

The finetuning data is still under preparation and not publicly available.

We will release it when the quality is good enough.

JINO-ROHIT commented 11 months ago

Thanks! I hope you release it soon :)

JINO-ROHIT commented 11 months ago

Can I also ask how the fine-tuning was done? Is the process described somewhere?

curtis-sun commented 11 months ago

During diagnosis with GPT-4, we decompose the complex diagnosis task into tens of simpler sub-tasks. We then collect GPT-4's responses to those sub-tasks and use them to fine-tune local LLMs via supervised learning. We will try to optimize the fine-tuning procedure in the future.
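
To make the data collection step concrete, here is a rough sketch of how logged sub-task exchanges could be turned into supervised fine-tuning records. It is an illustration under stated assumptions, not the project's actual pipeline: the log keys `prompt` and `response`, the output fields `instruction` and `output`, and both function names are hypothetical.

```python
import json


def build_sft_records(subtask_logs):
    """Convert logged (prompt, response) pairs into instruction-style records.

    The input keys and output field names are assumed for illustration only.
    """
    records = []
    for log in subtask_logs:
        prompt = log.get("prompt", "").strip()
        response = log.get("response", "").strip()
        if not prompt or not response:
            continue  # Skip incomplete exchanges.
        records.append({"instruction": prompt, "output": response})
    return records


def to_jsonl(records):
    """Serialize records as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

Each line of the resulting JSONL text is one training example pairing a sub-task prompt with the response the teacher model gave to it.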

JINO-ROHIT commented 11 months ago

Nice. And this is supervised instruction fine-tuning, correct?

JINO-ROHIT commented 11 months ago

Also, how can I contribute to the project?

zhouxh19 commented 11 months ago

We would be thrilled to have you contribute to the project!

Of course, the first step is to get the project running successfully on your machine. After that, if you find any problems or missing features, you can let us know or submit a GitHub PR directly. We welcome contributions in both academic research and real-world applications.

zhouxh19 commented 11 months ago

We have released the training data :) https://github.com/TsinghuaDatabaseGroup/DB-GPT/tree/main/diagllama/training_data