Dataherald / dataherald

Interact with your SQL database, Natural Language to SQL using LLMs
https://dataherald.readthedocs.io/en/latest/
Apache License 2.0
3.2k stars, 216 forks

Support finetuning open-source LLMs #439

Open MohammadrezaPourreza opened 3 months ago

MohammadrezaPourreza commented 3 months ago

Hi everyone,

Our engine currently supports fine-tuning OpenAI models. However, we've received considerable feedback from our community expressing a desire to integrate their own fine-tuned models into our pipeline, as an alternative to the default OpenAI models. We welcome and greatly appreciate any contributions toward implementing this feature. Here are the steps outlining the necessary modifications:

1) Introduction of a New Endpoint: We need a new endpoint that serves the fine-tuning dataset. This endpoint should deliver a JSONL file generated by the create_finetuning_dataset() function located within our finetuning directory. The provided file will be instrumental for users looking to fine-tune a model.

2) Modification of SQL Generation Endpoints: It's crucial to update all SQL generation endpoints to accept two new parameters: base_url and model_name. These adjustments will enable the endpoints to interface with the user's fine-tuned model that has been deployed.
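As a rough illustration of the two steps above, here is a minimal, framework-agnostic sketch. Everything in it is an assumption made for illustration, not the engine's actual code: the record shape (the real output of create_finetuning_dataset() may differ), the helper names `dataset_to_jsonl`, `SQLGenerationRequest`, and `resolve_llm_target`, and the placeholder default model name. Only the parameter names base_url and model_name come from the proposal itself.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

# Hypothetical defaults used when the caller does not override them.
DEFAULT_BASE_URL = "https://api.openai.com/v1"
DEFAULT_MODEL_NAME = "default-model"  # placeholder, not a real model id


def dataset_to_jsonl(records: Iterable[dict]) -> str:
    """Step 1: serialize dataset records as JSONL (one JSON object per line).

    A new endpoint could return this string as a downloadable file.
    """
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


@dataclass
class SQLGenerationRequest:
    """Step 2: a request body extended with the two proposed parameters."""

    prompt: str
    base_url: Optional[str] = None    # where the user's fine-tuned model is served
    model_name: Optional[str] = None  # name of the deployed fine-tuned model


def resolve_llm_target(request: SQLGenerationRequest) -> Tuple[str, str]:
    """Pick the user's deployment if given, otherwise fall back to the defaults."""
    base_url = (request.base_url or DEFAULT_BASE_URL).rstrip("/")
    model_name = request.model_name or DEFAULT_MODEL_NAME
    return base_url, model_name
```

Keeping both new parameters optional would let existing clients keep calling the SQL generation endpoints unchanged, while users with a deployed fine-tuned model pass its URL and name explicitly.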

Your involvement in this project would significantly contribute to enhancing its flexibility and usability. We look forward to any support or input you can provide on this matter.

Thank you!

muratulashozturk commented 1 month ago

Hello,

I am implementing Together AI as an alternative to OpenAI. It is not the same as supporting self-hosted open-source LLMs, but it will at least make it possible to use open-source models through a budget-friendly option.

Kind regards,
Murat Ulaş Öztürk