Concyclics opened this issue 2 days ago
If I remember correctly (I need to check, though), the batch size was set to 4, the learning rate to 1e-5, and we used a synthetic dataset to fine-tune the models. One observation I'll add: these small models do not generalize well. PremSQL-1B was heavily oriented toward BirdBench, so what we did was generate synthetic samples similar to the BirdBench training data; training on those gave a huge leap in the results.
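For reference, here is a minimal sketch of what those settings could look like with a Hugging Face `Trainer`. The checkpoint name, dataset file, sequence length, epoch count, and precision flag are assumptions for illustration, not confirmed PremSQL settings; only the batch size and learning rate come from the comment above.

```python
# Sketch of the reported settings (batch size 4, lr 1e-5) on deepseek-coder 1.3B.
# Dataset path, max length, epochs, and bf16 are assumptions, not the exact recipe.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "deepseek-ai/deepseek-coder-1.3b-instruct"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # needed so the collator can pad
model = AutoModelForCausalLM.from_pretrained(model_name)

# Assumed: a JSONL file of BirdBench-style text-to-SQL samples rendered as plain text.
dataset = load_dataset("json", data_files="synthetic_birdbench_style.jsonl", split="train")
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=2048),
    batched=True, remove_columns=dataset.column_names,
)

args = TrainingArguments(
    output_dir="premsql-1b-sft",
    per_device_train_batch_size=4,   # batch size reported in this thread
    learning_rate=1e-5,              # learning rate reported in this thread
    num_train_epochs=1,              # assumption
    logging_steps=10,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```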
As of now, the fine-tuning scripts in PremSQL might be a bit buggy, and I am working on them. However, the main ingredient was the different datasets we used, together with continual fine-tuning.
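To make the "continual fine-tuning" idea concrete, here is a rough illustration: train on one dataset, save the checkpoint, then keep training that checkpoint on the next dataset. The stage order, file names, and epoch counts below are illustrative assumptions, not the exact PremSQL pipeline.

```python
# Rough sketch of continual fine-tuning: each stage resumes from the previous
# stage's weights. Stage data files and epoch counts are assumptions.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

stages = [
    "bird_style_stage1.jsonl",       # assumed: BirdBench-style training data
    "synthetic_stage2.jsonl",        # assumed: synthetic samples mimicking it
]
checkpoint = "deepseek-ai/deepseek-coder-1.3b-instruct"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

for i, data_file in enumerate(stages):
    model = AutoModelForCausalLM.from_pretrained(checkpoint)
    dataset = load_dataset("json", data_files=data_file, split="train")
    tokenized = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=2048),
        batched=True, remove_columns=dataset.column_names,
    )
    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir=f"stage_{i}",
            per_device_train_batch_size=4,  # same settings as reported above
            learning_rate=1e-5,
            num_train_epochs=1,             # assumption
            save_strategy="no",
        ),
        train_dataset=tokenized,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    checkpoint = f"stage_{i}"               # next stage continues from these weights
    trainer.save_model(checkpoint)
    tokenizer.save_pretrained(checkpoint)
```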
First and foremost, thank you for your outstanding work on this project. We'd like to follow this work and fine-tune a model from deepseek-coder 1.3B on your datasets, but we cannot achieve promising results. Could you share the fine-tuning settings, such as batch size, learning rate, and other specifics?