MeetKai / functionary

Chat language model that can use tools and interpret the results
MIT License

Finetuning #258

Open sjay8 opened 2 months ago

sjay8 commented 2 months ago

Hi! I'm a beginner to all of this. Can someone direct me on how to fine-tune the v3 model? I saw #99 on how to structure the dataset: https://github.com/MeetKai/functionary/blob/main/tests/test_case_v2.json

but I'm not sure exactly how to begin the fine-tuning process. Do I have to run the scripts located here: https://github.com/MeetKai/functionary/tree/main/functionary/train?

khai-meetkai commented 2 months ago

Hi @sjay8, yes, you can follow the README at https://github.com/MeetKai/functionary/tree/main/functionary/train, although it is updated frequently. You might need to upgrade to the newest accelerate version: `pip install --upgrade accelerate`. In your training command, to use a v3 model, pass `--prompt_template_version VERSION` (a sketch of a full command follows the list). The VERSION can be:

v3-llama3.1
v3.llama3
v2.llama3
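
A hypothetical invocation might look like the sketch below. Only `--prompt_template_version` is confirmed by this thread; the script path and the other flags are assumptions based on typical Hugging Face-style training scripts, so check the README for the exact script name and arguments:

```bash
# Sketch only: script path and most flag names are assumptions, not confirmed here.
torchrun --nproc_per_node=8 functionary/train/train.py \
    --model_name_or_path meta-llama/Meta-Llama-3.1-8B-Instruct \
    --train_data_path train.jsonl \
    --prompt_template_version v3-llama3.1 \
    --output_dir ./functionary-finetuned
```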

I recommend using v3.llama3 or v3-llama3.1. Another option is --packing, which packs multiple training data points into one sequence; it is useful if you have a lot of training data. If your data contains mostly short examples, the packing ratio can be very high, for example 20k data points packed down to 2k. That leaves a small number of training steps, so the model parameters will not be updated enough. You can set --max_packed_size to control the number of data points after packing: with max_packed_size=4, each packed sequence holds at most 4 original data points, so from 20k original data points the packed count cannot drop below 20k/4 = 5k.
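
To make that bound concrete, a trivial sketch of the arithmetic (the flag name --max_packed_size is from this thread; the formula just restates the bound above):

```bash
# Lower bound on dataset size after packing:
# packed_count >= original_count / max_packed_size
original=20000
max_packed_size=4
echo $(( original / max_packed_size ))   # 5000 -> the packed dataset cannot shrink below this
```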

In our experience, the number of training steps should be >= 500; a learning rate of ~5e-6 works well for 70B models, and 8e-6 to 1e-5 for 7B models.
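
As a sanity check on the step count, here is a back-of-the-envelope calculation; the step formula is the standard one for data-parallel training, not something specific to functionary, and all the batch numbers are made-up examples:

```bash
# Assumption: steps = data_points / (per_device_batch * gpus * grad_accum) * epochs.
data_points=2000        # e.g. after aggressive packing
per_device_batch=4
gpus=8
grad_accum=1
epochs=3
steps=$(( data_points / (per_device_batch * gpus * grad_accum) * epochs ))
echo "$steps"           # 186 -> below the suggested 500, so add epochs or limit packing
```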