Can you provide a more detailed guide to fine-tuning operations?

asahi417 / lm-question-generation

Multilingual/multidomain question generation datasets, models, and python library for question generation.

https://www.autoqg.net

MIT License

Can you provide a more detailed guide to fine-tuning operations? #23

Closed chenzebiaohub closed 2 weeks ago

chenzebiaohub commented 4 months ago

I want to fine-tune an mT5 model for Chinese on my own dataset. I would be extremely grateful for any guidance.

asahi417 commented 4 months ago

Hey, do you have your dataset on HuggingFace?

chenzebiaohub commented 4 months ago

Hello! Thanks for your reply! I have the dataset organized, but it is not uploaded to HuggingFace at the moment. I would like to load the dataset locally for mT5 fine-tuning.

asahi417 commented 4 months ago

Cool. Are you going to load the dataset locally via the HuggingFace datasets library? If it uses a HuggingFace dataset, it should be pretty straightforward, but otherwise it can be a bit tricky.

chenzebiaohub commented 4 months ago

Very excited to see your feedback! I will upload the dataset to HuggingFace. How do I fine-tune the model on it after that?

chenzebiaohub commented 4 months ago

@asahi417

asahi417 commented 4 months ago

Try the following command.

lmqg-train-search -c "tmp" -d "{your-hf-dataset-alias}" -m "mt5-small" -b 64 --epoch-partial 5 -e 15 --language "zh" --n-max-config 1 -g 2 4 --lr 1e-04 5e-04 1e-03 --label-smoothing 0 0.15

That will launch fine-tuning with a hyperparameter grid search. You might need to play around with the parameters. See the full description with lmqg-train-search -h.
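
As a rough sketch of how large the search space in that command is, the snippet below enumerates the combinations of the values passed to -g, --lr, and --label-smoothing. It assumes the search treats every combination of the listed values as one candidate configuration, which this thread does not confirm. The `grid` and `configs` names are illustrative and not part of lmqg.

```python
from itertools import product

# Values copied from the lmqg-train-search command above, keyed by flag.
grid = {
    "-g": [2, 4],
    "--lr": [1e-04, 5e-04, 1e-03],
    "--label-smoothing": [0, 0.15],
}

# Assumption: each combination of the listed values is one candidate config.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]

# Show how many candidates there are, then list them.
print(len(configs))
for config in configs:
    print(config)
```

Under that assumption, adding one more value to any flag multiplies the number of candidates, so it may be worth starting with a small grid. How --epoch-partial and --n-max-config limit the runs that train to completion is best checked in lmqg-train-search -h.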

asahi417 commented 4 months ago

For example, the following command was used to train https://huggingface.co/lmqg/mt5-base-zhquad-qag.

LA='zh'
MODEL="google/mt5-base"
MODEL_SHORT='mt5-base'
lmqg-train-search --use-auth-token -d "lmqg/qag_${LA}quad" -m "${MODEL}" -b 8 -g 8 16 -c "lmqg_output/${MODEL_SHORT}-${LA}quad-qag" -i 'paragraph' -o 'questions_answers' --n-max-config 2 --epoch-partial 5 -e 15 --max-length-output-eval 256 --max-length-output 256 --lr 1e-04 5e-04 1e-03
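
The -i 'paragraph' and -o 'questions_answers' flags in that command name the input and output columns, so a custom dataset would presumably need columns with matching names. Below is a minimal sketch of writing such rows to a JSON Lines file with the Python standard library. The example record, the `to_row` helper, the file name, and the "question: ..., answer: ..." template are all hypothetical; the actual serialization should be checked against the lmqg/qag_zhquad dataset before training.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical record; the text is made up for illustration.
examples = [
    {
        "paragraph": "北京是中华人民共和国的首都。",
        "qa_pairs": [("中华人民共和国的首都是哪里？", "北京")],
    }
]


def to_row(example):
    # Assumption: all pairs are flattened into one string; this template is illustrative only.
    flattened = " | ".join(
        f"question: {q}, answer: {a}" for q, a in example["qa_pairs"]
    )
    return {"paragraph": example["paragraph"], "questions_answers": flattened}


rows = [to_row(example) for example in examples]

# Write one JSON object per line, keeping Chinese characters readable.
out_path = Path(tempfile.mkdtemp()) / "train.jsonl"
with out_path.open("w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row, ensure_ascii=False) + "\n")

print(out_path.read_text(encoding="utf-8"))
```

A file in this shape can then be uploaded to HuggingFace as a dataset, and its alias passed to the -d flag.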

asahi417 commented 4 months ago

See more examples here:

https://github.com/asahi417/lm-question-generation/blob/master/misc/2023_acl_qag/model_finetuning.end2end.sh

chenzebiaohub commented 4 months ago

Very excited to see your feedback! THANKS