huawei-noah / Pretrained-Language-Model

Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

TinyBert fine tune SQuAD #138

Open shairoz-deci opened 3 years ago

shairoz-deci commented 3 years ago

Thank you for sharing this great repo. Can you please provide instructions, or code if available, for task distillation on the SQuAD dataset?

Thanks in advance

zwjyyc commented 3 years ago

Hi, we have no plans to release the distillation code for the SQuAD dataset. For the SQuAD fine-tuning code, you can refer to https://github.com/huawei-noah/Pretrained-Language-Model/blob/master/AutoTinyBERT/superbert_run_en_classifier.py.

shairoz-deci commented 3 years ago

Thank you for your reply, @zwjyyc. I've implemented the SQuAD training based on the example you mentioned. Can you perhaps verify that the training on SQuAD was done the same way as for the GLUE tasks? That is, was the data multiplied by 20 using augmentation, then trained for 10 epochs without pred_distill and 3 more epochs with pred_distill? With this setup the training time seems very long, almost as long as the general distillation.
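For reference, here is a minimal sketch (not the repo's code) of the two-stage loss I followed, written against a hypothetical model interface that returns `(logits, attentions, hidden_states)`; the layer mapping, the assumption that the student hidden states are already projected to the teacher width, and the helper names are my own assumptions rather than anything taken from the repository:

```python
import torch
import torch.nn.functional as F

def soft_cross_entropy(student_logits, teacher_logits, temperature=1.0):
    """Soft cross-entropy between student and teacher logits (stage 2)."""
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    return -(t * s).sum(dim=-1).mean()

def distill_loss(student_out, teacher_out, pred_distill):
    """Two-stage TinyBERT-style task distillation loss.

    student_out / teacher_out: (logits, attentions, hidden_states),
    where hidden_states[0] is the embedding output. Assumes the student
    hidden states have already been projected to the teacher width.
    """
    s_logits, s_atts, s_hidden = student_out
    t_logits, t_atts, t_hidden = teacher_out

    if not pred_distill:
        # Stage 1: intermediate-layer distillation (attention + hidden-state
        # MSE), mapping every k-th teacher layer onto a student layer.
        k = len(t_atts) // len(s_atts)
        loss = F.mse_loss(s_hidden[0], t_hidden[0])  # embedding layer
        for i, (s_att, s_hid) in enumerate(zip(s_atts, s_hidden[1:])):
            loss = loss + F.mse_loss(s_att, t_atts[(i + 1) * k - 1])
            loss = loss + F.mse_loss(s_hid, t_hidden[(i + 1) * k])
        return loss

    # Stage 2: prediction-layer distillation on the logits
    # (for SQuAD, applied to the start and end logits).
    return soft_cross_entropy(s_logits, t_logits)
```

The schedule I used mirrors the GLUE recipe: roughly 20x augmented SQuAD data, 10 epochs with `pred_distill=False`, then 3 epochs with `pred_distill=True`.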

Thanks