vietai / ViT5


Fine-tune vit5-base model for text summarization #6

Closed · MinhDang685 closed this 2 years ago

MinhDang685 commented 2 years ago

Hello VietAI team,

Thanks for sharing the pretrained models in your research paper. I am interested in fine-tuning the VietAI/vit5-base language model for the abstractive summarization task. I have some questions:

  1. When I run your example here, unlike @r1ckC139 in #1, who got random sequences, I always get a fixed-length (= max_length) array that never changes. I have tried modifying the input (with a "vi: " / "vietnews: " prefix, and without a prefix), but the result stays the same (see the sketch at the end of this comment for the kind of call I am making). Could you take a look?
  2. In the fine-tuning phase, do I need to preprocess the data by adding a prefix?

Thanks a lot
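For reference, a minimal sketch of the kind of generation call under discussion, assuming the public VietAI/vit5-base-vietnews-summarization checkpoint named later in this thread and standard transformers APIs (the article text and length limits are illustrative, not from the repo's notebook):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("VietAI/vit5-base-vietnews-summarization")
model = AutoModelForSeq2SeqLM.from_pretrained("VietAI/vit5-base-vietnews-summarization")

article = "..."  # placeholder for a Vietnamese news article to summarize
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=1024)

# Beam search with early stopping; a degenerate setup can otherwise pad
# the output out to max_length with repeated tokens.
output_ids = model.generate(
    inputs["input_ids"],
    max_length=256,
    num_beams=4,
    early_stopping=True,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```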

justinphan3110 commented 2 years ago

Hi @MinhDang685, for MLM pretraining we used mesh-tensorflow. The models on HuggingFace are ready for fine-tuning only.

You don't need to add a prefix when fine-tuning.
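A minimal preprocessing sketch under that advice: the source text is tokenized as-is, with no task prefix. The dataset field names (`article`, `summary`) and length limits here are assumptions for illustration, not from the repo:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("VietAI/vit5-base")

def preprocess(example):
    # No prefix: the raw article text is the encoder input.
    model_inputs = tokenizer(example["article"], max_length=1024, truncation=True)
    # The reference summary becomes the decoder labels.
    labels = tokenizer(text_target=example["summary"], max_length=256, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```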

MinhDang685 commented 2 years ago

Hi @justinphan3110, thanks for your quick reply.

Thanks

justinphan3110 commented 2 years ago

@MinhDang685

MinhDang685 commented 2 years ago

Hi @justinphan3110, thanks for your help. I tried generating with the model again and it works now; the output sequences change based on the input.

I notice that you have updated the model's config.json by removing the task-specific prefixes. Was that the cause of the issue, i.e., that I was missing the "summarization" prefix before the input to tell the model to perform the summarization task?
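One way to check what the updated config declares is to read `task_specific_params`, the standard T5-style config field that carries per-task prefixes (checkpoint name assumed):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("VietAI/vit5-base")
# None (or a dict without a "summarization" entry) means no per-task
# prefix is declared in the checkpoint's config.json anymore.
print(config.task_specific_params)
```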

justinphan3110 commented 2 years ago

@MinhDang685, you need the vietnews: prefix for VietAI/vit5-large-vietnews-summarization. For VietAI/vit5-base-vietnews-summarization you don't need any prefix.
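In input-construction terms, that rule amounts to the following (the article string is a placeholder):

```python
article = "..."  # the Vietnamese news article to summarize

text_large = "vietnews: " + article  # prefix required for VietAI/vit5-large-vietnews-summarization
text_base = article                  # VietAI/vit5-base-vietnews-summarization takes the raw text
```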

You can have a look at the eval scripts with HuggingFace.
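For a rough local check, ROUGE scores can be computed with the `evaluate` library; this is a generic sketch, not the repo's exact eval script:

```python
import evaluate  # the rouge metric also needs the rouge_score package installed

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["..."],  # model-generated summaries (placeholders here)
    references=["..."],   # reference (gold) summaries
)
print(scores)  # ROUGE-1 / ROUGE-2 / ROUGE-L scores
```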

MinhDang685 commented 2 years ago

Thank you @justinphan3110 for pointing that out.