vietai / ViT5

MIT License

Question about evaluating the VietAI/vit5-large-vietnews-summarization checkpoint on the Vietnews dataset #4

Closed BinhMinhs10 closed 2 years ago

BinhMinhs10 commented 2 years ago

[Image: Screenshot from 2022-08-19 07-48-36]

I tried evaluating on the Vietnews dataset with Hugging Face's run_summarization code (with source_prefix set to "vietnews: "), but for some reason rouge2, rougeL, and the other scores come out very low.

justinphan3110 commented 2 years ago

Hi @BinhMinhs10, can you share the scripts to reproduce this?

BinhMinhs10 commented 2 years ago

https://github.com/huggingface/transformers/blob/main/examples/pytorch/summarization/run_summarization.py I used this script and removed only the nltk part in the postprocess_text function.
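For readers following along, here is a minimal sketch of what postprocess_text might look like once the nltk sentence splitting is removed. This is an assumption about the change described above, not the poster's actual code. In the upstream script, the nltk step joins sentences with newlines because rougeLsum expects one sentence per line, so dropping it may change that particular score.

```python
def postprocess_text(preds, labels):
    """Strip surrounding whitespace only (assumed variant without nltk)."""
    # The upstream script additionally splits each text into sentences with
    # nltk and joins them with "\n"; that step is omitted in this sketch.
    preds = [pred.strip() for pred in preds]
    labels = [label.strip() for label in labels]
    return preds, labels


if __name__ == "__main__":
    # Hypothetical sample strings, used only for illustration.
    p, l = postprocess_text(["  Câu một. Câu hai. "], ["Tóm tắt tham chiếu.\n"])
    print(p, l)
```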

justinphan3110 commented 2 years ago

Hi @BinhMinhs10, sorry for the late reply. We have just updated our eval scripts here: https://github.com/vietai/ViT5/blob/main/eval/Eval_vietnews_sum.ipynb

Please note that we fine-tuned the model for this task with a "vietnews: " prefix. We are working on an updated version that does not need this prefix. For now, you need to prepend "vietnews: " to every input sequence.
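A minimal sketch of prepending the prefix before generation, assuming the standard Hugging Face transformers seq2seq API. The helper names add_prefix and summarize, the input truncation length of 1024 and the output max_length of 256 are illustrative assumptions, not values confirmed in this thread. The linked notebook remains the reference for the actual evaluation setup.

```python
PREFIX = "vietnews: "  # task prefix named in this thread


def add_prefix(article: str, prefix: str = PREFIX) -> str:
    """Prepend the task prefix unless the text already starts with it."""
    article = article.strip()
    return article if article.startswith(prefix) else prefix + article


def summarize(article: str,
              model_name: str = "VietAI/vit5-large-vietnews-summarization",
              max_length: int = 256) -> str:
    """Generate a summary for one article (hypothetical helper)."""
    # Imported lazily so add_prefix works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    # Truncation length is an assumption; adjust to your setup.
    inputs = tokenizer(add_prefix(article), return_tensors="pt",
                       truncation=True, max_length=1024)
    output_ids = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

When using run_summarization.py instead, the same effect should come from its source_prefix argument, provided the value includes the trailing space.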

BinhMinhs10 commented 2 years ago

Great, thanks to @justinphan3110 for pointing this out and suggesting a solution 🚀