allenai / PRIMER

The official code for PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization
Apache License 2.0
150 stars 31 forks

pre-training PRIMERA #13

Open haisonle001 opened 2 years ago

haisonle001 commented 2 years ago

I'm trying to pre-train PRIMERA on the processed NewsHead dataset. Could you give me a bit more detail on how to implement it?

theanhle commented 2 years ago

Nobody is going to answer a question this general :)

JohnGiorgi commented 2 years ago

> I'm trying to pre-train PRIMERA on the processed NewsHead dataset. Could you give me a bit more detail on how to implement it?

I modified the run_summarization.py script from HF Transformers so it works with PRIMERA. You can check it out here: https://github.com/allenai/PRIMER/issues/6#issuecomment-1191718124 (note I am not one of the PRIMERA authors)
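For readers wondering what adapting a generic summarization script to PRIMERA roughly involves, here is a minimal sketch of the input formatting only. It is not the linked script and not the authors' pre-training code. It assumes the documents in a cluster are joined with PRIMERA's `<doc-sep>` token and that global attention goes on the first token and on every `<doc-sep>` token. The helper names `join_documents` and `global_attention_mask` are hypothetical, and `doc_sep_id` stands for whatever ID the tokenizer assigns to `<doc-sep>`.

```python
from typing import List

# Separator token placed between documents (assumed to exist in the tokenizer vocab).
DOC_SEP = "<doc-sep>"


def join_documents(docs: List[str], sep: str = DOC_SEP) -> str:
    """Concatenate one cluster of documents into a single source string.

    Whitespace inside each document is collapsed and empty documents are dropped.
    """
    cleaned = [" ".join(d.split()) for d in docs if d.strip()]
    return f" {sep} ".join(cleaned)


def global_attention_mask(input_ids: List[int], doc_sep_id: int) -> List[int]:
    """Build a mask where 1 means global attention and 0 means local attention.

    Global attention is set on the first token and on every document separator.
    """
    mask = [1 if tok == doc_sep_id else 0 for tok in input_ids]
    if mask:
        mask[0] = 1  # first token (e.g. <s>) also attends globally
    return mask
```

In a Transformers-based script, the joined string would be passed to the tokenizer and the mask supplied to the model as `global_attention_mask`. The pre-training objective itself (selecting and masking sentences) is a separate step that this sketch does not cover.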