GabrielLin closed this issue 2 years ago
Hi there,
Yes, it can accept 16k input. However, the models on HF were only pretrained with max_length=4096, so they have no trained position embeddings for tokens beyond that point. If you would like to use a larger max_length, you can follow the same method used in Longformer-Encoder-Decoder, i.e. simply copy the position embeddings four times, then fine-tune the model with the new position embeddings.
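For reference, the embedding-copying trick described above can be sketched as follows. This is a minimal PyTorch sketch: `extend_position_embeddings` is a hypothetical helper, not part of the `transformers` API, and real LED/PRIMERA checkpoints also reserve offset slots for padding positions that a production version would need to handle.

```python
import torch
import torch.nn as nn

def extend_position_embeddings(embed: nn.Embedding, new_max: int) -> nn.Embedding:
    # Hypothetical helper: tile the learned position-embedding table so the
    # model can address positions beyond its pretrained max_length.
    old_max, dim = embed.weight.shape
    assert new_max % old_max == 0, "new_max must be a multiple of the old max"
    repeats = new_max // old_max
    extended = nn.Embedding(new_max, dim)
    with torch.no_grad():
        # Copy the pretrained table `repeats` times along the position axis.
        extended.weight.copy_(embed.weight.repeat(repeats, 1))
    return extended

# Toy example: a 4096-position table tiled four times covers 16384 positions.
old = nn.Embedding(4096, 8)
new = extend_position_embeddings(old, 16384)
print(new.weight.shape)  # torch.Size([16384, 8])
```

After swapping in the extended table you would also update the relevant config field (e.g. the encoder's max position setting) and fine-tune, since the copied embeddings are only a warm start, not a trained solution.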
Dear @Wendy-Xiao, thank you for your answer. That resolves my concern.
@GabrielLin Curious if you have built a model with max_length=16384. I'm summarising lots of documents at once, and frequently have ~10,000 tokens in total 🙂
I mean, if you have trained such a model, it would be really cool if you could publish it on HuggingFace.
Hi @attekei . Thank you for your interest. I am just doing research to compare different models.
Could you please tell me whether the models on HF (https://huggingface.co/allenai/PRIMERA, https://huggingface.co/allenai/PRIMERA-arxiv) can accept 16k input? Can I just set max_length to 16384 to let them accept such a long document? Thanks.