GabrielLin closed this issue 2 years ago
Hi there,
Yes, it can accept 16k input. However, the models on HF were only pretrained with max_length=4096, so they have no trained position embeddings for tokens beyond that point. If you would like to use a larger max_length, you can follow the same method used in Longformer-Encoder-Decoder, i.e. simply copy the position embeddings four times, then fine-tune the model with the new position embeddings.
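For reference, the embedding-copying trick described above can be sketched as follows. This is a minimal PyTorch sketch: `extend_position_embeddings` is a hypothetical helper, not part of the `transformers` API, and real LED/PRIMERA checkpoints also reserve offset slots for padding positions that a production version would need to handle.

```python
import torch
import torch.nn as nn

def extend_position_embeddings(embed: nn.Embedding, new_max: int) -> nn.Embedding:
    # Hypothetical helper: tile the learned position-embedding table so the
    # model can address positions beyond its pretrained max_length.
    old_max, dim = embed.weight.shape
    assert new_max % old_max == 0, "new_max must be a multiple of the old max"
    repeats = new_max // old_max
    extended = nn.Embedding(new_max, dim)
    with torch.no_grad():
        # Copy the pretrained table `repeats` times along the position axis.
        extended.weight.copy_(embed.weight.repeat(repeats, 1))
    return extended

# Toy example: a 4096-position table tiled four times covers 16384 positions.
old = nn.Embedding(4096, 8)
new = extend_position_embeddings(old, 16384)
print(new.weight.shape)  # torch.Size([16384, 8])
```

After swapping in the extended table you would also update the relevant config field (e.g. the encoder's max position setting) and fine-tune, since the copied embeddings are only a warm start, not a trained solution.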
Dear @Wendy-Xiao, thank you for your answer. That resolves my concern.
@GabrielLin Curious if you have built a model with max_length=16384. I'm summarising lots of documents at once, and frequently have ~10,000 tokens in total 🙂
I mean, if you have trained such a model, it would be really cool if you could publish it on HuggingFace.
Hi @attekei . Thank you for your interest. I am just doing research to compare different models.
Could you please tell me whether the models on HF (https://huggingface.co/allenai/PRIMERA, https://huggingface.co/allenai/PRIMERA-arxiv) can accept 16k input? Can I just set max_length to 16384 to let them accept such a long document? Thanks.