Open KailashDN opened 3 years ago
The combined input length can be at most 512 word pieces.
Encoding more is only possible with models that were trained for more than 512 word pieces, but those then have a limit of e.g. 1024 or 4096.
Or you use a sliding window approach.
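To make the sliding window idea concrete, here is a minimal sketch of chunking a long token sequence into overlapping windows so each chunk fits the model's limit. The function name, window size, and stride are illustrative assumptions, not part of any library API:

```python
# Hedged sketch: split a long token list into overlapping windows so that each
# window (plus the question and special tokens) fits a 512-token cross-encoder.
def sliding_windows(tokens, window_size, stride):
    """Yield overlapping slices of `tokens`. Names here are illustrative."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + window_size])
        # Stop once the current window reaches the end of the sequence.
        if start + window_size >= len(tokens):
            break
        start += stride
    return windows
```

Each window is then scored against the question separately, and the passage score is usually taken as the maximum over its windows.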
What is the max_seq_length that was used to train the models that produced the results shown in the CE benchmark?
The max_seq_length was indicated for sentence embeddings: "The following models have been tuned to embed sentences and short paragraphs up to a length of 128 word pieces."
@seahrh The Cross-Encoders were trained with 512 word pieces: https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/ms_marco/train_cross-encoder-v2.py
Hi, I am using a cross-encoder for question-answer re-ranking. In a cross-encoder, we pass the question and answer together. So must both together fit within the 512-token limit, or do we create embeddings separately for the question and the answer, each with its own limit of 512 tokens?
My understanding: <[CLS] question [SEP] answer [PAD][PAD]....[SEP]>, or in my case <[CLS] question [SEP] topic title [SEP] answer [PAD][PAD]....[SEP]>, should in both cases stay within 512? Is there a way to encode more than 512 tokens apart from the sliding window approach?
Thank you.
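To illustrate the shared budget implied by the format above: the question and answer are packed into one sequence, so their lengths plus the special tokens must together stay under the limit. This is a minimal sketch, assuming a BERT-style 512-token limit and three special tokens; the names are illustrative:

```python
# Hedged sketch: question and answer share ONE token budget in a cross-encoder,
# because they are concatenated into a single input:
#   [CLS] question [SEP] answer [SEP]
MAX_LEN = 512  # assumption: typical BERT-style limit; check your model's config

def fits_budget(question_tokens, answer_tokens, max_len=MAX_LEN):
    # 3 special tokens: [CLS], [SEP] after the question, and the final [SEP]
    total = 1 + len(question_tokens) + 1 + len(answer_tokens) + 1
    return total <= max_len
```

If the pair does not fit, the usual options are truncating the answer or scoring overlapping answer windows separately, as mentioned above.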