omerarshad opened this issue 2 years ago
Why is `max_position_embeddings` different in SBERT than in BERT?
Can you point to the respective code?
Some models set `max_position_embeddings` to 514 rather than 512 to account for special/offset tokens, so the usable sequence length is still 512.
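A minimal sketch of where the extra two slots come from, assuming a RoBERTa-style model (the config values below are illustrative, not fetched from any checkpoint): position ids are offset by `padding_idx + 1`, so two position-embedding rows are reserved and the effective maximum input length stays at 512.

```python
# Illustrative config values (assumptions, not loaded from the Hub):
bert_max_position_embeddings = 512     # typical BERT config
roberta_max_position_embeddings = 514  # typical RoBERTa-style config

# RoBERTa-style models start position ids at padding_idx + 1,
# reserving the first two embedding rows (0 and 1).
padding_idx = 1
usable_length = roberta_max_position_embeddings - (padding_idx + 1)

print(usable_length)  # 512 — same usable length as BERT
```

So despite the different `max_position_embeddings` values in the configs, both model families accept the same number of real tokens.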