Closed · pvcastro closed this issue 3 years ago
Hi @markus-eberts! Congratulations on your work!

I have a question... I was trying to debug your code and couldn't quite find the point at which text sequences longer than 512 tokens are fed to the BERT/transformer model. How exactly are you handling this? Where in the code is it?

Thanks!

Hi, the part of the code you are looking for is in `jerex/models/__init__.py`. In the `create_model` method we resize BERT's position embedding matrix, which means the position embeddings for positions beyond 512 tokens are trained from scratch.

Thanks!
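For anyone trying to reproduce this outside JEREX, below is a minimal sketch of the general technique with Hugging Face `transformers` and PyTorch. This is not JEREX's actual code: the helper name `extend_position_embeddings` and the target length of 1024 are my own illustrative choices. The idea is simply to allocate a larger position embedding matrix, copy over the pretrained rows, and leave the extra rows randomly initialized so they get trained from scratch during fine-tuning:

```python
import torch
from torch import nn
from transformers import BertModel


def extend_position_embeddings(model: BertModel, new_max_positions: int) -> None:
    # Grow BERT's position embedding matrix beyond its pretrained limit
    # (usually 512). Pretrained rows are kept; the new rows start from a
    # random init and must be learned from scratch during fine-tuning.
    old_embeddings = model.embeddings.position_embeddings
    old_max_positions, hidden_size = old_embeddings.weight.shape

    new_embeddings = nn.Embedding(new_max_positions, hidden_size)
    # Match BERT's usual weight init for the new rows (N(0, initializer_range)).
    new_embeddings.weight.data.normal_(mean=0.0, std=model.config.initializer_range)
    with torch.no_grad():
        new_embeddings.weight[:old_max_positions] = old_embeddings.weight

    model.embeddings.position_embeddings = new_embeddings
    model.config.max_position_embeddings = new_max_positions

    # Recent transformers versions cache position/token-type ids as buffers;
    # enlarge them too, or the forward pass will index out of range.
    if hasattr(model.embeddings, "position_ids"):
        model.embeddings.position_ids = torch.arange(new_max_positions).expand((1, -1))
    if hasattr(model.embeddings, "token_type_ids"):
        model.embeddings.token_type_ids = torch.zeros(
            (1, new_max_positions), dtype=torch.long
        )


model = BertModel.from_pretrained("bert-base-cased")
extend_position_embeddings(model, 1024)
```

Note that the new positions carry no pretrained signal, so a model extended this way typically needs additional fine-tuning on long documents before those positions are useful.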