graykode / nlp-tutorial

Natural Language Processing Tutorial for Deep Learning Researchers
https://www.reddit.com/r/MachineLearning/comments/amfinl/project_nlptutoral_repository_who_is_studying/
MIT License
14.03k stars 3.9k forks

Problem with BERT batch generation #45

Open aqibsaeed opened 4 years ago

aqibsaeed commented 4 years ago

There is a problem with the padding on lines 73-75. What if the sentence length is larger than maxlen? Then we end up with sequences of varying length, and line 214 throws an error.
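To make the failure mode concrete, here is a minimal sketch. It assumes the padding step computes `maxlen - len(input_ids)` and extends the list with that many zeros; the helper name `pad_to_maxlen` and the token ids are hypothetical and not taken from the repository. In Python, multiplying a list by a negative number yields an empty list, so an over-long sequence passes through unpadded and untruncated.

```python
def pad_to_maxlen(input_ids, maxlen):
    """Zero-pad a copy of input_ids up to maxlen (hypothetical helper)."""
    padded = list(input_ids)
    n_pad = maxlen - len(padded)
    # If n_pad is negative, [0] * n_pad is [], so nothing is appended
    # and the sequence keeps its original, over-long length.
    padded.extend([0] * n_pad)
    return padded


short = pad_to_maxlen([1, 5, 6, 2], maxlen=6)
long = pad_to_maxlen([1, 5, 6, 7, 8, 9, 10, 2], maxlen=6)

print(len(short), len(long))  # prints "6 8"
```

A batch containing both results has rows of different lengths, which is the kind of input that fails when the batch is converted to a single tensor.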

Soothysay commented 1 year ago

You cannot have sentences with length greater than maxlen. When a sentence is shorter than maxlen, the code applies padding.
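If over-long inputs can occur, one possible guard is to truncate before padding so every sequence comes out at exactly `maxlen`. The sketch below is an assumption about how that could look, not the repository's code; `fit_to_maxlen` and `pad_id` are hypothetical names. Note that plain truncation can cut off a trailing separator token, so a real fix may need to re-append it.

```python
def fit_to_maxlen(input_ids, maxlen, pad_id=0):
    """Truncate to maxlen, then pad with pad_id (hypothetical helper)."""
    clipped = list(input_ids[:maxlen])                   # drop tokens past maxlen
    clipped.extend([pad_id] * (maxlen - len(clipped)))   # pad count is never negative here
    return clipped


batch = [
    fit_to_maxlen([1, 5, 6, 2], maxlen=6),
    fit_to_maxlen([1, 5, 6, 7, 8, 9, 10, 2], maxlen=6),
]

# Every row now has the same length, so the batch is rectangular.
print([len(row) for row in batch])  # prints "[6, 6]"
```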