Closed quocthang0507 closed 3 years ago
Hi, thanks a lot for the contribution, but I may not merge this. run_train.py needs separate collate functions for training and testing, so for consistency it might be better to keep collate() inside word_align() here as well.
Move collate() out of word_align() and change tokenizer.pad_token_id to modeling.PAD_ID. Additionally, I reformatted the code using the default style settings.
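
For illustration only, a module-level collate() along the lines proposed here might look roughly like the sketch below. It is not the code from this PR: the `PAD_ID = 0` constant is an assumed stand-in for modeling.PAD_ID, `pad_sequences` is a hypothetical helper, and plain Python lists are used instead of tensors so the snippet runs on its own.

```python
from typing import List, Tuple

# Assumed value, standing in for modeling.PAD_ID (the real value may differ).
PAD_ID = 0


def pad_sequences(seqs: List[List[int]], pad_id: int = PAD_ID) -> List[List[int]]:
    """Hypothetical helper: right-pad every sequence to the longest length."""
    max_len = max((len(s) for s in seqs), default=0)
    return [s + [pad_id] * (max_len - len(s)) for s in seqs]


def collate(
    examples: List[Tuple[List[int], List[int]]]
) -> Tuple[List[List[int]], List[List[int]]]:
    """Module-level collate: split (source, target) id pairs and pad each side."""
    ids_src = [src for src, _ in examples]
    ids_tgt = [tgt for _, tgt in examples]
    return pad_sequences(ids_src), pad_sequences(ids_tgt)
```

Because this version reads a module constant rather than a tokenizer captured from the enclosing function, it no longer has to be nested inside word_align(). The trade-off raised in the reply still applies: the training and testing collate functions in run_train.py would then follow a different convention.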