Dear Authors, there is a bug in the token_type_ids produced by the BERT tokenizer: it adds an extra token, which leads to a dimension mismatch between input_ids and token_type_ids. Since it appears you haven't used token_type_ids for pretraining/finetuning, this bug may not have shown up. Kindly fix this issue.
Here is a minimal code example to reproduce the issue.
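(A sketch of what such a reproduction might look like, assuming the repository's tokenizer follows the Hugging Face tokenizer interface; the checkpoint path below is a placeholder, not the actual one from this repo.)

```python
# Hypothetical reproduction sketch: swap in the repository's own tokenizer
# class / checkpoint path in place of the placeholders below.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/bert-tokenizer")  # placeholder path

encoded = tokenizer("The quick brown fox jumps over the lazy dog.")

# Expected: both sequences have the same length.
# Observed: token_type_ids contains one extra entry.
print(len(encoded["input_ids"]))
print(len(encoded["token_type_ids"]))

assert len(encoded["input_ids"]) == len(encoded["token_type_ids"]), (
    "token_type_ids has one more token than input_ids"
)
```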