thomas-happify closed this issue 3 years ago.
Hi @thomas-happify, there are some unused tokens (ids 250002-250036) in mMiniLM's vocab, and tokens 0-250001 are the same as XLM-R's. There are two ways to fix the issue. 1) You could refer to our fine-tuning example code on XNLI; that example code is not based on the AutoModel in Transformers, so you may need to modify your code. 2) You could remove the unused embeddings (ids 250002-250036) from the mMiniLM checkpoint in order to load the model.
Thanks
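In case it is useful, here is a minimal sketch of option 2, under a few assumptions: the checkpoint is a plain PyTorch state dict at `pytorch_model.bin`, and the vocab-sized matrices (word embeddings and, if present, a tied LM head) have 250037 rows. Inspect `state_dict.keys()` to confirm the actual parameter names and path before using this.

```python
import torch

XLMR_VOCAB_SIZE = 250002          # ids 0-250001 match the XLM-R tokenizer
MINILM_VOCAB_SIZE = 250002 + 35   # 35 unused rows: ids 250002-250036

# Path is an assumption; point this at the actual mMiniLM checkpoint file.
ckpt_path = "multilingual-minilm-l12-h384/pytorch_model.bin"
state_dict = torch.load(ckpt_path, map_location="cpu")

for key in list(state_dict):
    tensor = state_dict[key]
    # Trim the 35 unused rows from any parameter whose first dimension is the
    # oversized vocab (word embeddings, tied output projections, ...).
    if (isinstance(tensor, torch.Tensor)
            and tensor.dim() >= 1
            and tensor.size(0) == MINILM_VOCAB_SIZE):
        state_dict[key] = tensor[:XLMR_VOCAB_SIZE].clone()

torch.save(state_dict, "multilingual-minilm-l12-h384/pytorch_model_trimmed.bin")
```

If the accompanying config still lists vocab_size as 250037, it would also need to be changed to 250002 so the trimmed weights load without a shape mismatch.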
@WenhuiWang0824 Thanks a lot! Do you mind explaining why exactly mMiniLM had extra tokens? I just want to understand thoroughly.
Thanks!
Describe the bug
Model I am using (UniLM, MiniLM, LayoutLM ...): mMiniLM
The mMiniLM embedding layer and the tokenizer have different sizes.
To Reproduce
Steps to reproduce the behavior:
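A minimal snippet that should surface the mismatch (the hub id below is an assumption; substitute wherever your mMiniLM checkpoint actually lives):

```python
from transformers import AutoModel, AutoTokenizer

name = "microsoft/Multilingual-MiniLM-L12-H384"  # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

print(len(tokenizer))                               # tokenizer size: 250002
print(model.get_input_embeddings().num_embeddings)  # embedding rows: 250037
```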
Expected behavior
Shouldn't the embedding vocab_size be equal to the tokenizer size?