Closed: Iqra840 closed this issue 2 years ago
Edit: It works for now, as I set the parameter in the resize method to the checkpoint's vocabulary size rather than the tokeniser's, but may I still know why this is happening?
Hi, this is due to the number of tokens in the loaded checkpoint being different from that of the model as defined in the pl_module. The model probably had the smaller vocabulary (50268) when the pl_module was instantiated, versus the larger vocabulary (50278) when the checkpoint was saved. Sorry for the inconvenience. Either initialising the model with the same number of tokens as it had when the checkpoint was saved, or the fix you suggest, should work.
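For anyone else hitting this, here is a minimal sketch of the workaround discussed above: resize the embeddings to the vocabulary size stored in the checkpoint, not to `len(tokenizer)`, before loading the weights. The base model name, checkpoint path, and state-dict key are placeholders for illustration, not details taken from this thread; adjust them to your setup.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Placeholder names; swap in your actual base model and checkpoint path.
tokenizer = AutoTokenizer.from_pretrained("base-model")
model = AutoModelForMaskedLM.from_pretrained("base-model")

state_dict = torch.load("checkpoint.ckpt", map_location="cpu")["state_dict"]

# Infer the vocab size the checkpoint was saved with from the shape of its
# embedding matrix. The key below is an assumption; inspect your own
# state_dict to find the right one.
embedding_key = "model.embeddings.word_embeddings.weight"
ckpt_vocab_size = state_dict[embedding_key].shape[0]

# Resize the freshly instantiated model to match the checkpoint (e.g. 50278)
# before loading. Resizing to len(tokenizer) fails here because extra tokens
# were added after the tokenizer/model pair was first created.
model.resize_token_embeddings(ckpt_vocab_size)

# Strip the Lightning "model." prefix so the keys match the bare HF model.
model.load_state_dict({k.removeprefix("model."): v for k, v in state_dict.items()})
```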
How did you solve it? (same issue)
When I try to run model_saving.py to save the model in a Hugging Face Transformers format, I get the following error and am not sure how to resolve it. Is there an issue with my training, or is one of my packages incompatible? Thank you for your help!