Open carlosrokk3r opened 3 years ago
I have never encountered this issue. After `resize_token_embeddings()`, the trained model weights are loaded with `load_state`, which loads the trained embeddings, so there is no reason for them to change on every load.
Hi, I just trained my model locally and checked the results of my trained models against the ones in the `README`. I found that they are different. I believe this is because the embeddings of the previously mentioned tokens change every time the model is instantiated. For instance, with the same phrase, if I instantiate the model and predict, the output differs from the one I get the next time I instantiate the model and predict on the same phrase.

I believe that in the `__init__` method of the `infer_from_trained` class, the call to `resize_token_embeddings()` at line 83 of the `infer.py` file extends the embeddings to include the 4 extra tokens, but the new embeddings are initialized randomly, and this causes the results to vary.

Am I understanding it correctly? Or am I mistaken? Any help would be appreciated.
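
To make the two positions in this thread concrete, here is a minimal, dependency-free sketch of the mechanism being discussed. It is not the repository's code: `ToyModel`, `build`, `PRETRAINED`, and the sizes are hypothetical stand-ins, and it assumes (as the question does) that rows added by a resize are initialized randomly. It shows that the added rows differ between instantiations unless a checkpoint containing them is loaded after the resize.

```python
import random

EMBED_DIM = 4      # hypothetical embedding width
BASE_VOCAB = 6     # hypothetical size of the pretrained vocabulary
EXTRA_TOKENS = 4   # the 4 extra tokens mentioned in the question

# Fixed "pretrained" rows: identical on every instantiation.
_pretrained_rng = random.Random(0)
PRETRAINED = [
    [_pretrained_rng.gauss(0.0, 0.02) for _ in range(EMBED_DIM)]
    for _ in range(BASE_VOCAB)
]


class ToyModel:
    """Hypothetical stand-in for a model that owns a token embedding table."""

    def __init__(self):
        self.embeddings = [row[:] for row in PRETRAINED]
        self._rng = random.Random()  # unseeded, like a fresh random init

    def resize_token_embeddings(self, new_size):
        # Existing rows are kept; each added row gets fresh random values.
        while len(self.embeddings) < new_size:
            self.embeddings.append(
                [self._rng.gauss(0.0, 0.02) for _ in range(EMBED_DIM)]
            )

    def state_dict(self):
        return {"embeddings": [row[:] for row in self.embeddings]}

    def load_state_dict(self, state):
        # Shapes must match, which is why the resize has to happen first.
        if len(state["embeddings"]) != len(self.embeddings):
            raise ValueError("size mismatch: resize before loading")
        self.embeddings = [row[:] for row in state["embeddings"]]


def build(checkpoint=None):
    """Instantiate, resize, and optionally load trained weights."""
    model = ToyModel()
    model.resize_token_embeddings(BASE_VOCAB + EXTRA_TOKENS)
    if checkpoint is not None:
        model.load_state_dict(checkpoint)
    return model


# Pretend this model was trained, then saved.
checkpoint = build().state_dict()

# Case 1: the checkpoint is loaded after resizing (the maintainer's claim).
a, b = build(checkpoint), build(checkpoint)
same_with_checkpoint = a.embeddings == b.embeddings

# Case 2: no checkpoint is loaded (the behavior the question describes).
c, d = build(), build()
same_base_rows = c.embeddings[:BASE_VOCAB] == d.embeddings[:BASE_VOCAB]
same_new_rows = c.embeddings[BASE_VOCAB:] == d.embeddings[BASE_VOCAB:]

print("identical with checkpoint:", same_with_checkpoint)
print("base rows identical without checkpoint:", same_base_rows)
print("new rows identical without checkpoint:", same_new_rows)
```

Under these assumptions, both observations can be true at once: the resize does create random rows, but they only affect predictions if the trained weights are not loaded afterwards or the checkpoint does not contain the resized embedding rows. If outputs vary between loads, those are the two things worth checking first.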