So I initially thought this was an issue with our model head configs, but after actually inspecting the model state dict and poking around a little, I think this might be an issue with 🤗 Transformers versioning (newer versions appear to register `position_ids` as a non-persistent buffer, so it no longer shows up in saved state dicts). People are reporting similar issues here: https://github.com/huggingface/transformers/issues/24921
Can you try these two things:
1) Make sure your transformers version matches the one pinned in the requirements. If you have 4.1.x, you could try rolling back to the earliest 4.1 release.
2) See if you can just delete the `position_ids` keys from the model state dict. I looked at the values and they're just a range from 0 to the max position, so they may not need to be there, and the model might run without them (see the sketch below).
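A minimal sketch of that cleanup, assuming the checkpoint is a Lightning-style `.ckpt` file that stores its weights under a `'state_dict'` key (the file paths here are placeholders):

```python
import torch

# Hypothetical one-off cleanup: load the checkpoint, drop any stale
# position_ids buffers, and write the result back out.
ckpt = torch.load("model.ckpt", map_location="cpu")
state_dict = ckpt["state_dict"]
for key in [k for k in state_dict if k.endswith("position_ids")]:
    del state_dict[key]
torch.save(ckpt, "model_cleaned.ckpt")
```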
I had transformers 4.41.2. I set `transformers==4.1.0` (or `transformers==4.1.*`) in requirements.txt and ran `pip install -r requirements.txt`, but it failed anyway due to an issue building the wheel for tokenizers. I'm trying to solve this, but haven't succeeded so far (I understand this is a problem on my side).
I successfully ran a model (mweaswsd-ft) and got output like:

```
Writing output
Running scorer -------
P= 74.0%
R= 74.0%
F1= 74.0%
----------------------
Done
```
I did this by adding the following to `ContextDictionaryBiEncoder`:
```python
def on_load_checkpoint(self, checkpoint):
    # Remove the stale position_ids buffers from the checkpoint before
    # the state dict is loaded; newer transformers versions no longer
    # expect these keys.
    keys_to_delete = [
        'context_encoder.embeddings.position_ids',
        'definition_encoder.embeddings.position_ids',
    ]
    for key in keys_to_delete:
        if key in checkpoint['state_dict']:
            del checkpoint['state_dict'][key]
```
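(If I understand correctly, `on_load_checkpoint` is a PyTorch Lightning hook that runs before the state dict is restored, so removing the unexpected keys there avoids the key-mismatch error without modifying the checkpoint file on disk.)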
The score seems to match the one reported in the paper. Thank you!
Glad you were able to resolve the issue with #2! In that case I wouldn't worry about tweaking the transformers version; it's possible that other things changed on HF's end too (maybe there's a config file somewhere that gets downloaded separately from the pinned version, I'm not sure).
In any case, the issue seems to be solved for now.
I downloaded https://huggingface.co/Jotanner/mweaswsd-ft and tried to run the model for evaluation, but hit an error.
It seems that the model on Hugging Face has the keys "context_encoder.embeddings.position_ids" and "definition_encoder.embeddings.position_ids", while the model (checkpoint) produced by the current code doesn't. Is there an easy fix for this?
If not, I guess I should do all the tuning myself. But if possible, I'd like to run the model from Hugging Face to make sure I'm using the same model as the paper.
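For reference, a minimal sketch of how the two sets of keys can be compared (`diff_checkpoint_keys` is just an illustrative helper, assuming a Lightning-style checkpoint with a `'state_dict'` entry):

```python
import torch

def diff_checkpoint_keys(ckpt_path: str, model: torch.nn.Module) -> None:
    # Print the keys that the downloaded checkpoint and the locally
    # built model do not share, in both directions.
    ckpt = torch.load(ckpt_path, map_location="cpu")
    ckpt_keys = set(ckpt["state_dict"].keys())
    model_keys = set(model.state_dict().keys())
    print("Only in checkpoint:", sorted(ckpt_keys - model_keys))
    print("Only in model:", sorted(model_keys - ckpt_keys))
```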