Yeah, I think the clean solution is authorized_extra_keys, but I could also just reconvert the models. We could also leave the warning.
What do you think, @sgugger?
IMHO, that warning makes the library look somewhat amateurish, as it makes the user wonder whether something is wrong, for absolutely no reason.
As I'm the one who is bothered by it: if I can be of help resolving this, please don't hesitate to delegate it to me.
The cleanest would be to reconvert the models and remove the keys we don't need, I think. Adding authorized_extra_keys works too, but relying on it too much could have unexpected consequences and result in bugs, so I'd only go down that road if there is no other option.
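For reference, a minimal sketch of what such an attribute could look like on the model class (the name authorized_extra_keys is the proposal above, not an existing attribute; the placement mirrors the existing authorized_missing_keys):

from transformers.modeling_bart import PretrainedBartModel

class BartForConditionalGeneration(PretrainedBartModel):
    # existing pattern: checkpoint keys allowed to be missing without a warning
    authorized_missing_keys = [r"final_logits_bias", r"encoder\.version", r"decoder\.version"]
    # hypothetical: checkpoint keys allowed to be present yet unused by the model,
    # silently dropped instead of being reported as "unexpected"
    authorized_extra_keys = [r"encoder\.version", r"decoder\.version"]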
The simplest and cleanest way would probably be to simply remove these two variables from the state dict, wouldn't it? If reconverting the checkpoint, you should check that it is exactly the same as the previous one, which sounds like more of a pain and more error-prone than simply doing:
!wget https://cdn.huggingface.co/facebook/bart-large/pytorch_model.bin

import torch

# load the downloaded checkpoint and drop the two fairseq bookkeeping entries
weights = torch.load('pytorch_model.bin', map_location='cpu')
del weights['encoder.version']
del weights['decoder.version']
torch.save(weights, 'new_pytorch_model.bin')
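As a sanity check (a sketch; from_pretrained accepts a state_dict override, so nothing needs reconverting to test this), loading the trimmed state dict should no longer print the unexpected-keys warning:

import torch
from transformers import BartForConditionalGeneration

# reload the trimmed checkpoint and feed it in via the state_dict override;
# the warning about encoder.version / decoder.version should be gone
weights = torch.load('new_pytorch_model.bin', map_location='cpu')
model = BartForConditionalGeneration.from_pretrained('facebook/bart-large', state_dict=weights)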
Done. Also converted the weights to fp16.
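For reference, a minimal sketch of that fp16 conversion (assuming the same trimmed checkpoint as above; only floating-point tensors are downcast):

import torch

weights = torch.load('new_pytorch_model.bin', map_location='cpu')
# halve only floating-point tensors; leave any integer buffers untouched
weights = {k: v.half() if v.is_floating_point() else v for k, v in weights.items()}
torch.save(weights, 'pytorch_model.fp16.bin')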
Using an example from the bart doc, https://huggingface.co/transformers/model_doc/bart.html#bartforconditionalgeneration, gives:
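For reference, the linked example looks roughly like this (a paraphrase; the exact code on that page may differ). Note the bare nonzero() call, which triggers the PyTorch >= 1.5 deprecation warning discussed in the next comment:

from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained('facebook/bart-large')
model = BartForConditionalGeneration.from_pretrained('facebook/bart-large')

TXT = "My friends are <mask> but they eat too many carbs."
input_ids = tokenizer([TXT], return_tensors='pt')['input_ids']
logits = model(input_ids)[0]

# find the position of the <mask> token; the bare nonzero() here is what
# PyTorch >= 1.5 warns about (it wants an explicit as_tuple argument)
masked_index = (input_ids[0] == tokenizer.mask_token_id).nonzero().item()
probs = logits[0, masked_index].softmax(dim=0)
values, predictions = probs.topk(5)
print(tokenizer.decode(predictions).split())

Passing the argument explicitly, i.e. .nonzero(as_tuple=False), silences that warning.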
Well, there is one more issue: the use of a weird deprecated nonzero() invocation, which has to do with some strange undocumented requirement to pass the as_tuple arg since PyTorch 1.5: https://github.com/pytorch/pytorch/issues/43425

We have authorized_missing_keys:

authorized_missing_keys = [r"final_logits_bias", r"encoder\.version", r"decoder\.version"]

https://github.com/huggingface/transformers/blob/master/src/transformers/modeling_bart.py#L942

which correctly updates missing_keys. Should there also be an authorized_unexpected_keys which would clean up unexpected_keys?

(note: I re-edited this issue once I understood it better, to save the reader's time; the history is there if someone needs it)
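A minimal sketch of what such an authorized_unexpected_keys could look like (the attribute is hypothetical, mirroring how authorized_missing_keys is applied to missing_keys in from_pretrained):

import re

# hypothetical counterpart to authorized_missing_keys
authorized_unexpected_keys = [r"encoder\.version", r"decoder\.version"]

# keys found in the checkpoint but not in the model, as reported by the warning
unexpected_keys = ['encoder.version', 'decoder.version', 'some.genuinely.unexpected.key']

# filter out the whitelisted patterns, leaving only real problems to warn about
for pat in authorized_unexpected_keys:
    unexpected_keys = [k for k in unexpected_keys if re.search(pat, k) is None]

print(unexpected_keys)  # only the genuinely unexpected key remains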
And I found another variety of it, for ['model.encoder.version', 'model.decoder.version'].
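Since the key names vary between checkpoints (encoder.version vs. model.encoder.version), a prefix-agnostic version of the cleanup above could look like this (a sketch, assuming the same torch.load/torch.save approach):

import torch

weights = torch.load('pytorch_model.bin', map_location='cpu')
# drop any fairseq version-bookkeeping entry regardless of its prefix
for key in [k for k in weights if k.endswith('.version')]:
    del weights[key]
torch.save(weights, 'new_pytorch_model.bin')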