I'll see if I can fix the warning, but you should not worry: after we load XLM-R from Hugging Face, we reload the final weights from our final checkpoints.
To be fair, this is a bit inefficient, as we are loading XLM-R twice, but what counts in the end are the weights from the checkpoint downloaded from the list of available models.
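A minimal sketch of that two-step loading, in case it clarifies where the warning comes from. The checkpoint path and the `encoder.model.` state-dict prefix here are illustrative assumptions, not guaranteed to match COMET's actual internals:

```python
import torch
from transformers import XLMRobertaModel

# Step 1: build the encoder from the Hugging Face checkpoint.
# The "weights ... were not used" warnings come from this step.
encoder = XLMRobertaModel.from_pretrained("xlm-roberta-large")

# Step 2: overwrite the encoder weights with the fine-tuned ones
# from the downloaded COMET checkpoint (illustrative path and keys).
ckpt = torch.load("checkpoints/model.ckpt", map_location="cpu")
prefix = "encoder.model."  # assumed key prefix inside the checkpoint
encoder_state = {
    key[len(prefix):]: value
    for key, value in ckpt["state_dict"].items()
    if key.startswith(prefix)
}
encoder.load_state_dict(encoder_state)
```

Whatever the warnings say during step 1, the weights that are actually used are the ones loaded in step 2.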
Thanks for the reply! I want to ask one more question: when evaluating English->Chinese, should the source/hypothesis/reference files be tokenized or detokenized?
The text should always be detokenized; we run our own tokenization (in most cases, the XLM-R tokenizer).
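If your files were pre-tokenized (for example with the Moses tokenizer), detokenize them before scoring. A small sketch using sacremoses; assuming Moses-style tokenization here is my own example, not a COMET requirement:

```python
from sacremoses import MosesDetokenizer

detok = MosesDetokenizer(lang="en")

# Moses-style tokenization leaves punctuation as separate tokens:
tokens = ["Hello", ",", "world", "!"]

# COMET should receive the detokenized form instead:
print(detok.detokenize(tokens))  # -> "Hello, world!"
```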
By the way, as a sanity check I just tried:

```python
from transformers import XLMRobertaModel

model = XLMRobertaModel.from_pretrained("xlm-roberta-large")
```
and the warnings are exactly the same. This is the expected behaviour: the xlm-roberta-large checkpoint on Hugging Face belongs to the XLMRobertaForMaskedLM model class, but I don't care about the prediction head in COMET, so I instead initialize a plain XLMRobertaModel (the base architecture, without the LM head).
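For comparison, loading the same checkpoint into the class it was trained as consumes the lm_head.* weights, so those particular warnings should disappear. A sanity-check sketch using the same public transformers API as above:

```python
from transformers import XLMRobertaForMaskedLM, XLMRobertaModel

# The class xlm-roberta-large was trained as: the lm_head.* weights
# have a destination here, so they load without complaint.
mlm = XLMRobertaForMaskedLM.from_pretrained("xlm-roberta-large")

# The plain encoder COMET uses: lm_head.* has nowhere to go,
# hence the "weights ... were not used" warning.
base = XLMRobertaModel.from_pretrained("xlm-roberta-large")
```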
Great! That helps a lot!
❓ Questions and Help
What is your question?
When I run `comet-score -s test.en-zh.en -t decoder-out -r test.en-zh.zh`, I get the following warnings. Is this normal, or am I missing something?
```
/root/.cache/torch/unbabel_comet/wmt20-comet-da//checkpoints/model.ckpt
Some weights of the model checkpoint at xlm-roberta-large were not used when initializing XLMRobertaModel: ['lm_head.bias', 'roberta.pooler.dense.weight', 'roberta.pooler.dense.bias', 'lm_head.dense.weight', 'lm_head.layer_norm.bias', 'lm_head.decoder.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.weight']
- This IS expected if you are initializing XLMRobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing XLMRobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Encoder model frozen.
/usr/local/python3/lib/python3.8/site-packages/torch/nn/modules/container.py:435: UserWarning: Setting attributes on ParameterList is not supported.
  warnings.warn("Setting attributes on ParameterList is not supported.")
GPU available: True, used: True
```
What's your environment?