nghuyong / ERNIE-Pytorch

ERNIE Pytorch Version

vocab_size in "config.json" is not equal to the length of "vocab.txt" #11

Closed: linjianz closed this issue 5 years ago

linjianz commented 5 years ago

Hello, thank you so much for sharing this. When I test the converted ERNIE model with pytorch-transformers, its performance on a cloze task is very poor. I found that vocab_size in "config.json" is 18000, but "vocab.txt" contains only 17964 tokens, so I wonder whether this mismatch, rather than the model parameters, is causing the poor performance.
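For reference, a quick way to check the mismatch (a minimal sketch, assuming config.json and vocab.txt are in the current directory):

import json

# Compare config.json's vocab_size with the number of lines in vocab.txt
with open('config.json', encoding='utf-8') as f:
    vocab_size = json.load(f)['vocab_size']
with open('vocab.txt', encoding='utf-8') as f:
    num_tokens = sum(1 for _ in f)

print(vocab_size, num_tokens)  # reported above: 18000 vs 17964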

nghuyong commented 5 years ago

Wow, can you report the results on your task with the vocab size set to 18000 versus 17964?

linjianz commented 5 years ago

Thank you for your reply. I've found that the problem is not caused by vocab_size. It is caused by the dense weights in BertPredictionHeadTransform, which do not seem to be defined in the converted ERNIE model. So I guess we should change is_prediction=False on line 20 of convert_ernie_to_pytorch.py to is_prediction=True and generate a new PyTorch model? However, when I tried that I ran into the following problem:

===================extract weights start====================
Traceback (most recent call last):
  File "convert_ernie_to_pytorch.py", line 201, in <module>
    state_dict = extract_weights(args)
  File "convert_ernie_to_pytorch.py", line 159, in extract_weights
    fluid_array = np.array(fluid_tensor, dtype=np.float32)
TypeError: __array__(): incompatible function arguments. The following argument types are supported:
    1. (self: paddle.fluid.core_avx.Tensor) -> array

Invoked with: <paddle.fluid.core_avx.LoDTensor object at 0x7f6c489ecc70>, dtype('float32')

linjianz commented 5 years ago

pip install paddlepaddle-gpu==1.4.0.post87 solved the TypeError problem.
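If upgrading Paddle is not an option, the signature quoted in the traceback also suggests a workaround (a sketch, untested against that Paddle version): call np.array without a dtype argument and cast afterwards.

import numpy as np

# __array__ on the LoDTensor accepts no dtype argument (per the traceback),
# so convert first and cast to float32 in a second step.
fluid_array = np.array(fluid_tensor).astype(np.float32)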

nghuyong commented 5 years ago

Thanks, maybe you can open a PR to fix this problem.

linjianz commented 5 years ago

I found that ERNIE has already released the weights for the masked LM head. Just add the following mapping in convert_ernie_to_pytorch.py:

{
        'mask_lm_trans_fc.b_0': 'cls.predictions.transform.dense.bias',
        'mask_lm_trans_fc.w_0': 'cls.predictions.transform.dense.weight',
        'mask_lm_trans_layer_norm_scale': 'cls.predictions.transform.LayerNorm.weight',
        'mask_lm_trans_layer_norm_bias': 'cls.predictions.transform.LayerNorm.bias',
        'mask_lm_out_fc.b_0': 'cls.predictions.bias'
}
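For context, a sketch of where this could plug in (the dict name weight_map is my assumption about the script's internals, not its actual variable name):

# Assumed variable: `weight_map`, the script's existing Paddle->PyTorch name map.
mask_lm_map = {
    'mask_lm_trans_fc.b_0': 'cls.predictions.transform.dense.bias',
    # Paddle fc weights are stored (in, out) while PyTorch Linear expects
    # (out, in), so w_0 presumably needs the same transpose the script
    # applies to its other fc weights.
    'mask_lm_trans_fc.w_0': 'cls.predictions.transform.dense.weight',
    'mask_lm_trans_layer_norm_scale': 'cls.predictions.transform.LayerNorm.weight',
    'mask_lm_trans_layer_norm_bias': 'cls.predictions.transform.LayerNorm.bias',
    'mask_lm_out_fc.b_0': 'cls.predictions.bias',
}
weight_map.update(mask_lm_map)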

nghuyong commented 5 years ago

OK, can you report the performance of ERNIE after adding the mapping you found?

linjianz commented 5 years ago

I tested the cloze task with four models: bert-base, bert-wwm and bert-wwm-ext from Chinese-BERT-wwm, and this converted ERNIE model. The performance of ERNIE is comparable to what is reported in the original paper.


input: [MASK] [MASK] [MASK] 是中国神魔小说的经典之作,与《三国演义》《水浒传》《红楼梦》并称为中国古典四大名著。
(English: "[MASK] [MASK] [MASK] is a classic of Chinese gods-and-demons fiction; together with Romance of the Three Kingdoms, Water Margin, and Dream of the Red Chamber, it is known as one of the Four Great Classical Novels of China." The expected answer is 西游记, Journey to the West.)
output:
{
        "bert-base": "《 神 》",
        "bert-wwm": "天 神 奇",
        "bert-wwm-ext": "西 游 记",
        "ernie": "西 游 记"
}
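For reproducibility, here is a minimal sketch of such a cloze test with pytorch-transformers; the model directory is a placeholder path, and this is my reconstruction rather than the exact script used above:

import torch
from pytorch_transformers import BertTokenizer, BertForMaskedLM

MODEL_DIR = './ernie-converted'  # placeholder: directory with the converted weights

tokenizer = BertTokenizer.from_pretrained(MODEL_DIR)
model = BertForMaskedLM.from_pretrained(MODEL_DIR)
model.eval()

text = '[MASK] [MASK] [MASK] 是中国神魔小说的经典之作,与《三国演义》《水浒传》《红楼梦》并称为中国古典四大名著。'
tokens = ['[CLS]'] + tokenizer.tokenize(text) + ['[SEP]']
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

with torch.no_grad():
    logits = model(input_ids)[0]  # (1, seq_len, vocab_size)

# Take the argmax prediction at each [MASK] position
mask_positions = [i for i, t in enumerate(tokens) if t == '[MASK]']
predictions = [tokenizer.convert_ids_to_tokens([logits[0, i].argmax().item()])[0]
               for i in mask_positions]
print(' '.join(predictions))  # expected here: 西 游 记 (Journey to the West)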

nghuyong commented 5 years ago

Cool, you can open a PR and add your test to the README.

lxy444 commented 4 years ago

@linjian93 Hello, I have tested it and got the following results:

input: [MASK] [MASK] [MASK] 是中国神魔小说的经典之作,与《三国演义》《水浒传》《红楼梦》并称为中国古典四大名著。
output:
{
        "bert-base": "《 神 》",
        "bert-wwm": "天 神 奇",
        "bert-wwm-ext": "西 游 记",
}

But I can't reproduce the ERNIE result. Can you share your code or your model?

lxy444 commented 4 years ago

For ernie, I got the result 《 西游, not 西游记.
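One possible explanation (my assumption, not confirmed in this thread) is a checkpoint converted before the masked-LM mapping above was added, which leaves the prediction head randomly initialised. A quick check on the converted file (the path is a placeholder):

import torch

state_dict = torch.load('./ernie-converted/pytorch_model.bin', map_location='cpu')
for key in ('cls.predictions.transform.dense.weight',
            'cls.predictions.transform.LayerNorm.weight',
            'cls.predictions.bias'):
    print(key, key in state_dict)  # False means the MLM head will be re-initialised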