Wow, can you report the results for both vocab.txt lengths, 18000 and 17964, in your task?
Thank you for your reply. I've found that the problem is not caused by the vocab_size; it is caused by the dense weights in BertPredictionHeadTransform, which seem not to be defined in ERNIE. So I guess we should change is_prediction=False to is_prediction=True at line 20 of convert_ernie_to_pytorch.py to generate a new PyTorch model? However, I met the following problem:
===================extract weights start====================
Traceback (most recent call last):
  File "convert_ernie_to_pytorch.py", line 201, in <module>
    state_dict = extract_weights(args)
  File "convert_ernie_to_pytorch.py", line 159, in extract_weights
    fluid_array = np.array(fluid_tensor, dtype=np.float32)
TypeError: __array__(): incompatible function arguments. The following argument types are supported:
    1. (self: paddle.fluid.core_avx.Tensor) -> array
Invoked with: <paddle.fluid.core_avx.LoDTensor object at 0x7f6c489ecc70>, dtype('float32')
Running
pip install paddlepaddle-gpu==1.4.0.post87
solved the TypeError problem.
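For reference, a code-level workaround might also avoid it: per the traceback, the LoDTensor's __array__() takes no dtype argument, so one could convert first and cast afterwards. A minimal sketch of a possible patch at line 159, not tested:

import numpy as np

# __array__() accepts no dtype argument here, so convert first, then cast.
fluid_array = np.array(fluid_tensor).astype(np.float32)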
Thanks! Maybe you can open a PR to fix this problem.
I found that ERNIE has already released the weights for the masked LM head. Just add the following map entries in convert_ernie_to_pytorch.py:
{
'mask_lm_trans_fc.b_0': 'cls.predictions.transform.dense.bias',
'mask_lm_trans_fc.w_0': 'cls.predictions.transform.dense.weight',
'mask_lm_trans_layer_norm_scale': 'cls.predictions.transform.LayerNorm.weight',
'mask_lm_trans_layer_norm_bias': 'cls.predictions.transform.LayerNorm.bias',
'mask_lm_out_fc.b_0': 'cls.predictions.bias'
}
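In case it helps, here is a minimal sketch of how these entries could be folded into the converted state dict. The helper name and the assumption that the extracted Paddle weights arrive as a {name: numpy array} dict are mine, not the converter's actual code; whether the dense weight needs a transpose depends on how the rest of the converter handles fc weights (Paddle fc layers store weights as (in, out), while torch.nn.Linear expects (out, in)):

import torch

MASK_LM_MAP = {
    'mask_lm_trans_fc.b_0': 'cls.predictions.transform.dense.bias',
    'mask_lm_trans_fc.w_0': 'cls.predictions.transform.dense.weight',
    'mask_lm_trans_layer_norm_scale': 'cls.predictions.transform.LayerNorm.weight',
    'mask_lm_trans_layer_norm_bias': 'cls.predictions.transform.LayerNorm.bias',
    'mask_lm_out_fc.b_0': 'cls.predictions.bias',
}

def add_mask_lm_weights(state_dict, paddle_weights):
    # paddle_weights: {paddle_name: numpy array} (assumed layout).
    for paddle_name, torch_name in MASK_LM_MAP.items():
        tensor = torch.from_numpy(paddle_weights[paddle_name])
        # Paddle fc weights are (in, out); torch.nn.Linear wants (out, in).
        if torch_name.endswith('dense.weight'):
            tensor = tensor.t()
        state_dict[torch_name] = tensor
    return state_dict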
OK, can you report the performance of ERNIE after adding the map list you found?
I tested the cloze task with four models: bert-base, bert-wwm / bert-wwm-ext from Chinese-BERT-wwm, and this converted ERNIE model. The performance of ERNIE is comparable to what the original paper reports. The input below roughly reads "[MASK][MASK][MASK] is a classic of Chinese gods-and-demons fiction, and together with Romance of the Three Kingdoms, Water Margin, and Dream of the Red Chamber it is known as one of the Four Great Classical Chinese Novels"; the expected answer is 西游记 (Journey to the West).
input: [MASK] [MASK] [MASK] 是中国神魔小说的经典之作,与《三国演义》《水浒传》《红楼梦》并称为中国古典四大名著。
output:
{
"bert-base": "《 神 》",
"bert-wwm": "天 神 奇",
"bert-wwm-ext": "西 游 记",
"ernie": "西 游 记"
}
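For anyone who wants to reproduce this, here is a minimal cloze-test sketch using pytorch-transformers; the model directory is a placeholder, and it assumes a BERT-compatible converted checkpoint:

import torch
from pytorch_transformers import BertForMaskedLM, BertTokenizer

model_dir = './converted_ernie'  # placeholder: path to the converted model
tokenizer = BertTokenizer.from_pretrained(model_dir)
model = BertForMaskedLM.from_pretrained(model_dir)
model.eval()

text = '[MASK] [MASK] [MASK] 是中国神魔小说的经典之作,与《三国演义》《水浒传》《红楼梦》并称为中国古典四大名著。'
tokens = ['[CLS]'] + tokenizer.tokenize(text) + ['[SEP]']
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

with torch.no_grad():
    logits = model(input_ids)[0]  # (1, seq_len, vocab_size)

# Fill each [MASK] position with the highest-scoring token.
mask_positions = [i for i, t in enumerate(tokens) if t == '[MASK]']
predictions = logits[0, mask_positions].argmax(dim=-1)
print(tokenizer.convert_ids_to_tokens(predictions.tolist()))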
Cool. You could open a PR and add your test to the README.
@linjian93 Hello, I tested it and got the following results:
input: [MASK] [MASK] [MASK] 是中国神魔小说的经典之作,与《三国演义》《水浒传》《红楼梦》并称为中国古典四大名著。
output:
{
"bert-base": "《 神 》",
"bert-wwm": "天 神 奇",
"bert-wwm-ext": "西 游 记",
}
But I can't get the result for ernie. Can you share your code or your model? For ernie, I got the result 《 西游, not 西游记.
Hello, thank you so much for sharing. But when I test the converted ERNIE model using pytorch-transformers, the performance on the cloze task is very poor. I found that vocab_size in config.json is 18000, but the length of vocab.txt is 17964, so I'm wondering whether this mismatch, or the model parameters themselves, caused the poor performance.
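For what it's worth, the mismatch itself is easy to check; since vocab_size in config.json only sets the size of the embedding matrix, the extra 36 rows are simply never indexed by the tokenizer, which is consistent with the finding above that the vocab size is not the cause. A quick diagnostic sketch (file paths are assumptions):

import json

# Compare the embedding size in config.json with the tokenizer vocabulary.
with open('config.json') as f:
    config_vocab_size = json.load(f)['vocab_size']

with open('vocab.txt', encoding='utf-8') as f:
    vocab_len = sum(1 for _ in f)

print('config vocab_size:', config_vocab_size)  # 18000
print('vocab.txt entries:', vocab_len)          # 17964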