ymcui / Chinese-BERT-wwm

Pre-Training with Whole Word Masking for Chinese BERT(中文BERT-wwm系列模型)
https://ieeexplore.ieee.org/document/9599397
Apache License 2.0

TF2 cannot load hfl/chinese-roberta-wwm-ext #190

Closed kscp123 closed 3 years ago

kscp123 commented 3 years ago

Hello, I downloaded the corresponding h5 model and got an error when loading it. transformers: 2.2.2, tensorflow: 2.1

from transformers import TFBertModel

model = TFBertModel.from_pretrained(path, output_hidden_states=True)

path is the local model directory. Loading bert-base this way works fine, but loading roberta raises the following error:

File "classifiacation.py", line 160, in <module> transformer_layer = TFBertModel.from_pretrained(MODEL, output_hidden_states=True) File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_utils.py", line 309, in from_pretrained model.load_weights(resolved_archive_file, by_name=True) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training.py", line 234, in load_weights return super(Model, self).load_weights(filepath, by_name, skip_mismatch) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/network.py", line 1220, in load_weights f, self.layers, skip_mismatch=skip_mismatch) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/saving/hdf5_format.py", line 777, in load_weights_from_hdf5_group_by_name str(weight_values[i].shape) + '.') ValueError: Layer #0 (named "bert"), weight <tf.Variable 'tf_bertmodel/bert/encoder/layer._0/attention/self/query/kernel:0' shape=(768, 768) dtype=float32, numpy= array([[-0.01850067, -0.01887354, 0.00046411, ..., -0.02237962, 0.0132857 , -0.01035117], [-0.0011026 , -0.01686522, 0.00017086, ..., -0.01813387, -0.01236598, 0.01903026], [ 0.02472041, 0.02698529, -0.00301668, ..., -0.0238625 , 0.00780853, -0.01740931], ..., [-0.00500965, -0.0014657 , 0.02582165, ..., -0.00806629, -0.01069776, 0.02885169], [ 0.03499781, 0.01101323, -0.03752618, ..., 0.01265424, -0.00410191, 0.01200508], [-0.00900458, 0.01460658, -0.0131218 , ..., -0.01634052, 0.02017507, -0.00059968]], dtype=float32)> has shape (768, 768), but the saved weight has shape (768, 12, 64).

kscp123 commented 3 years ago

I tried other models such as ernie and they all load fine; only the hfl models give this error.

ymcui commented 3 years ago

Tested OK with transformers == 4.6.1 and tensorflow == 2.4.1. I suggest trying a newer version of the transformers library. Older transformers releases originally only supported the PT (PyTorch) checkpoints; the TF2 models in this repo were added later.

>>> from transformers import TFBertModel
>>> model = TFBertModel.from_pretrained('hfl/chinese-bert-wwm', output_hidden_states=True)
████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 409M/409M [00:07<00:00, 52.8MB/s]
All model checkpoint layers were used when initializing TFBertModel.

All the layers of TFBertModel were initialized from the model checkpoint at hfl/chinese-bert-wwm.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertModel for predictions without further training.
>>> exit()
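[Editor's note: for the model this issue is actually about, the same call works after upgrading. A minimal sketch under the versions above; note that per this repo's README the RoBERTa-wwm checkpoints are loaded with the BERT classes (BertTokenizer/TFBertModel), not the RoBERTa ones.]

from transformers import BertTokenizer, TFBertModel

tokenizer = BertTokenizer.from_pretrained('hfl/chinese-roberta-wwm-ext')
model = TFBertModel.from_pretrained('hfl/chinese-roberta-wwm-ext', output_hidden_states=True)

# Quick forward pass to confirm the hidden states are exposed
inputs = tokenizer("这是一个测试。", return_tensors="tf")
outputs = model(inputs)
print(len(outputs.hidden_states))  # 13: embedding output + 12 transformer layers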
kscp123 commented 3 years ago

OK, then it looks like a version issue. Thanks for the reply.