huanghuidmml / tfbert

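A library for loading pretrained models, built on TensorFlow 1.x. It supports single-machine multi-GPU training, gradient accumulation, XLA acceleration, and mixed precision, and allows flexible training, evaluation, and prediction.
58 stars, 11 forks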

Problem when using the chinese_bert_chinese_wwm_L-12_H-768_A-12 model with the run_element_extract.py script #4

Closed: ericdoug-qi closed this issue 3 years ago

ericdoug-qi commented 3 years ago

WARNING:tensorflow:From /opt/tfbert/tfbert/models/layers.py:28: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.Dense instead.
Traceback (most recent call last):
  File "/opt/tfbert/run_element_extract.py", line 292, in <module>
    main()
  File "/opt/tfbert/run_element_extract.py", line 253, in main
    args.model_dir if args.pretrained_checkpoint_path is None else args.pretrained_checkpoint_path)
  File "/opt/tfbert/tfbert/trainer.py", line 175, in from_pretrained
    utils.init_checkpoints(ckpt, True)
  File "/opt/tfbert/tfbert/utils.py", line 261, in init_checkpoints
    prefix=prefix)
  File "/opt/tfbert/tfbert/utils.py", line 239, in get_assignment_map_from_checkpoint
    init_vars = tf.train.list_variables(init_checkpoint)
  File "/opttensorflow_core/python/training/checkpoint_utils.py", line 97, in list_variables
    reader = load_checkpoint(ckpt_dir_or_file)
  File "/opttensorflow_core/python/training/checkpoint_utils.py", line 66, in load_checkpoint
    return pywrap_tensorflow.NewCheckpointReader(filename)
  File "/opttensorflow_core/python/pywrap_tensorflow_internal.py", line 873, in NewCheckpointReader
    return CheckpointReader(compat.as_bytes(filepattern))
  File "/opttensorflow_core/python/pywrap_tensorflow_internal.py", line 885, in __init__
    this = _pywrap_tensorflow_internal.new_CheckpointReader(filename)
tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file /opt/models/bert/chinese_bert_chinese_wwm_L-12_H-768_A-12/publish/bert_model.ckpt.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

huanghuidmml commented 3 years ago

Hello, there are several solutions:

1. Delete the checkpoint file in the folder, then pass the folder as the model_dir parameter. The folder must still contain config.json, vocab.txt, and the model.ckpt files.
2. Set pretrained_checkpoint_path to the path of the bert_model.ckpt file.

The folder should look roughly like this:

config.json
vocab.txt
model.ckpt.data-00000-of-00001
model.ckpt.index
model.ckpt.meta
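The DataLossError in the traceback names the `.data-00000-of-00001` shard itself. One common cause of "not an sstable (bad magic number)" is that the reader was handed a shard file instead of the checkpoint prefix (the path without the `.index`, `.meta`, or `.data-…` suffix). The sketch below shows how the folder could be checked before training. It uses only the standard library; the helper names `to_checkpoint_prefix`, `find_checkpoint_prefixes`, and `missing_files` are hypothetical and are not part of tfbert.

```python
import os
import re

# Matches the shard suffix TensorFlow 1.x appends to a checkpoint prefix,
# e.g. ".data-00000-of-00001".
_SHARD_RE = re.compile(r"\.data-\d{5}-of-\d{5}$")


def to_checkpoint_prefix(path):
    """Strip a trailing .index, .meta, or .data-XXXXX-of-XXXXX suffix, if any."""
    for suffix in (".index", ".meta"):
        if path.endswith(suffix):
            return path[: -len(suffix)]
    return _SHARD_RE.sub("", path)


def find_checkpoint_prefixes(model_dir):
    """Return the checkpoint prefixes in model_dir, located via their .index files."""
    names = sorted(os.listdir(model_dir))
    return [
        os.path.join(model_dir, name[: -len(".index")])
        for name in names
        if name.endswith(".index")
    ]


def missing_files(model_dir, required=("config.json", "vocab.txt")):
    """Return the required file names that are absent from model_dir."""
    return [
        name
        for name in required
        if not os.path.isfile(os.path.join(model_dir, name))
    ]
```

For the folder in this issue, `to_checkpoint_prefix` applied to the `.data-00000-of-00001` path from the error message would yield the path ending in `bert_model.ckpt`, which is the form suggested for pretrained_checkpoint_path in solution 2.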

ericdoug-qi commented 3 years ago

Thanks for your timely reply.