brightmart / albert_zh

A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS, large-scale Chinese pre-trained ALBERT models
https://arxiv.org/pdf/1909.11942.pdf

Running Similarity.py directly raises an error: model restore fails #131

Open justein opened 4 years ago

justein commented 4 years ago

E:\Python36\python.exe D:/LynGithub/albert_zh/similarity.py
2020-04-21 10:59:57.951576: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_100.dll'; dlerror: cudart64_100.dll not found
2020-04-21 10:59:57.951971: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
WARNING:tensorflow:From D:\LynGithub\albert_zh\args.py:4: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.

WARNING:tensorflow:From D:\LynGithub\albert_zh\args.py:4: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.

WARNING:tensorflow:From D:\LynGithub\albert_zh\optimization_finetuning.py:87: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

WARNING:tensorflow:From D:\LynGithub\albert_zh\tokenization.py:125: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

WARNING:tensorflow:From D:/LynGithub/albert_zh/similarity.py:118: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

INFO:tensorflow:Using config: {'_model_dir': 'D:\LynGithub\albert_zh\albert_lcqmc_checkpoints/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': , '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x0000023BBB310208>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Calling model_fn.
WARNING:tensorflow:From D:/LynGithub/albert_zh/similarity.py:61: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.

INFO:tensorflow: Features
INFO:tensorflow: name = input_ids, shape = (?, 128)
INFO:tensorflow: name = input_mask, shape = (?, 128)
INFO:tensorflow: name = label_ids, shape = (1,)
INFO:tensorflow: name = segment_ids, shape = (?, 128)
WARNING:tensorflow:From D:\LynGithub\albert_zh\modeling.py:171: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From D:\LynGithub\albert_zh\modeling.py:482: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.

embedding_lookup_factorized. factorized embedding parameterization is used.
WARNING:tensorflow:From D:\LynGithub\albert_zh\modeling.py:569: The name tf.assert_less_equal is deprecated. Please use tf.compat.v1.assert_less_equal instead.

WARNING:tensorflow: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see:

ln_type: postln
old structure of transformer.use: transformer_model,which use post-LN
WARNING:tensorflow:From D:\LynGithub\albert_zh\modeling.py:750: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.Dense instead.
WARNING:tensorflow:From E:\Python36\lib\site-packages\tensorflow_core\python\layers\core.py:187: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use layer.__call__ method instead.
ln_type is postln or other,do nothing.
WARNING:tensorflow:From D:/LynGithub/albert_zh/similarity.py:76: The name tf.trainable_variables is deprecated. Please use tf.compat.v1.trainable_variables instead.

WARNING:tensorflow:From D:/LynGithub/albert_zh/similarity.py:82: The name tf.train.init_from_checkpoint is deprecated. Please use tf.compat.v1.train.init_from_checkpoint instead.

INFO:tensorflow: Trainable Variables
INFO:tensorflow: name = bert/embeddings/word_embeddings:0, shape = (21128, 128), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/word_embeddings_2:0, shape = (128, 312), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/token_type_embeddings:0, shape = (2, 312), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/position_embeddings:0, shape = (512, 312), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/LayerNorm/beta:0, shape = (312,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/embeddings/LayerNorm/gamma:0, shape = (312,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/attention/self/query/kernel:0, shape = (312, 312), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/attention/self/query/bias:0, shape = (312,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/attention/self/key/kernel:0, shape = (312, 312), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/attention/self/key/bias:0, shape = (312,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/attention/self/value/kernel:0, shape = (312, 312), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/attention/self/value/bias:0, shape = (312,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/attention/output/dense/kernel:0, shape = (312, 312), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/attention/output/dense/bias:0, shape = (312,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/attention/output/LayerNorm/beta:0, shape = (312,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/attention/output/LayerNorm/gamma:0, shape = (312,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/intermediate/dense/kernel:0, shape = (312, 1248), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/intermediate/dense/bias:0, shape = (1248,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/output/dense/kernel:0, shape = (1248, 312), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/output/dense/bias:0, shape = (312,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/output/LayerNorm/beta:0, shape = (312,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/encoder/layer_shared/output/LayerNorm/gamma:0, shape = (312,), INIT_FROM_CKPT
INFO:tensorflow: name = bert/pooler/dense/kernel:0, shape = (312, 312), INIT_FROM_CKPT
INFO:tensorflow: name = bert/pooler/dense/bias:0, shape = (312,), INIT_FROM_CKPT
INFO:tensorflow: name = output_weights:0, shape = (2, 312)
INFO:tensorflow: name = output_bias:0, shape = (2,)
INFO:tensorflow:Done calling model_fn.
WARNING:tensorflow:From E:\Python36\lib\site-packages\tensorflow_core\python\ops\array_ops.py:1475: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
INFO:tensorflow:Graph was finalized.
2020-04-21 11:00:01.159847: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-04-21 11:00:01.163260: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-04-21 11:00:01.163463: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
2020-04-21 11:00:01.167120: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-F85I308
2020-04-21 11:00:01.167478: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-F85I308
INFO:tensorflow:Restoring parameters from D:\LynGithub\albert_zh\albert_lcqmc_checkpoints/albert_model.ckpt
2020-04-21 11:00:01.217684: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key global_step not found in checkpoint
Traceback (most recent call last):
  File "E:\Python36\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call
    return fn(*args)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: Key global_step not found in checkpoint
  [[{{node save/RestoreV2}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 1290, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "E:\Python36\lib\site-packages\tensorflow_core\python\client\session.py", line 956, in run
    run_metadata_ptr)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run
    run_metadata)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key global_step not found in checkpoint
  [[node save/RestoreV2 (defined at E:\Python36\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]

Original stack trace for 'save/RestoreV2':
  File "D:/LynGithub/albert_zh/similarity.py", line 274, in <module>
    sim.predict_sentences([("我喜欢妈妈做的汤", "妈妈做的汤我很喜欢喝")])
  File "D:/LynGithub/albert_zh/similarity.py", line 129, in predict_sentences
    for i in results:
  File "E:\Python36\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 638, in predict
    hooks=all_hooks) as mon_sess:
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 1014, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 725, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 1207, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 1212, in _create_session
    return self._sess_creator.create_session()
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 878, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 638, in create_session
    self._scaffold.finalize()
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 229, in finalize
    self._saver = training_saver._get_saver_or_default()  # pylint: disable=protected-access
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 599, in _get_saver_or_default
    saver = Saver(sharded=True, allow_empty=True)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 828, in __init__
    self.build()
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 840, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 878, in _build
    build_restore=build_restore)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 502, in _build_internal
    restore_sequentially, reshape)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 381, in _AddShardedRestoreOps
    name="restore_shard"))
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 328, in _AddRestoreOps
    restore_sequentially)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 575, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\ops\gen_io_ops.py", line 1696, in restore_v2
    name=name)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 1300, in restore
    names_to_keys = object_graph_key_mapping(save_path)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 1618, in object_graph_key_mapping
    object_graph_string = reader.get_tensor(trackable.OBJECT_GRAPH_PROTO_KEY)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\pywrap_tensorflow_internal.py", line 915, in get_tensor
    return CheckpointReader_GetTensor(self, compat.as_bytes(tensor_str))
tensorflow.python.framework.errors_impl.NotFoundError: Key _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/LynGithub/albert_zh/similarity.py", line 274, in <module>
    sim.predict_sentences([("我喜欢妈妈做的汤", "妈妈做的汤我很喜欢喝")])
  File "D:/LynGithub/albert_zh/similarity.py", line 129, in predict_sentences
    for i in results:
  File "E:\Python36\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 638, in predict
    hooks=all_hooks) as mon_sess:
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 1014, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 725, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 1207, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 1212, in _create_session
    return self._sess_creator.create_session()
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 878, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 647, in create_session
    init_fn=self._scaffold.init_fn)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\session_manager.py", line 290, in prepare_session
    config=config)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\session_manager.py", line 204, in _restore_checkpoint
    saver.restore(sess, checkpoint_filename_with_path)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 1306, in restore
    err, "a Variable name or other graph key that is missing")
tensorflow.python.framework.errors_impl.NotFoundError: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key global_step not found in checkpoint [[node save/RestoreV2 (defined at E:\Python36\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]

Original stack trace for 'save/RestoreV2':
  File "D:/LynGithub/albert_zh/similarity.py", line 274, in <module>
    sim.predict_sentences([("我喜欢妈妈做的汤", "妈妈做的汤我很喜欢喝")])
  File "D:/LynGithub/albert_zh/similarity.py", line 129, in predict_sentences
    for i in results:
  File "E:\Python36\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 638, in predict
    hooks=all_hooks) as mon_sess:
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 1014, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 725, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 1207, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 1212, in _create_session
    return self._sess_creator.create_session()
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 878, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 638, in create_session
    self._scaffold.finalize()
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\monitored_session.py", line 229, in finalize
    self._saver = training_saver._get_saver_or_default()  # pylint: disable=protected-access
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 599, in _get_saver_or_default
    saver = Saver(sharded=True, allow_empty=True)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 828, in __init__
    self.build()
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 840, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 878, in _build
    build_restore=build_restore)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 502, in _build_internal
    restore_sequentially, reshape)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 381, in _AddShardedRestoreOps
    name="restore_shard"))
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 328, in _AddRestoreOps
    restore_sequentially)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\training\saver.py", line 575, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\ops\gen_io_ops.py", line 1696, in restore_v2
    name=name)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "E:\Python36\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

Process finished with exit code 1
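
The final error says the Saver tried to restore a key (`global_step`) that the checkpoint file does not contain. One way to narrow this down is to compare the variable names the graph expects with the keys actually stored in the checkpoint. The sketch below is a minimal, framework-free illustration of that comparison; the helper name `missing_checkpoint_keys` and both name lists are hypothetical and only modeled on the log above. In a real TensorFlow session the checkpoint keys could presumably be read with `tf.train.list_variables(checkpoint_path)`, which returns `(name, shape)` pairs.

```python
def missing_checkpoint_keys(graph_var_names, checkpoint_var_names):
    """Return graph variable names that have no matching key in the checkpoint."""

    def strip_suffix(name):
        # Graph variables are logged as "name:0"; checkpoint keys omit the ":0".
        return name.rsplit(":", 1)[0] if name.endswith(":0") else name

    available = set(checkpoint_var_names)
    expected = {strip_suffix(name) for name in graph_var_names}
    return sorted(expected - available)


# Hypothetical inputs for illustration only (not read from a real checkpoint).
graph_vars = [
    "bert/pooler/dense/kernel:0",
    "bert/pooler/dense/bias:0",
    "output_weights:0",
    "output_bias:0",
    "global_step:0",
]
ckpt_vars = [
    "bert/pooler/dense/kernel",
    "bert/pooler/dense/bias",
    "output_weights",
    "output_bias",
]

print(missing_checkpoint_keys(graph_vars, ckpt_vars))
```

If the result lists `global_step`, that would suggest the file in the model directory is a pre-trained checkpoint rather than one written by a fine-tuning run, though this is an assumption that would need to be checked against the actual files.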