Open tianke0711 opened 5 years ago
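My new dataset is English text in 7 classes. I converted both train and dev to the same format as the sample data, but I get an error when I run it. It looks like my 7 classes conflict with the 10 classes the code expects. This code does 10-class classification; where do I need to change it?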
Caused by op 'save/Assign_602', defined at:
  File "run_classifier.py", line 929, in <module>
    tf.app.run()
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "run_classifier.py", line 848, in main
    estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2403, in train
    saving_listeners=saving_listeners
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 354, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1207, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1241, in _train_model_default
    saving_listeners)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1468, in _train_with_estimator_spec
    log_step_count_steps=log_step_count_steps) as mon_sess:
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 504, in MonitoredTrainingSession
    stop_grace_period_secs=stop_grace_period_secs)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 921, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 643, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1107, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1112, in _create_session
    return self._sess_creator.create_session()
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 800, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 557, in create_session
    self._scaffold.finalize()
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 215, in finalize
    self._saver.build()
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1114, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1151, in _build
    build_save=build_save, build_restore=build_restore)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 789, in _build_internal
    restore_sequentially, reshape)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 459, in _AddShardedRestoreOps
    name="restore_shard"))
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 428, in _AddRestoreOps
    assign_ops.append(saveable.restore(saveable_tensors, shapes))
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 119, in restore
    self.op.get_shape().is_fully_defined())
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 221, in assign
    validate_shape=validate_shape)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 61, in assign
    use_locking=use_locking, name=name)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [7,768] rhs shape= [10,768]
  [[node save/Assign_602 (defined at /anaconda3/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py:2403) = Assign[T=DT_FLOAT, _class=["loc:@output_weights/adam_v"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](output_weights/adam_v, save/RestoreV2:603)]]
I'd suggest reading the tutorial, which covers this: https://mp.weixin.qq.com/s/XmeDjHSFI0UsQmKeOgwnyA. You may also need to check your data.
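@tianke0711 Did you figure it out in the end? I translated the two tsv files in the data folder into Chinese with 金山快译 (Kingsoft's translation tool), and after about 2 hours of running it suddenly threw an error. It may be the same as yours: the classes don't match. Also, what are the dev and train files each used for? Thanks.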
This is the English version of BERT. For Chinese you need the multilingual or chinese model.
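@ejld OK, thanks. I found the cause: the classes in train.tsv and dev.tsv didn't match. dev had 7 classes and train had 10. After I deleted all the previously trained model files and retrained, the error went away.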
Which model files did you delete? Could you be more specific? @ejld Do you know what caused my error and how to fix it? Thanks.
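@tianke0711 Going by what I ran into later when training on a new dataset: delete the whole model folder first. The number of labels from the earlier training run is already baked into it, so if you later add a new label to the training set, it will throw this error.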
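For anyone landing here later, the cleanup step can be sketched roughly as below. This is only a minimal sketch, assuming the old checkpoints live in a single directory that the training script restores from; `reset_model_dir` and the path in the usage note are hypothetical names, not part of run_classifier.py. It matches the traceback above, where the saved `output_weights/adam_v` has shape [10,768] and the current graph expects [7,768].

```python
import shutil
from pathlib import Path


def reset_model_dir(model_dir):
    """Delete a stale checkpoint directory and recreate it empty.

    Returns True if an old directory was removed, False if none existed.
    """
    path = Path(model_dir)
    existed = path.is_dir()
    if existed:
        # Old checkpoints store output weights sized for the old label count,
        # so restoring them into a graph with a different count fails.
        shutil.rmtree(path)
    path.mkdir(parents=True, exist_ok=True)
    return existed
```

Calling something like `reset_model_dir("path/to/model_dir")` before rerunning training forces a fresh start. Pointing the run at a new, empty output directory should have the same effect without deleting anything.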
@tianke0711 Your dev file has some classes that train doesn't have. Compare the two files and you will see that 3 classes are missing.
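The comparison suggested above could be done with a small script along these lines. It is a sketch under assumptions: that both files are tab-separated with one example per line, and that the label sits in the column given by `label_column` (0 here is a guess, so adjust it to your data). The helper names `read_labels` and `compare_labels` are made up for illustration.

```python
import csv


def read_labels(tsv_path, label_column=0, has_header=False):
    """Collect the set of distinct labels found in one column of a TSV file."""
    labels = set()
    with open(tsv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        if has_header:
            next(reader, None)  # skip the header row
        for row in reader:
            # Ignore blank or short rows that have no label column.
            if len(row) > label_column:
                labels.add(row[label_column].strip())
    return labels


def compare_labels(train_path, dev_path, **kwargs):
    """Report labels that appear in only one of the two files."""
    train = read_labels(train_path, **kwargs)
    dev = read_labels(dev_path, **kwargs)
    return {
        "only_in_train": sorted(train - dev),
        "only_in_dev": sorted(dev - train),
    }
```

Running `compare_labels("train.tsv", "dev.tsv")` returns two lists; if either one is non-empty, the two files do not share the same set of classes.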
Does having 3 fewer classes also cause an error?