google-research / electra

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
Apache License 2.0

Data loss: truncated record at 10035180 #72

Open jjkim-zz opened 4 years ago

jjkim-zz commented 4 years ago

When I run "python3 run_pretraining.py --data-dir $DATA_DIR --model-name electra_small_owt", I get the following error:

ERROR:tensorflow:Error recorded from training_loop: 2 root error(s) found.
  (0) Data loss: truncated record at 10035180
     [[node IteratorGetNext (defined at /home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
  (1) Data loss: truncated record at 10035180
     [[node IteratorGetNext (defined at /home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
     [[electra/embeddings_3/assert_less_equal/Assert/Assert/data_3/_9763]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'IteratorGetNext':
  File "run_pretraining.py", line 385, in <module>
    main()
  File "run_pretraining.py", line 381, in main
    args.model_name, args.data_dir, **hparams))
  File "run_pretraining.py", line 344, in train_or_eval
    max_steps=config.num_train_steps)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3030, in train
    saving_listeners=saving_listeners)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 370, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1161, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1188, in _train_model_default
    input_fn, ModeKeys.TRAIN))
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1025, in _get_features_and_labels_from_input_fn
    self._call_input_fn(input_fn, mode))
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/util.py", line 65, in parse_input_fn_result
    result = iterator.get_next()
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/data/ops/iterator_ops.py", line 426, in get_next
    name=name)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_dataset_ops.py", line 2518, in iterator_get_next
    output_shapes=output_shapes, name=name)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

Traceback (most recent call last):
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.DataLossError: 2 root error(s) found.
  (0) Data loss: truncated record at 10035180
     [[{{node IteratorGetNext}}]]
  (1) Data loss: truncated record at 10035180
     [[{{node IteratorGetNext}}]]
     [[electra/embeddings_3/assert_less_equal/Assert/Assert/data_3/_9763]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_pretraining.py", line 385, in <module>
    main()
  File "run_pretraining.py", line 381, in main
    args.model_name, args.data_dir, **hparams))
  File "run_pretraining.py", line 344, in train_or_eval
    max_steps=config.num_train_steps)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3035, in train
    rendezvous.raise_errors()
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 136, in raise_errors
    six.reraise(typ, value, traceback)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3030, in train
    saving_listeners=saving_listeners)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 370, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1161, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1195, in _train_model_default
    saving_listeners)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1494, in _train_with_estimatorspec
    _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 754, in run
    run_metadata=run_metadata)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1259, in run
    run_metadata=run_metadata)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1360, in run
    raise six.reraise(*original_exc_info)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1345, in run
    return self._sess.run(*args, **kwargs)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1418, in run
    run_metadata=run_metadata)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1176, in run
    return self._sess.run(*args, **kwargs)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.DataLossError: 2 root error(s) found.
  (0) Data loss: truncated record at 10035180
     [[node IteratorGetNext (defined at /home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
  (1) Data loss: truncated record at 10035180
     [[node IteratorGetNext (defined at /home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
     [[electra/embeddings_3/assert_less_equal/Assert/Assert/data_3/_9763]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'IteratorGetNext':
  File "run_pretraining.py", line 385, in <module>
    main()
  File "run_pretraining.py", line 381, in main
    args.model_name, args.data_dir, **hparams))
  File "run_pretraining.py", line 344, in train_or_eval
    max_steps=config.num_train_steps)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3030, in train
    saving_listeners=saving_listeners)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 370, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1161, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1188, in _train_model_default
    input_fn, ModeKeys.TRAIN))
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1025, in _get_features_and_labels_from_input_fn
    self._call_input_fn(input_fn, mode))
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/util.py", line 65, in parse_input_fn_result
    result = iterator.get_next()
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/data/ops/iterator_ops.py", line 426, in get_next
    name=name)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_dataset_ops.py", line 2518, in iterator_get_next
    output_shapes=output_shapes, name=name)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/home/jjkim/anaconda3/envs/electra/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

I am using TensorFlow 1.15 as instructed. Please kindly advise. Many thanks!

kanhuimin commented 4 years ago

Have you solved this problem?
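
For anyone hitting the same error: "Data loss: truncated record" from IteratorGetNext usually suggests that one of the input TFRecord files ends partway through a record, for example after an interrupted write or copy. Below is a minimal sketch, not taken from the ELECTRA code, that walks a single uncompressed TFRecord file without TensorFlow and reports where the framing breaks. It assumes the standard TFRecord layout (8-byte little-endian length, 4-byte CRC of the length, payload, 4-byte CRC of the payload). The name find_truncation is hypothetical, and CRCs are not verified, so it only detects files that end mid-record, not other kinds of corruption.

```python
import os
import struct
from typing import Optional

HEADER_BYTES = 12  # 8-byte little-endian payload length + 4-byte CRC of the length
FOOTER_BYTES = 4   # 4-byte CRC of the payload


def find_truncation(path: str) -> Optional[int]:
    """Return the start offset of the first incomplete record, or None.

    Hypothetical helper: it only checks record framing against the file
    size and does not validate either CRC.
    """
    size = os.path.getsize(path)
    offset = 0
    with open(path, "rb") as f:
        while offset < size:
            f.seek(offset)
            header = f.read(HEADER_BYTES)
            if len(header) < HEADER_BYTES:
                return offset  # file ends inside a record header
            (length,) = struct.unpack("<Q", header[:8])
            end = offset + HEADER_BYTES + length + FOOTER_BYTES
            if end > size:
                return offset  # file ends inside the payload or its CRC
            offset = end
    return None  # every record is fully framed
```

Calling find_truncation on each pre-training TFRecord file under the data directory should point at the damaged file, which can then be regenerated or copied again. The offset it returns is the start of the incomplete record, so it may not be exactly the number printed in the TensorFlow message.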