Closed wcc17864158993 closed 3 years ago
Hi @wcc17864158993, if the TSV is missing, it means that the prediction script failed. Its logs are written to a file called error
at the root. Can you check whether you have such a file and share its content?
Hello @eisenjulian, here is the content of the error file:
WARNING:tensorflow:From /opt/conda/lib/python3.6/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
I1210 14:22:34.871476 140232031508288 run_task_main.py:162] is_built_with_cuda: True
2020-12-10 14:22:34.871764: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
2020-12-10 14:22:34.887337: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2300000000 Hz
2020-12-10 14:22:34.894417: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56221abd8fb0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-12-10 14:22:34.894504: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-12-10 14:22:34.896547: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-12-10 14:22:34.957768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:3d:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2020-12-10 14:22:34.958055: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-12-10 14:22:34.960612: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-12-10 14:22:34.962806: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-12-10 14:22:34.963238: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-12-10 14:22:34.965400: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-12-10 14:22:34.966510: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-12-10 14:22:34.971252: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-12-10 14:22:34.975233: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-12-10 14:22:34.975331: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-12-10 14:22:35.216099: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-10 14:22:35.216160: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-12-10 14:22:35.216171: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-12-10 14:22:35.220232: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/device:GPU:0 with 10202 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:3d:00.0, compute capability: 7.5)
2020-12-10 14:22:35.222528: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56222003b500 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-12-10 14:22:35.222562: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
I1210 14:22:35.223332 140232031508288 run_task_main.py:162] is_gpu_available: True
2020-12-10 14:22:35.225051: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:3d:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2020-12-10 14:22:35.225124: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-12-10 14:22:35.225152: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-12-10 14:22:35.225167: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-12-10 14:22:35.225180: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-12-10 14:22:35.225194: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-12-10 14:22:35.225207: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-12-10 14:22:35.225221: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-12-10 14:22:35.227685: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
I1210 14:22:35.227826 140232031508288 run_task_main.py:162] GPUs: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
I1210 14:22:35.227957 140232031508288 run_task_main.py:162] Training or predicting ...
I1210 14:22:35.964963 140232031508288 modeling.py:491] position: Tensor("bert/embeddings/ExpandDims:0", shape=(1, 12), dtype=int32)
I1210 14:22:35.973732 140232031508288 modeling.py:493] batched_position: Tensor("bert/embeddings/Repeat/Reshape:0", shape=(?, 12), dtype=int32)
I1210 14:22:35.973974 140232031508288 modeling.py:494] token_type_ids: Tensor("IteratorGetNext:2", shape=(?, 12), dtype=int32)
2020-12-10 14:22:37.752212: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:3d:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2020-12-10 14:22:37.752381: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-12-10 14:22:37.752419: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-12-10 14:22:37.752454: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-12-10 14:22:37.752480: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-12-10 14:22:37.752501: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-12-10 14:22:37.752523: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-12-10 14:22:37.752545: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-12-10 14:22:37.755123: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-12-10 14:22:37.755248: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-10 14:22:37.755260: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-12-10 14:22:37.755267: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-12-10 14:22:37.758013: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10202 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:3d:00.0, compute capability: 7.5)
2020-12-10 14:22:39.593654: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
I1210 14:22:40.108834 140232031508288 modeling.py:491] position: Tensor("bert/embeddings/ExpandDims:0", shape=(1, 12), dtype=int32)
I1210 14:22:40.117084 140232031508288 modeling.py:493] batched_position: Tensor("bert/embeddings/Repeat/Reshape:0", shape=(?, 12), dtype=int32)
I1210 14:22:40.117332 140232031508288 modeling.py:494] token_type_ids: Tensor("IteratorGetNext:2", shape=(?, 12), dtype=int32)
2020-12-10 14:22:41.624194: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:3d:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2020-12-10 14:22:41.624314: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-12-10 14:22:41.624332: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-12-10 14:22:41.624347: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-12-10 14:22:41.624382: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-12-10 14:22:41.624400: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-12-10 14:22:41.624417: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-12-10 14:22:41.624440: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-12-10 14:22:41.629974: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-12-10 14:22:41.630030: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-12-10 14:22:41.630040: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-12-10 14:22:41.630047: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-12-10 14:22:41.637384: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10202 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:3d:00.0, compute capability: 7.5)
2020-12-10 14:22:42.720438: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: input_mask. Can't parse serialized Example.
2020-12-10 14:22:42.720506: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: segment_ids. Can't parse serialized Example.
2020-12-10 14:22:42.720438: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: inv_column_ranks. Can't parse serialized Example.
2020-12-10 14:22:42.720438: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: prev_label_ids. Can't parse serialized Example.
2020-12-10 14:22:42.720438: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: numeric_values. Can't parse serialized Example.
2020-12-10 14:22:42.720464: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: row_ids. Can't parse serialized Example.
2020-12-10 14:22:42.720466: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: input_mask. Can't parse serialized Example.
2020-12-10 14:22:42.720483: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: column_ranks. Can't parse serialized Example.
2020-12-10 14:22:42.720438: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: input_mask. Can't parse serialized Example.
2020-12-10 14:22:42.720522: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: segment_ids. Can't parse serialized Example.
2020-12-10 14:22:42.720572: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: prev_label_ids. Can't parse serialized Example.
2020-12-10 14:22:42.720758: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at example_parsing_ops.cc:94 : Invalid argument: Key: input_mask. Can't parse serialized Example.
Traceback (most recent call last):
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Key: inv_column_ranks. Can't parse serialized Example.
[[{{node ParseSingleExample/ParseExample/ParseExampleV2}}]]
[[IteratorGetNext]]
[[bert/encoder/Shape/_899]]
(1) Invalid argument: Key: inv_column_ranks. Can't parse serialized Example.
[[{{node ParseSingleExample/ParseExample/ParseExampleV2}}]]
[[IteratorGetNext]]
0 successful operations.
0 derived errors ignored.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "tapas-master/tapas/run_task_main.py", line 821, in <module>
app.run(main)
File "/opt/conda/lib/python3.6/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/opt/conda/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "tapas-master/tapas/run_task_main.py", line 806, in main
loop_predict=FLAGS.loop_predict,
File "tapas-master/tapas/run_task_main.py", line 504, in _train_and_predict
global_step=current_step,
File "tapas-master/tapas/run_task_main.py", line 555, in _predict
global_step=None,
File "tapas-master/tapas/run_task_main.py", line 616, in _predict_for_set
output_token_probabilities=False)
File "/data-output/tapas-master/tapas/experiments/prediction_utils.py", line 397, in write_predictions
for prediction in predictions:
File "/opt/conda/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3126, in predict
rendezvous.raise_errors()
File "/opt/conda/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 150, in raise_errors
six.reraise(typ, value, traceback)
File "/opt/conda/lib/python3.6/site-packages/six.py", line 703, in reraise
raise value
File "/opt/conda/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3120, in predict
yield_single_examples=yield_single_examples):
File "/opt/conda/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 631, in predict
preds_evaluated = mon_sess.run(predictions)
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 778, in run
run_metadata=run_metadata)
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1283, in run
run_metadata=run_metadata)
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1384, in run
raise six.reraise(*original_exc_info)
File "/opt/conda/lib/python3.6/site-packages/six.py", line 703, in reraise
raise value
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1369, in run
return self._sess.run(*args, **kwargs)
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1442, in run
run_metadata=run_metadata)
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1200, in run
return self._sess.run(*args, **kwargs)
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 958, in run
run_metadata_ptr)
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1181, in _run
feed_dict_tensor, options, run_metadata)
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Key: inv_column_ranks. Can't parse serialized Example.
[[{{node ParseSingleExample/ParseExample/ParseExampleV2}}]]
[[IteratorGetNext]]
[[bert/encoder/Shape/_899]]
(1) Invalid argument: Key: inv_column_ranks. Can't parse serialized Example.
[[{{node ParseSingleExample/ParseExample/ParseExampleV2}}]]
[[IteratorGetNext]]
0 successful operations.
0 derived errors ignored.
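Most of the log above is routine device setup; the informative lines are the "Can't parse serialized Example" warnings. As a rough aid for reading such a file, here is a minimal sketch that collects the feature keys named in those warnings. The function names are hypothetical, and the default path assumes the log file is called error in the current directory, as described earlier in the thread.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... Invalid argument: Key: input_mask. Can't parse serialized Example.
KEY_PATTERN = re.compile(
    r"Invalid argument: Key: (\w+)\. Can't parse serialized Example"
)


def failing_keys(log_text):
    """Count how often each feature key appears in a parse failure."""
    return Counter(KEY_PATTERN.findall(log_text))


def failing_keys_in_file(path="error"):
    """Read a log file and count the feature keys that failed to parse."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return failing_keys(handle.read())
```

Calling `failing_keys_in_file()` next to the error file returns a `Counter` keyed by feature name, which makes it easier to see whether a single feature or all of them failed to parse.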
Thanks, I'm still not sure what the reason for the problem is. It seems that something is wrong in the TFRecord that was created for prediction, since the inv_column_ranks feature is missing. Can you try running:
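import tensorflow.compat.v1 as tf  # tf_record_iterator is only available through the v1 API
# Print every serialized example in the prediction TFRecord
for example in tf.io.tf_record_iterator("data-output/predict/tabfact/tf_examples/test.tfrecord"):
    print(tf.train.Example.FromString(example))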
And the same for dev.tfrecord?
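Printing whole examples can be hard to read, so a more compact variant of the check above is sketched here. It reports, for the first few records, whether each feature named in the log warnings is present and how many values it holds. This is only an illustration: the helper names are hypothetical, the key list is taken solely from the warnings in the log (the model may expect other features too), and it assumes the records are serialized `tf.train.Example` protos.

```python
# Keys taken from the warnings in the log; the model may expect others too.
KEYS_FROM_LOG = [
    "input_mask", "segment_ids", "inv_column_ranks", "prev_label_ids",
    "numeric_values", "row_ids", "column_ranks",
]


def summarize(lengths, keys=KEYS_FROM_LOG):
    """Map each key to its value count, or None if the key is absent."""
    return {key: lengths.get(key) for key in keys}


def feature_lengths(example):
    """Count the values stored under each feature of a tf.train.Example."""
    lengths = {}
    for name, feature in example.features.feature.items():
        # "kind" is the oneof holding int64_list, float_list or bytes_list.
        kind = feature.WhichOneof("kind")
        lengths[name] = len(getattr(feature, kind).value) if kind else 0
    return lengths


def inspect_tfrecord(path, limit=3):
    """Print a per-feature summary for the first `limit` records."""
    # Imported here so the helpers above can be used without TensorFlow.
    import tensorflow.compat.v1 as tf

    for i, record in enumerate(tf.io.tf_record_iterator(path)):
        if i >= limit:
            break
        example = tf.train.Example.FromString(record)
        print(i, summarize(feature_lengths(example)))
```

For instance, `inspect_tfrecord("data-output/predict/tabfact/tf_examples/test.tfrecord")` prints one dictionary per record. A `None` entry means the feature is absent, and a count that differs between features, or from the sequence length used at prediction time, may also be worth a closer look.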
@eisenjulian, I have solved this problem. Thank you for your help!
Hi, I ran into a problem when running the code in "predict" mode.
If I use the following code:
the error is:
This is my directory structure:
But when I run the code on Colab, there are no errors. Could you please help me with this issue?