tensorflow / models

Models and examples built with TensorFlow

tensorflow.python.framework.errors_impl.FailedPreconditionError: /content/training_demo/images/train; Is a directory [[{{node MultiDeviceIteratorGetNextFromShard}}]] [[RemoteCall]] [Op:IteratorGetNext] #10523

Closed clannoronha closed 2 years ago

clannoronha commented 2 years ago

I am trying to train a custom object detection model by following this tutorial: https://www.youtube.com/watch?v=XoMiveY_1Z4&t=2179s

The step I am on is around the 37-minute mark of the video.

I worked my way through various errors but am stuck on this one. Please help.

Below is the error output for this command:

!python model_main_tf2.py --model_dir=/content/training_demo/models/my_ssd_resnet101_v1_fpn --pipeline_config_path=/content/training_demo/models/my_ssd_resnet101_v1_fpn/pipeline.config


2022-03-04 22:11:27.391233: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
I0304 22:11:27.396766 140105387960192 mirrored_strategy.py:374] Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
INFO:tensorflow:Maybe overwriting train_steps: None
I0304 22:11:27.400508 140105387960192 config_util.py:552] Maybe overwriting train_steps: None
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0304 22:11:27.400657 140105387960192 config_util.py:552] Maybe overwriting use_bfloat16: False
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/object_detection/model_lib_v2.py:564: StrategyBase.experimental_distribute_datasets_from_function (from tensorflow.python.distribute.distribute_lib) is deprecated and will be removed in a future version.
Instructions for updating:
rename to distribute_datasets_from_function
W0304 22:11:27.425837 140105387960192 deprecation.py:343] From /usr/local/lib/python3.7/dist-packages/object_detection/model_lib_v2.py:564: StrategyBase.experimental_distribute_datasets_from_function (from tensorflow.python.distribute.distribute_lib) is deprecated and will be removed in a future version.
Instructions for updating:
rename to distribute_datasets_from_function
INFO:tensorflow:Reading unweighted datasets: ['/content/training_demo/images/train']
I0304 22:11:27.429461 140105387960192 dataset_builder.py:163] Reading unweighted datasets: ['/content/training_demo/images/train']
INFO:tensorflow:Reading record datasets for input file: ['/content/training_demo/images/train']
I0304 22:11:27.429643 140105387960192 dataset_builder.py:80] Reading record datasets for input file: ['/content/training_demo/images/train']
INFO:tensorflow:Number of filenames to read: 1
I0304 22:11:27.429728 140105387960192 dataset_builder.py:81] Number of filenames to read: 1
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
W0304 22:11:27.429801 140105387960192 dataset_builder.py:88] num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/object_detection/builders/dataset_builder.py:105: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.AUTOTUNE) instead. If sloppy execution is desired, use tf.data.Options.deterministic.
W0304 22:11:27.432092 140105387960192 deprecation.py:343] From /usr/local/lib/python3.7/dist-packages/object_detection/builders/dataset_builder.py:105: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.AUTOTUNE) instead. If sloppy execution is desired, use tf.data.Options.deterministic.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/object_detection/builders/dataset_builder.py:237: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.map()
W0304 22:11:27.452108 140105387960192 deprecation.py:343] From /usr/local/lib/python3.7/dist-packages/object_detection/builders/dataset_builder.py:237: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.map()
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:1082: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
W0304 22:11:34.552799 140105387960192 deprecation.py:343] From /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:1082: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:1082: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
seed2 arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
W0304 22:11:37.521306 140105387960192 deprecation.py:343] From /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:1082: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
seed2 arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:1082: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
W0304 22:11:39.238127 140105387960192 deprecation.py:343] From /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:1082: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Traceback (most recent call last):
  File "model_main_tf2.py", line 113, in <module>
    tf.compat.v1.app.run()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/platform/app.py", line 36, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "model_main_tf2.py", line 110, in main
    record_summaries=FLAGS.record_summaries)
  File "/usr/local/lib/python3.7/dist-packages/object_detection/model_lib_v2.py", line 609, in train_loop
    train_input, unpad_groundtruth_tensors)
  File "/usr/local/lib/python3.7/dist-packages/object_detection/model_lib_v2.py", line 400, in load_fine_tune_checkpoint
    _ensure_model_is_built(model, input_dataset, unpad_groundtruth_tensors)
  File "/usr/local/lib/python3.7/dist-packages/object_detection/model_lib_v2.py", line 160, in _ensure_model_is_built
    features, labels = iter(input_dataset).next()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/input_lib.py", line 549, in next
    return self.__next__()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/input_lib.py", line 553, in __next__
    return self.get_next()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/input_lib.py", line 610, in get_next
    return self._get_next_no_partial_batch_handling(name)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/input_lib.py", line 642, in _get_next_no_partial_batch_handling
    replicas.extend(self._iterators[i].get_next_as_list(new_name))
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/input_lib.py", line 1594, in get_next_as_list
    return self._format_data_list_with_options(self._iterator.get_next())
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/data/ops/multi_device_iterator_ops.py", line 580, in get_next
    result.append(self._device_iterators[i].get_next())
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/data/ops/iterator_ops.py", line 889, in get_next
    return self._next_internal()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/data/ops/iterator_ops.py", line 822, in _next_internal
    output_shapes=self._flat_output_shapes)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/gen_dataset_ops.py", line 2923, in iterator_get_next
    _ops.raise_from_not_ok_status(e, name)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py", line 7186, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.FailedPreconditionError: /content/training_demo/images/train; Is a directory
  [[{{node MultiDeviceIteratorGetNextFromShard}}]]
  [[RemoteCall]] [Op:IteratorGetNext]
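
A note on reading the output above: the line "Reading record datasets for input file: ['/content/training_demo/images/train']" and the final "Is a directory" message name the same path, which suggests that an input_path entry in pipeline.config points at a folder rather than at a TFRecord file. The following is only a minimal sketch for checking that, assuming the config uses quoted input_path: "..." entries; the helper names and the .record/.tfrecord extensions are hypothetical and not part of the Object Detection API.

```python
import re
from pathlib import Path

# Matches entries such as: input_path: "/some/path"
# (assumption: the paths in pipeline.config are double-quoted)
INPUT_PATH_RE = re.compile(r'input_path\s*:\s*"([^"]+)"')


def find_directory_input_paths(config_text):
    """Return every input_path value in the config text that is a directory."""
    paths = INPUT_PATH_RE.findall(config_text)
    return [p for p in paths if Path(p).is_dir()]


def suggest_record_files(directory):
    """List files in the directory that look like TFRecord files.

    The extensions checked here are an assumption, not an API requirement.
    """
    folder = Path(directory)
    return sorted(
        str(f) for f in folder.iterdir()
        if f.is_file() and f.suffix in {".record", ".tfrecord"}
    )


# Example usage (path taken from the command in this issue):
#   text = Path("/content/training_demo/models/my_ssd_resnet101_v1_fpn/pipeline.config").read_text()
#   for bad in find_directory_input_paths(text):
#       print(bad, "is a directory; candidate files:", suggest_record_files(bad))
```

If the check reports a directory, the entry would need to name the generated record file itself. Where that file lives depends on how the tutorial's conversion step was run, so it is not assumed here.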

pindinagesh commented 2 years ago

Hi @clannoronha

From the error log, we suspect that you are using some deprecated APIs. Could you try with the latest TF version, and please provide a complete code snippet if you still face the issue? Thank you!

google-ml-butler[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

clannoronha commented 2 years ago

@pindinagesh Hey, I was getting this error while training my model with the following command:

!python model_main_tf2.py --model_dir=/content/training_demo/models/my_ssd_resnet101_v1_fpn --pipeline_config_path=/content/training_demo/models/my_ssd_resnet101_v1_fpn/pipeline.config

It is an object detection model, and here is the source notebook:

https://colab.research.google.com/drive/1nmNnWz2Vbs9d2I62sBpaT-bEeMKsUbb1?usp=sharing

pindinagesh commented 2 years ago

Hi @clannoronha

Could you please grant access to the Colab gist? It will help us analyze the issue. Thank you!

google-ml-butler[bot] commented 2 years ago

Closing as stale. Please reopen if you'd like to work on this further.
