irolaina / FCRN-DepthPrediction

Deeper Depth Prediction with Fully Convolutional Residual Networks (FCRN)
BSD 2-Clause "Simplified" License

NYUDepth Training Not Working #74

Closed. lpqdao closed this issue 5 years ago.

lpqdao commented 6 years ago

Hello,

I am running your predict_nick code to train on the NYU Depth dataset. Following nyudepth.py, I downloaded the dataset and converted the .mat file to .png files. However, when I run the code, this is what I get:
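
For reference, my conversion step looked roughly like the sketch below. The dataset file name, output layout, and the millimetre (uint16) depth scale are assumptions on my part; I am not certain this matches exactly what nyudepth.py expects:

```python
# Rough sketch of the .mat -> .png conversion (assumptions: the labeled
# nyu_depth_v2_labeled.mat read via h5py, depth saved as 16-bit PNG in
# millimetres; the exact scale/layout nyudepth.py expects may differ).
import os
import h5py
import numpy as np
from PIL import Image

mat_path = 'nyu_depth_v2_labeled.mat'   # hypothetical location
out_dir = 'nyudepth_png'                # hypothetical output layout
os.makedirs(os.path.join(out_dir, 'image'), exist_ok=True)
os.makedirs(os.path.join(out_dir, 'depth'), exist_ok=True)

with h5py.File(mat_path, 'r') as f:
    images = f['images']   # roughly (N, 3, 640, 480) uint8 when read via h5py
    depths = f['depths']   # roughly (N, 640, 480) float, metres

    for i in range(images.shape[0]):
        rgb = np.transpose(images[i], (2, 1, 0))          # -> (480, 640, 3)
        depth_m = np.transpose(depths[i], (1, 0))         # -> (480, 640)
        depth_mm = (depth_m * 1000.0).astype(np.uint16)   # assumed mm scale

        Image.fromarray(rgb).save(os.path.join(out_dir, 'image', '%04d.png' % i))
        Image.fromarray(depth_mm).save(os.path.join(out_dir, 'depth', '%04d.png' % i))
```

The 16-bit choice is based on the log below, where the depth PNGs are decoded as dtype=uint16.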

[fcrn] Selected Params:

Namespace(batch_size=4, data_aug=False, data_path='~/tensorflow/FCRN-DepthPrediction/tensorflow/modules/datasets', dataset='nyudepth', dropout=0.5, full_summary=False, gpu='0', image_path='', l2norm=True, ldecay=True, learning_rate=0.0001, log_directory='log_tb/', loss='berhu', machine='grace', max_steps=75000, mode='train', model_name='fcrn', model_path='', output_directory='', px='all', remove_sky=False, show_test_results=False, show_train_progress=True, show_valid_progress=True)

[fcrn] Selected mode: Train
nyudepth
NYU Dataset selected
[Dataloader] NyuDepth object created.

[Dataloader] Loading 'data/nyudepth_train.txt'... (795, 2) time: 0.00011992454528808594 s

Summary - TrainData image_filenames: 795 depth_filenames: 795

[Dataloader] Loading 'data/nyudepth_test.txt'... (654, 2) time: 8.034706115722656e-05 s

Summary - TestData (Validation Set) image_filenames: 654 depth_filenames: 654

[Dataloader] dataloader object created.
nyudepth
~/tensorflow/FCRN-DepthPrediction/tensorflow/modules/datasets/
480 640 3
480 640 1
<class 'list'>
tf image filenames Tensor("Const_4:0", shape=(795,), dtype=string)
tf depth filenames Tensor("Const_5:0", shape=(795,), dtype=string)
tf train input queue [<tf.Tensor 'input_producer/GatherV2:0' shape=() dtype=string>, <tf.Tensor 'input_producer/GatherV2_1:0' shape=() dtype=string>]
tf image file ---------------------- Tensor("ReadFile:0", shape=(), dtype=string)
tf depth file ----------------------- Tensor("ReadFile_1:0", shape=(), dtype=string)
tf_image_key: Tensor("Print_1:0", shape=(), dtype=string)
tf_depth_key: Tensor("input_producer/GatherV2_1:0", shape=(), dtype=string)
tf_image_file: Tensor("Print_2:0", shape=(), dtype=string)
tf_depth_file: Tensor("ReadFile_1:0", shape=(), dtype=string)
tf_image: Tensor("DecodePng:0", shape=(?, ?, 3), dtype=uint8)
tf_depth: Tensor("DecodePng_1:0", shape=(?, ?, 1), dtype=uint16)
tf_image_shape: Tensor("Shape:0", shape=(3,), dtype=int32)
tf_depth_shape: Tensor("Shape_1:0", shape=(3,), dtype=int32)

[Network/Model] Build Network Model...

[Network/Train] Training Tensors created.
Tensor("model/Input/batch_1:0", shape=(4, 228, 304, 3), dtype=float32)
Tensor("model/Input/batch_1:1", shape=(4, 228, 304, 3), dtype=uint8)
Tensor("model/Input/batch_1:2", shape=(4, 128, 160, 1), dtype=float32)
<tf.Variable 'model/Train/global_step:0' shape=() dtype=int32_ref>
Tensor("model/Train/learning_rate_1:0", shape=(), dtype=float32)

DEBUG------------------------- Tensor("model_1/ReadFile:0", shape=(), dtype=string)
DEBUG--------------------------- Tensor("model_1/DecodePng:0", shape=(?, ?, 3), dtype=uint8)
[Network/Validation] Validation Tensors created.
Tensor("model_1/convert_image:0", shape=(?, ?, 3), dtype=float32)
Tensor("model_1/truediv:0", shape=(?, ?, 1), dtype=float32)
Tensor("model_1/resize_images/Squeeze:0", shape=(228, 304, 3), dtype=float32)
Tensor("model_1/convert_image_1:0", shape=(228, 304, 3), dtype=uint8)
Tensor("model_1/resize_images_1/Squeeze:0", shape=(128, 160, 1), dtype=float32)

[Network/Loss] Loss: All Pixels
[Network/Loss] Loss Function: BerHu
[Network/Model] Number of trainable parameters: 63572737

Train with approximately 377 epochs

[Network/Training] Initializing graph's variables...
[Network/Training] Training Initialized!

Traceback (most recent call last):
  File "/home/grace/tensorflow/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
    return fn(*args)
  File "/home/grace/tensorflow/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/grace/tensorflow/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_2_model/Input/batch_1/fifo_queue' is closed and has insufficient elements (requested 4, current size 0)
  [[Node: model/Input/batch_1 = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](model/Input/batch_1/fifo_queue, model/Input/batch/n)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "predict_nick.py", line 585, in <module>
    tf.app.run(main=main(args))
  File "predict_nick.py", line 566, in main
    train(args)
  File "predict_nick.py", line 308, in train
    model.tf_summary_train_loss])
  File "/home/grace/tensorflow/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/home/grace/tensorflow/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/grace/tensorflow/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/home/grace/tensorflow/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_2_model/Input/batch_1/fifo_queue' is closed and has insufficient elements (requested 4, current size 0)
  [[Node: model/Input/batch_1 = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](model/Input/batch_1/fifo_queue, model/Input/batch/n)]]

Caused by op 'model/Input/batch_1', defined at:
  File "predict_nick.py", line 585, in <module>
    tf.app.run(main=main(args))
  File "predict_nick.py", line 566, in main
    train(args)
  File "predict_nick.py", line 258, in train
    model = Model(args, data)
  File "/home/grace/tensorflow/FCRN-DepthPrediction/tensorflow/modules/framework.py", line 50, in __init__
    self.build_model(data)
  File "/home/grace/tensorflow/FCRN-DepthPrediction/tensorflow/modules/framework.py", line 64, in build_model
    self.train = Train(self.args, data.tf_train_image_key, data.tf_train_image, data.tf_train_depth_key, data.tf_train_depth, self.input_size, self.output_size, data.datasetObj.max_depth, data.dataset_name, self.args.data_aug)
  File "/home/grace/tensorflow/FCRN-DepthPrediction/tensorflow/modules/train.py", line 95, in __init__
    tf_batch_image_resized, tf_batch_image_resized_uint8, tf_batch_depth_resized = tf.train.batch([self.tf_image_resized, self.tf_image_resized_uint8, self.tf_depth_resized], batch_size, num_threads, capacity, shapes=[input_size.getSize(), input_size.getSize(), output_size.getSize()])
  File "/home/grace/tensorflow/venv/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 988, in batch
    name=name)
  File "/home/grace/tensorflow/venv/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 762, in _batch
    dequeued = queue.dequeue_many(batch_size, name=name)
  File "/home/grace/tensorflow/venv/lib/python3.6/site-packages/tensorflow/python/ops/data_flow_ops.py", line 483, in dequeue_many
    self._queue_ref, n=n, component_types=self._dtypes, name=name)
  File "/home/grace/tensorflow/venv/lib/python3.6/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3480, in queue_dequeue_many_v2
    component_types=component_types, timeout_ms=timeout_ms, name=name)
  File "/home/grace/tensorflow/venv/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/grace/tensorflow/venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
    op_def=op_def)
  File "/home/grace/tensorflow/venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1740, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

OutOfRangeError (see above for traceback): FIFOQueue '_2_model/Input/batch_1/fifo_queue' is closed and has insufficient elements (requested 4, current size 0)
  [[Node: model/Input/batch_1 = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](model/Input/batch_1/fifo_queue, model/Input/batch/n)]]
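
From what I understand, this OutOfRangeError in a TF1 queue pipeline usually means the producer threads closed the queue before enqueueing anything, most commonly because the reader/decoder ops failed (for example, a listed file does not exist or cannot be decoded as PNG) or the queue runners were never started. As a sanity check I am planning to verify that every path in data/nyudepth_train.txt resolves to a real file. A minimal sketch is below; the two-column list format and the path joining are assumptions based on the log above, and note that TensorFlow's file readers do not expand '~':

```python
# Sanity check: confirm every image/depth path listed in the train split
# actually exists on disk. Assumptions: nyudepth_train.txt has two
# whitespace-separated columns (image path, depth path), possibly relative
# to data_path; '~' must be expanded explicitly, since tf.read_file will not.
import os

data_path = os.path.expanduser('~/tensorflow/FCRN-DepthPrediction/tensorflow/modules/datasets/')
split_file = 'data/nyudepth_train.txt'

missing = 0
with open(split_file) as f:
    for line in f:
        if not line.strip():
            continue
        for rel in line.split():
            path = rel if os.path.isabs(rel) else os.path.join(data_path, rel)
            path = os.path.expanduser(path)
            if not os.path.isfile(path):
                missing += 1
                print('missing:', path)

print('missing files:', missing)
```

If every file checks out, the next thing I would look at is whether the queue runners are started (tf.train.start_queue_runners with a tf.train.Coordinator) before the first session.run.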

Have you encountered this problem? If so, do you know what is wrong and how to fix it?

Thank you in advance, Grace

chrirupp commented 5 years ago

nope