yangxue0827 / R-DFPN_FPN_Tensorflow

R-DFPN: Rotation Dense Feature Pyramid Networks (Tensorflow)
http://www.mdpi.com/2072-4292/10/1/132

fail to train #11

Closed: Arthur-Shi closed this issue 6 years ago

Arthur-Shi commented 6 years ago

```
2018-09-09 20:40:36: step1120 image_name:20180727008OK.bmp |
rpn_loc_loss:0.0507980324328 | rpn_cla_loss:0.00724989641458 | rpn_total_loss:0.058047927916 |
fast_rcnn_loc_loss:0.0158959124237 | fast_rcnn_cla_loss:0.00159495836124 | fast_rcnn_total_loss:0.0174908712506 |
total_loss:0.835565567017 | per_cost_time:1.12856292725s
Traceback (most recent call last):
  File "train.py", line 298, in <module>
    train()
  File "train.py", line 259, in train
    fast_rcnn_total_loss, total_loss, train_op])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: PaddingFIFOQueue '_1_get_batch/batch/padding_fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
    [[Node: get_batch/batch = QueueDequeueManyV2[component_types=[DT_STRING, DT_FLOAT, DT_INT32, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](get_batch/batch/padding_fifo_queue, get_batch/batch/n)]]

Caused by op u'get_batch/batch', defined at:
  File "train.py", line 298, in <module>
    train()
  File "train.py", line 36, in train
    is_training=True)
  File "/home/max/R-DFPN_FPN_Tensorflow/data/io/read_tfrecord.py", line 85, in next_batch
    dynamic_pad=True)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 988, in batch
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 762, in _batch
    dequeued = queue.dequeue_many(batch_size, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/data_flow_ops.py", line 483, in dequeue_many
    self._queue_ref, n=n, component_types=self._dtypes, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3480, in queue_dequeue_many_v2
    component_types=component_types, timeout_ms=timeout_ms, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

OutOfRangeError (see above for traceback): PaddingFIFOQueue '_1_get_batch/batch/padding_fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
    [[Node: get_batch/batch = QueueDequeueManyV2[component_types=[DT_STRING, DT_FLOAT, DT_INT32, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](get_batch/batch/padding_fifo_queue, get_batch/batch/n)]]
```

I converted the tfrecords successfully, without a segmentation fault, but the error above appears during training.

What should I modify? Thank you so much for your help in advance. @yangxue0827
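
This `OutOfRangeError` on a `PaddingFIFOQueue` usually means the background enqueue threads died (for example, on a record or image that fails to decode), so the queue was closed before a full batch could be dequeued; the queue error itself only masks the real failure. One way to narrow it down is to check that every record in the tfrecord file parses. Below is a minimal sketch, assuming TF 1.x (as in the traceback) and a placeholder tfrecord path; it is not part of the repository:

```python
# Minimal sketch (not from the repo): verify that every record in a
# tfrecord file can be read and parsed as a tf.train.Example.
import tensorflow as tf

TFRECORD_PATH = 'data/tfrecords/train.tfrecord'  # hypothetical path

count = 0
try:
    # tf_record_iterator (TF 1.x) yields serialized records and raises
    # DataLossError if the file itself is truncated or corrupt.
    for record in tf.python_io.tf_record_iterator(TFRECORD_PATH):
        example = tf.train.Example()
        example.ParseFromString(record)  # raises if the record is malformed
        count += 1
except Exception as e:
    print('Record %d failed: %s' % (count, e))
print('Parsed %d records' % count)
```

If this script fails partway through, the failing record points at the data that kills the reader thread during training.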

Arthur-Shi commented 6 years ago

I solved this problem by deleting some data files that had strange names.
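
Since the reader thread can die silently on files it cannot handle, one way to find such files up front is to scan the dataset for names outside a conservative character set. A minimal sketch, assuming a hypothetical image directory:

```python
# Minimal sketch: flag file names containing characters outside a
# conservative whitelist (letters, digits, dot, underscore, hyphen),
# the kind of "strange name" that can break conversion or reading.
import os
import re

IMAGE_DIR = 'data/images'  # hypothetical dataset directory
SAFE_NAME = re.compile(r'^[A-Za-z0-9._-]+$')

for name in sorted(os.listdir(IMAGE_DIR)):
    if not SAFE_NAME.match(name):
        print('suspicious file name: %r' % name)
```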