kwotsin / TensorFlow-ENet

TensorFlow implementation of ENet
MIT License

Error when training with new data #6

Closed ducanh841988 closed 7 years ago

ducanh841988 commented 7 years ago

I tried to train ENet with my own data, but it caused an error. It seems to be a problem with the FIFOQueue. Our data has 715 images for training, 105 for validation, and 100 for testing.

Can you help me figure out this problem? Thanks a lot. Anh


2017-08-15 16:48:45.882942: W tensorflow/core/framework/op_kernel.cc:1158] Out of range: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 5, current size 0)
     [[Node: batch = QueueDequeueUpToV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, batch/n)]]
2017-08-15 16:48:45.883176: W tensorflow/core/framework/op_kernel.cc:1158] Out of range: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 5, current size 0)
     [[Node: batch = QueueDequeueUpToV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, batch/n)]]
2017-08-15 16:48:45.883198: W tensorflow/core/framework/op_kernel.cc:1158] Out of range: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 5, current size 0)
     [[Node: batch = QueueDequeueUpToV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, batch/n)]]
2017-08-15 16:48:45.883214: W tensorflow/core/framework/op_kernel.cc:1158] Out of range: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 5, current size 0)
     [[Node: batch = QueueDequeueUpToV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, batch/n)]]
2017-08-15 16:48:45.883861: W tensorflow/core/framework/op_kernel.cc:1158] Out of range: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 5, current size 0)
     [[Node: batch = QueueDequeueUpToV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, batch/n)]]
2017-08-15 16:48:45.886415: W tensorflow/core/framework/op_kernel.cc:1158] Out of range: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 5, current size 0)
     [[Node: batch = QueueDequeueUpToV2[component_types=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, batch/n)]]
Traceback (most recent call last):
  File "train_enet.py", line 357, in <module>
    run()
  File "train_enet.py", line 172, in run
    with slim.arg_scope(ENet_arg_scope(weight_decay=weight_decay)):
  File "/home/leducanh/.pyenv/versions/anaconda3-2.5.0/envs/tensorflow120/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/leducanh/.pyenv/versions/anaconda3-2.5.0/envs/tensorflow120/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 964, in managed_session
    self.stop(close_summary_writer=close_summary_writer)
  File "/home/leducanh/.pyenv/versions/anaconda3-2.5.0/envs/tensorflow120/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 792, in stop
    stop_grace_period_secs=self._stop_grace_secs)
  File "/home/leducanh/.pyenv/versions/anaconda3-2.5.0/envs/tensorflow120/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "/home/leducanh/.pyenv/versions/anaconda3-2.5.0/envs/tensorflow120/lib/python2.7/site-packages/tensorflow/python/training/queue_runner_impl.py", line 238, in _run
    enqueue_callable()
  File "/home/leducanh/.pyenv/versions/anaconda3-2.5.0/envs/tensorflow120/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1063, in _single_operation_run
    target_list_as_strings, status, None)
  File "/home/leducanh/.pyenv/versions/anaconda3-2.5.0/envs/tensorflow120/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/leducanh/.pyenv/versions/anaconda3-2.5.0/envs/tensorflow120/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape mismatch in tuple component 1. Expected [202,360,1], got [202,360,3]
     [[Node: batch/fifo_queue_enqueue = QueueEnqueueV2[Tcomponents=[DT_FLOAT, DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, Squeeze/_2975, Squeeze_1/_2977)]]
kwotsin commented 7 years ago

As the error implies, your FIFO queue is empty. Could you post the part where you enqueue the data? You should make sure the list you feed into the tf.convert_to_tensor function is not empty. The later error shows a shape mismatch, which has to do with the number of channels in your input data.
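
For reference, here is a minimal sketch of such a check, assuming image_files and annotation_files are plain Python lists of file paths (the glob patterns and directory names below are hypothetical; adjust them to your own dataset layout):

    import glob

    # Hypothetical dataset layout; adjust the patterns to your own directories.
    image_files = sorted(glob.glob("dataset/train/*.png"))
    annotation_files = sorted(glob.glob("dataset/trainannot/*.png"))

    # An empty list here is exactly what leaves the FIFO queue with nothing to dequeue.
    assert len(image_files) > 0, "No training images found"
    assert len(image_files) == len(annotation_files), "Images and annotations must pair up"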

ducanh841988 commented 7 years ago

This is the code for enqueuing the training data:

    images = tf.convert_to_tensor(image_files)
    import pdb; pdb.set_trace()
    annotations = tf.convert_to_tensor(annotation_files)
    input_queue = tf.train.slice_input_producer([images, annotations]) #Slice_input producer shuffles the data by default.
    print "Input Queue ", input_queue
    #Decode the image and annotation raw content
    image = tf.read_file(input_queue[0])
    image = tf.image.decode_image(image, channels=3)
    #import pdb; pdb.set_trace()
    annotation = tf.read_file(input_queue[1])
    annotation = tf.image.decode_image(annotation)

    #preprocess and batch up the image and annotation
    preprocessed_image, preprocessed_annotation = preprocess(image, annotation, image_height, image_width)
    images, annotations = tf.train.batch([preprocessed_image, preprocessed_annotation], batch_size=batch_size, allow_smaller_final_batch=True)

and this is the code for the validation data:

    #Load the files into one input queue
    images_val = tf.convert_to_tensor(image_val_files)
    annotations_val = tf.convert_to_tensor(annotation_val_files)
    input_queue_val = tf.train.slice_input_producer([images_val, annotations_val], num_epochs=1)

    #Decode the image and annotation raw content
    image_val = tf.read_file(input_queue_val[0])
    image_val = tf.image.decode_jpeg(image_val, channels=3)
    annotation_val = tf.read_file(input_queue_val[1])
    annotation_val = tf.image.decode_png(annotation_val)

    #preprocess and batch up the image and annotation
    preprocessed_image_val, preprocessed_annotation_val = preprocess(image_val, annotation_val, image_height, image_width)
    images_val, annotations_val = tf.train.batch([preprocessed_image_val, preprocessed_annotation_val], batch_size=eval_batch_size, allow_smaller_final_batch=True)

Note that I used this code to run the CamVid data, and it worked well.

kwotsin commented 7 years ago

@ducanh841988 I edited your error code (it is easier to read this way). If you look at the error: Shape mismatch in tuple component 1. Expected [202,360,1], got [202,360,3], you might be feeding in an RGB image instead of a grayscale annotation image. The ground-truth annotation must be a grayscale image where each pixel is labelled with its class index, not with RGB values.
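
One quick way to rule out the decoding side (a minimal sketch, not the repository's exact code, reusing the input_queue name from the snippet above) is to force the annotation to a single channel at decode time, which matches the [202,360,1] shape the batch queue expects:

    # Decode the annotation as a single-channel PNG, shape [H, W, 1].
    annotation = tf.read_file(input_queue[1])
    annotation = tf.image.decode_png(annotation, channels=1)

Note that this only helps if the PNG already stores class indices per pixel; if it stores per-class colours, the colours have to be mapped to indices first (see the colour-mapping sketch further down).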

rocklinsuv commented 5 years ago

@kwotsin Hi kwotsin,

I recently encountered a similar shape-mismatch issue. The annotation tool I used is Labelme, which produces a 3-channel PNG label image. Will a simple color-to-grayscale conversion (like OpenCV's rgb2gray function) be enough to resolve this issue? Thanks!
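
Going by kwotsin's note above that each annotation pixel must carry a class index, a plain rgb2gray conversion would give grey intensities rather than class indices, so a colour-to-index lookup is the more usual route. Below is a minimal sketch under that assumption; the colour table and file names are hypothetical and not part of this repository:

    import numpy as np
    from PIL import Image

    # Hypothetical colour -> class-index table; fill in the colours your
    # Labelme export actually assigned to each class.
    COLOR_TO_CLASS = {
        (0, 0, 0): 0,      # background
        (128, 0, 0): 1,    # class 1
        (0, 128, 0): 2,    # class 2
    }

    def color_annotation_to_class_indices(path):
        # Map each RGB pixel of a colour-coded annotation to its class index.
        rgb = np.array(Image.open(path).convert("RGB"))
        indices = np.zeros(rgb.shape[:2], dtype=np.uint8)
        for color, class_id in COLOR_TO_CLASS.items():
            indices[np.all(rgb == np.array(color), axis=-1)] = class_id
        return indices  # shape [H, W]; save it out as a single-channel PNG

    # Hypothetical file names.
    Image.fromarray(color_annotation_to_class_indices("label_color.png")).save("label_gray.png")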

SreenijaK commented 4 years ago

@kwotsin Hi kwotsin,

I recently encountered a similar shape-mismatch issue. The annotation tool I used is Labelme, which produces a 3-channel PNG label image. Will a simple color-to-grayscale conversion (like OpenCV's rgb2gray function) be enough to resolve this issue? Thanks!

@kwotsin Were you able to solve the issue? Did rgb2gray work for you?