tuple shape mismatch error while training

Srijita173 commented 6 years ago

before d gen 2 4 2048
before d real 2 4 2048
2018-07-03 11:11:01.074867: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-07-03 11:11:01.117290: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties: name: Quadro K420 major: 3 minor: 0 memoryClockRate(GHz): 0.8755 pciBusID: 0000:03:00.0 totalMemory: 1.94GiB freeMemory: 1.35GiB
2018-07-03 11:11:01.117324: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Ignoring visible gpu device (device: 0, name: Quadro K420, pci bus id: 0000:03:00.0, compute capability: 3.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
2018-07-03 11:11:01.117336: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-03 11:11:01.117346: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2018-07-03 11:11:01.117353: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2018-07-03 11:11:02.343335: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
2018-07-03 11:11:02.412697: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
2018-07-03 11:11:02.428441: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
2018-07-03 11:11:02.428441: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
2018-07-03 11:11:02.429115: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
2018-07-03 11:11:02.437389: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
2018-07-03 11:11:02.437405: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
2018-07-03 11:11:02.441379: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
2018-07-03 11:11:02.449401: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
2018-07-03 11:11:02.451722: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
2018-07-03 11:11:02.457215: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
2018-07-03 11:11:02.457665: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
2018-07-03 11:11:02.458851: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
2018-07-03 11:11:02.459306: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
2018-07-03 11:11:03.544010: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
2018-07-03 11:11:03.636651: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at queue_ops.cc:105 : Invalid argument: Shape mismatch in tuple component 1. Expected [128,128,1], got [128,128,3]
Traceback (most recent call last):
  File "train.py", line 158, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "train.py", line 154, in main
    train()
  File "train.py", line 89, in train
    _, g_loss_value, d_loss_value = sess.run([train_op, dcgan.losses['g'], dcgan.losses['d']])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: RandomShuffleQueue '_2_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 64, current size 0)
  [[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]

Caused by op u'shuffle_batch', defined at:
  File "train.py", line 158, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "train.py", line 154, in main
    train()
  File "train.py", line 54, in train
    images, segmentation_images = load_image_and_segmentation_from_idlist(idlist_tensor_img, idlist_tensor_seg, batch_size, 16, 2560, shift_params = [-128, -0.5], rescale_params = [128, 0.5], shuffle = True)
  File "/home/nikhitha/Srijita_work/Image_Preprocess/Generative-Adversarial-Network-based-Synthesis-for-Supervised-Medical-Image-Segmentation-master/load_folder_images.py", line 84, in load_image_and_segmentation_from_idlist
    min_after_dequeue=min_queue_examples)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 1300, in shuffle_batch
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 846, in _shuffle_batch
    dequeued = queue.dequeue_many(batch_size, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/data_flow_ops.py", line 483, in dequeue_many
    self._queue_ref, n=n, component_types=self._dtypes, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3480, in queue_dequeue_many_v2
    component_types=component_types, timeout_ms=timeout_ms, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

OutOfRangeError (see above for traceback): RandomShuffleQueue '_2_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 64, current size 0)
  [[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]

Hi, I am getting this tuple shape mismatch error. Can you please provide me with a solution? It is urgent.

thomasneff commented 6 years ago

Hi,

based on your error log, I suspect that either your images or segmentation masks are not single-channel, but are instead stored as RGB images. You will have to either change the network architecture to handle RGB images (and remove the set_shape call in load_folder_images) or make sure your images are single-channel via tensorflow functions or manual preprocessing. (This needs to be done before the set_shape call in decode_and_preprocess_image).

Additionally, it seems like your GPU is ignored because its compute capability is too low - I have never tested/run this on a CPU, so you might encounter operations that are not supported on the CPU and it might just not work at all.
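
Whether a PNG file is single-channel or RGB can be checked without TensorFlow by reading its IHDR header. The following is only a minimal sketch and not code from this repository: png_channels is a hypothetical helper name, and it assumes the inputs are PNG files. Color type 0 means grayscale (1 channel) and color type 2 means RGB (3 channels), which corresponds to the [128,128,1] versus [128,128,3] mismatch in the log.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Channels per pixel for each PNG color type (from the PNG specification).
# Palette images (type 3) store one palette index per pixel.
CHANNELS_BY_COLOR_TYPE = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}


def png_channels(path):
    """Return (width, height, channels) read from a PNG file's IHDR chunk."""
    with open(path, "rb") as f:
        header = f.read(33)  # 8-byte signature + 25-byte IHDR chunk
    if len(header) < 33 or header[:8] != PNG_SIGNATURE:
        raise ValueError("%s is not a PNG file" % path)
    if header[12:16] != b"IHDR":
        raise ValueError("%s has no IHDR chunk where expected" % path)
    # IHDR data starts at byte 16: width, height, bit depth, color type.
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return width, height, CHANNELS_BY_COLOR_TYPE[color_type]
```

If this reports 3 channels for an image or a segmentation mask, that file is stored as RGB and would need converting before it reaches the set_shape call.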

aditipanda commented 5 years ago

@thomasneff

I did not get a tuple mismatch error, but I did get a RandomShuffleQueue error. I posted about this in another issue to your code.

Here, you mentioned that images need to be single channel or the code needs to be altered to handle RGB images. In my case, images are single-channel. They are taken from the ISBI EM 2012 challenge and are gray scale images with dimensions 512-by-512.

Then why am I getting the error? Please read about my issue in https://github.com/thomasneff/Generative-Adversarial-Network-based-Synthesis-for-Supervised-Medical-Image-Segmentation/issues/7

thomasneff commented 5 years ago

Hi,

your referenced issue sounds similar to #1, where the file names/folder names were not set to what the code expects. You can probably verify whether this is the case by stepping through the initial part where the shuffle queues are set up from the file paths and checking whether the files are found correctly (e.g. check whether string_tensor_from_idlist_and_path returns a valid string tensor of filenames and/or check whether the filenames are parsed correctly).
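
A quick way to check whether the files are found, independent of TensorFlow, is to build the expected paths from the id list and test each one on disk. This is only a sketch: missing_files is a hypothetical helper, and the way it joins folder, id, and suffix is an assumption about what string_tensor_from_idlist_and_path does, not a copy of it.

```python
import os


def missing_files(idlist_path, folder_path, suffix=""):
    """Return the paths built from an id list that do not exist on disk.

    Assumption: each non-empty line of the id list is one id, and the
    expected file is folder_path + id + suffix.
    """
    with open(idlist_path) as f:
        ids = [line.strip() for line in f if line.strip()]
    paths = [os.path.join(folder_path, image_id + suffix) for image_id in ids]
    return [p for p in paths if not os.path.isfile(p)]
```

An empty result means every listed file was found. Calling it once with suffix="" and once with suffix=".png" shows which naming scheme matches the folder contents.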

aditipanda commented 5 years ago

Hi @thomasneff I looked into https://github.com/thomasneff/Generative-Adversarial-Network-based-Synthesis-for-Supervised-Medical-Image-Segmentation/issues/1 and followed it thoroughly. I have done everything that is asked for there, except that my files are in jpg instead of png.

So I converted my files to PNG and tried again. But I'm getting the same error. I printed the file names in the function string_tensor_from_idlist_and_path defined in util.py to see if correct names are being read. This is the error:

['./train-images/train-image-0.png', './train-images/train-image-1.png', './train-images/train-image-2.png', './train-images/train-image-3.png', './train-images/train-image-4.png', './train-images/train-image-5.png', './train-images/train-image-6.png', './train-images/train-image-7.png', './train-images/train-image-8.png', './train-images/train-image-9.png', './train-images/train-image-10.png', './train-images/train-image-11.png', './train-images/train-image-12.png', './train-images/train-image-13.png', './train-images/train-image-14.png', './train-images/train-image-15.png', './train-images/train-image-16.png', './train-images/train-image-17.png', './train-images/train-image-18.png', './train-images/train-image-19.png', './train-images/train-image-20.png', './train-images/train-image-21.png', './train-images/train-image-22.png', './train-images/train-image-23.png', './train-images/train-image-24.png', './train-images/train-image-25.png', './train-images/train-image-26.png', './train-images/train-image-27.png', './train-images/train-image-28.png', './train-images/train-image-29.png']
['./train-labels/train-label-0.png', './train-labels/train-label-1.png', './train-labels/train-label-2.png', './train-labels/train-label-3.png', './train-labels/train-label-4.png', './train-labels/train-label-5.png', './train-labels/train-label-6.png', './train-labels/train-label-7.png', './train-labels/train-label-8.png', './train-labels/train-label-9.png', './train-labels/train-label-10.png', './train-labels/train-label-11.png', './train-labels/train-label-12.png', './train-labels/train-label-13.png', './train-labels/train-label-14.png', './train-labels/train-label-15.png', './train-labels/train-label-16.png', './train-labels/train-label-17.png', './train-labels/train-label-18.png', './train-labels/train-label-19.png', './train-labels/train-label-20.png', './train-labels/train-label-21.png', './train-labels/train-label-22.png', './train-labels/train-label-23.png', './train-labels/train-label-24.png', './train-labels/train-label-25.png', './train-labels/train-label-26.png', './train-labels/train-label-27.png', './train-labels/train-label-28.png', './train-labels/train-label-29.png']
Reading images and labels...
Printing size of images and labels...
(128, 128, 128, 1)
(128, 128, 128, 1)
before d gen 2 4 2048
before d real 2 4 2048
2018-09-04 22:45:44.587567: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-09-04 22:45:44.660586: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-09-04 22:45:44.660867: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties: name: GeForce GTX 980 Ti major: 5 minor: 2 memoryClockRate(GHz): 1.076 pciBusID: 0000:01:00.0 totalMemory: 5.93GiB freeMemory: 5.57GiB
2018-09-04 22:45:44.660881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-09-04 22:45:44.839717: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-09-04 22:45:44.839746: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0
2018-09-04 22:45:44.839751: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N
2018-09-04 22:45:44.839924: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5346 MB memory) -> physical GPU (device: 0, name: GeForce GTX 980 Ti, pci bus id: 0000:01:00.0, compute capability: 5.2)
Traceback (most recent call last):
  File "train.py", line 156, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "train.py", line 152, in main
    train()
  File "train.py", line 87, in train
    _, g_loss_value, d_loss_value = sess.run([train_op, dcgan.losses['g'], dcgan.losses['d']])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 905, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1140, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: RandomShuffleQueue '_2_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 128, current size 0)
  [[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]

Caused by op u'shuffle_batch', defined at:
  File "train.py", line 156, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "train.py", line 152, in main
    train()
  File "train.py", line 51, in train
    images, segmentation_images = load_image_and_segmentation_from_idlist(idlist_tensor_img, idlist_tensor_seg, batch_size, 16, 2560, shift_params = [-128, -0.5], rescale_params = [128, 0.5], shuffle = True)
  File "/home/aditi/Documents/PhD/Steel_Seg/Data/Ground-Truth-Generation/GAN_trained_to_get_GT/Generative-Adversarial-Network-based-Synthesis-for-Supervised-Medical-Image-Segmentation-master/load_folder_images.py", line 92, in load_image_and_segmentation_from_idlist
    min_after_dequeue=min_queue_examples)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 1301, in shuffle_batch
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/input.py", line 847, in _shuffle_batch
    dequeued = queue.dequeue_many(batch_size, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/data_flow_ops.py", line 483, in dequeue_many
    self._queue_ref, n=n, component_types=self._dtypes, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3476, in queue_dequeue_many_v2
    component_types=component_types, timeout_ms=timeout_ms, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3290, in create_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1654, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

OutOfRangeError (see above for traceback): RandomShuffleQueue '_2_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 128, current size 0)
  [[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]

In train.py, the following is defined:

idlist_img_name = "list_of_img_ids.txt"
idlist_seg_name = "list_of_seg_ids.txt"
img_folder_path = "./train-images/"
seg_folder_path = "./train-labels/"

I have added "/" at the end of folder_path variables. Still I am getting the same error. Is there any other variable that I could print out and check to see if images are being read properly? How do I solve this? Please help.

thomasneff commented 5 years ago

Hi,

try using absolute paths (i.e. /home/.../.../train-images/) instead of relative paths - it's been a while so I don't know how Linux/string_tensor_from_idlist_and_path expects the paths.

aditipanda commented 5 years ago

Same error :( @thomasneff

I know I'm making some stupid mistake but don't know what.

img_folder_path = "/home/user/Documents/PhD/Steel_Seg/train-images/"
seg_folder_path = "/home/user/Documents/PhD/Steel_Seg/train-labels/"

thomasneff commented 5 years ago

Hi,

then I'm not sure what else to suggest - the error ( "(requested 128, current size 0)") definitely suggests that something with the image loading isn't working.

You could try to load an image using the paths you're outputting manually and see if this works - similar to what is done in load_image in load_folder_images.py.

For example, try to take one of your file paths that you outputted before, and use a tf.WholeFileReader to read it. Check if this works. Then check if tf.image.decode_png works for this file.
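
Before running the TensorFlow readers, it can also help to confirm that the file really is a PNG, since renaming a .jpg file to .png does not change its contents, and a file that is not a valid PNG would most likely make tf.image.decode_png fail. The sketch below is not part of the repository; sniff_image_format is a hypothetical helper that only inspects the leading magic bytes and does not replace the TensorFlow check suggested above.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"


def sniff_image_format(path):
    """Guess the image format from the first bytes of the file."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(PNG_MAGIC):
        return "png"
    if head.startswith(JPEG_MAGIC):
        return "jpeg"
    return "unknown"
```

A result of "jpeg" for a file whose name ends in .png would point to a renamed rather than converted image.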

aditipanda commented 5 years ago

@thomasneff I finally got it running. There was an error in the file names. Apparently, if the name does not contain ".png", then the ids.txt file should also not contain ".png".

Changed that one thing and now it's training.

Thank you so much for your prompt response. Much appreciated :) I will bug you again if I face more problems with this code base. Thanks again!

Ng-ms commented 5 years ago

@appyfizzA I am having the same error. When I remove the path to the .txt file and use just the file name, this error is raised:

idlist_img_name = 'imagetxt.txt'
idlist_seg_name = 'labeltext.txt'
IOError: [Errno 2] No such file or directory: 'imagetxt.txt'

What did you do to make it work?
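
A bare file name such as 'imagetxt.txt' is resolved against the current working directory, so this IOError typically appears when the script is started from a different folder than the one holding the id lists. One possible workaround, sketched here under the assumption that the lists sit next to the script, is to build the path from the script's location. resolve_next_to is a hypothetical helper, not something the repository provides.

```python
import os


def resolve_next_to(script_path, file_name):
    """Return an absolute path for file_name in the same folder as script_path."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(script_dir, file_name)


# Hypothetical usage inside train.py:
# idlist_img_name = resolve_next_to(__file__, "imagetxt.txt")
```

Printing os.getcwd() just before the file is opened is another simple way to see which folder the relative name is being resolved against.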

pkraison commented 3 years ago

Hi,

based on your error log, I suspect that either your images or segmentation masks are not single-channel, but are instead stored as RGB images. You will have to either change the network architecture to handle RGB images (and remove the set_shape call in load_folder_images) or make sure your images are single-channel via tensorflow functions or manual preprocessing. (This needs to be done before the set_shape call in decode_and_preprocess_image).

Additionally, it seems like your GPU is ignored because its compute capability is too low - I have never tested/run this on a CPU, so you might encounter operations that are not supported on the CPU and it might just not work at all.

Hi @thomasneff, can you help with how we can make it run with RGB images? What changes exactly do we need to get the network working for RGB?

thomasneff / Generative-Adversarial-Network-based-Synthesis-for-Supervised-Medical-Image-Segmentation

tuple shape mismatch error while training #6