tensorlayer / SRGAN

Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
https://github.com/tensorlayer/tensorlayerx

Input shape axis 0 must equal 4, got shape [3] #167

Open yonghuixu opened 5 years ago

yonghuixu commented 5 years ago

Error:
Epoch: [1003/10] step: [1003/1] G_init time: 0.30186986923217773s, mse: 0.029376816004514694
2019-07-27 20:02:30.701958: W tensorflow/core/framework/op_kernel.cc:1431] OP_REQUIRES failed at iterator_ops.cc:988 : Invalid argument: Input shape axis 0 must equal 4, got shape [3]
[[{{node crop_to_bounding_box/unstack}}]]

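With a small dataset the code runs normally, but I need a somewhat larger dataset, and then the error above appears. The results described below were produced with a dataset of 9999 images.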

Today, working through it step by step, I found that the problem is not batch_size but tf.data.Dataset.from_generator(generator_train, output_types=(tf.float32,tf.float32)). My reasoning is as follows:

1. Here is my generator_train(). Working backwards, I printed imglr.shape inside it:

    def generator_train():
        for imglr, imghr in zip(train_lr_imgs, train_hr_imgs):
            print(imglr.shape)
            yield imglr, imghr

The output is as follows:

Epoch: [0/20] step: [202/2] time: 0.293165922164917s, mse: 0.015779396519064903
(218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3)
Epoch: [0/20] step: [203/2] time: 0.3003373146057129s, mse: 0.01940624974668026
(218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178) (218, 178, 3)
Epoch: [0/20] step: [204/2] time: 0.31006765365600586s, mse: 0.012748796492815018
(218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3)
Epoch: [0/20] step: [205/2] time: 0.2964756488800049s, mse: 0.01310880295932293
(218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3)
2019-07-29 20:02:18.539915: W tensorflow/core/framework/op_kernel.cc:1431] OP_REQUIRES failed at iterator_ops.cc:988 : Invalid argument: Input shape axis 0 must equal 4, got shape [3]
[[{{node crop_to_bounding_box/unstack}}]]

At step == 204 (the shapes are printed first, then the line Epoch: [0/20] step: [204/2]), one of the images has shape (218, 178) instead of (218, 178, 3). (The message "Input shape axis 0 must equal 4, got shape [3]" appears because I used tf.image.crop_to_bounding_box() in _map_fn_train(imglr, imghr). Switching to tf.image.random_crop, as in the original source, gives "Incompatible shapes: [2] vs. [3]" instead. So the error must be caused by this shape.) I therefore checked every image in the dataset and printed its third dimension. All of them were 3, so my dataset is not the problem either.

Then I wondered whether my zip was changing the image shapes. Since I could not verify that directly, I ran the train.py from the original source (it does not use zip; I only changed the lr and hr sizes). It failed as follows:

Epoch: [0/20] step: [185/2] time: 0.293s, mse: 0.003
(218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3)
Epoch: [0/20] step: [186/2] time: 0.301s, mse: 0.003
(218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3)
Epoch: [0/20] step: [187/2] time: 0.309s, mse: 0.003
(218, 178) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3)
Epoch: [0/20] step: [188/2] time: 0.306s, mse: 0.005
(218, 178, 3)
2019-07-29 20:23:08.229313: W tensorflow/core/framework/op_kernel.cc:1431] OP_REQUIRES failed at iterator_ops.cc:988 : Invalid argument: Incompatible shapes: [2] vs. [3]
[[{{node random_crop/GreaterEqual}}]]
Traceback (most recent call last):
  File "new_train.py", line 204, in
(218, 178, 3)
    train()
(218, 178, 3)
  File "new_train.py", line 94, in train
    for step, (lr_patchs, hr_patchs) in enumerate(train_ds):
(218, 178, 3)
  File "/home/xyh/anaconda3/envs/tf2/lib/python3.7/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 556, in next
    return self.next()
  File "/home/xyh/anaconda3/envs/tf2/lib/python3.7/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 585, in next
    return self._next_internal()
  File "/home/xyh/anaconda3/envs/tf2/lib/python3.7/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 577, in _next_internal
(218, 178, 3)
    output_shapes=self._flat_output_shapes)
  File "/home/xyh/anaconda3/envs/tf2/lib/python3.7/site-packages/tensorflow/python/ops/gen_dataset_ops.py", line 1954, in iterator_get_next_sync
    _six.raise_from(_core._status_to_exception(e.code, message), None)
  File "", line 3, in raise_from
(218, 178, 3)

Here, at step == 188, one of the images has shape (218, 178) instead of (218, 178, 3), and the error again reads "Incompatible shapes: [2] vs. [3]" (the same as above). So zip is not the cause either.

In summary, the error can only come from tf.data.Dataset.from_generator(). (My guess is that once there are somewhat more images, the shapes of some of them get squeezed.) So I would like to ask everyone: is there an alternative to tf.data.Dataset.from_generator()?
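
Both error messages are consistent with a rank-2 array reaching the crop op. As a rough illustration only (plain NumPy, not TensorFlow's actual implementation), comparing the shape of a (218, 178) image element-wise against a three-element crop size fails, which resembles the "[2] vs. [3]" mismatch reported at the random_crop/GreaterEqual node. The crop size (96, 96, 3) and the helper name shape_fits are assumptions made for this sketch.

```python
import numpy as np


def shape_fits(image, crop_size):
    """Return True if every image dimension is >= the requested crop size.

    Raises ValueError when the image rank differs from len(crop_size),
    which is the situation a 2D (H, W) array creates.
    """
    image_shape = np.array(image.shape)
    size = np.array(crop_size)
    # Element-wise comparison: arrays of length 2 and 3 cannot be broadcast.
    return bool(np.all(np.greater_equal(image_shape, size)))


rgb = np.zeros((218, 178, 3), dtype=np.float32)
gray = np.zeros((218, 178), dtype=np.float32)

# Rank 3 against a size of length 3: the comparison works.
print(shape_fits(rgb, (96, 96, 3)))

# Rank 2 against a size of length 3: the comparison itself fails.
try:
    shape_fits(gray, (96, 96, 3))
except ValueError as err:
    print("rank mismatch:", err)
```

The point of the sketch is only that a single 2D element in the stream is enough to produce a shape-length mismatch of this kind, regardless of how many well-formed images precede it.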

zsdonghao commented 5 years ago
Epoch: [0/20] step: [187/2] time: 0.309s, mse: 0.003
(218, 178)  <== bug here, please check your data
(218, 178, 3)
yonghuixu commented 5 years ago

I wrote a separate file to test the dataset and found that every image has shape (218, 178, 3). There are no (218, 178) images.
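
A check of this kind is probably most informative when it runs on the same in-memory lists that the generator iterates over, since a separate script may load the files differently. The sketch below is one possible way to do that with NumPy. The function name find_non_rgb is hypothetical, and the sample list only stands in for train_lr_imgs or train_hr_imgs.

```python
import numpy as np


def find_non_rgb(images):
    """Return (index, shape) pairs for entries that are not H x W x 3 arrays."""
    bad = []
    for idx, img in enumerate(images):
        shape = np.asarray(img).shape
        if len(shape) != 3 or shape[-1] != 3:
            bad.append((idx, shape))
    return bad


# Stand-in data; in the training script this would be the loaded image lists.
sample = [
    np.zeros((218, 178, 3)),
    np.zeros((218, 178)),      # a 2D entry like the one seen in the log
    np.zeros((218, 178, 3)),
]
print(find_non_rgb(sample))
```

If the returned list is empty for both the lr and hr lists right before training starts, the 2D array would have to be introduced later in the pipeline. If it is not empty, the reported indices point at the entries to inspect.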


zsdonghao commented 5 years ago

If your images are 3D, the APIs would not return 2D images ...
If it happens, I can't help ...
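
If a 2D array does turn up in the image lists, one possible workaround is to normalize each image inside the generator before it is yielded. This is a minimal sketch under assumptions, not the repository's code: to_rgb and generator_train_checked are hypothetical names, and repeating a grayscale image across three channels is only one of several reasonable conversions.

```python
import numpy as np


def to_rgb(img):
    """Return an (H, W, 3) array; a 2D array is repeated across 3 channels."""
    img = np.asarray(img)
    if img.ndim == 2:
        # Assumption: a 2D entry is a grayscale image.
        img = np.stack([img, img, img], axis=-1)
    if img.ndim != 3 or img.shape[-1] != 3:
        raise ValueError("expected shape (H, W, 3), got %s" % (img.shape,))
    return img


def generator_train_checked(lr_imgs, hr_imgs):
    """Yield (lr, hr) pairs, each guaranteed to have three channels."""
    for imglr, imghr in zip(lr_imgs, hr_imgs):
        yield to_rgb(imglr), to_rgb(imghr)


# Stand-in lists; the second lr image is 2D on purpose.
lr_list = [np.zeros((218, 178, 3)), np.zeros((218, 178))]
hr_list = [np.zeros((218, 178, 3)), np.zeros((218, 178, 3))]
for lr, hr in generator_train_checked(lr_list, hr_list):
    print(lr.shape, hr.shape)
```

Assuming TensorFlow is available, a generator like this could be passed to tf.data.Dataset.from_generator through a zero-argument callable, for example lambda: generator_train_checked(train_lr_imgs, train_hr_imgs), keeping the output_types from the original call.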