Open yonghuixu opened 5 years ago
Epoch: [0/20] step: [187/2] time: 0.309s, mse: 0.003
(218, 178) <== bug here, please check your data
(218, 178, 3)
I wrote a separate script to test the dataset and found that every image has shape (218,178,3); there are no (218,178) images.
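A standalone check may read the files differently from the training script, so it can help to repeat the check on the arrays the training script actually holds in memory. The sketch below is a minimal example under that assumption: `find_bad_shapes` is a hypothetical helper, not part of TensorLayer or TensorFlow, and the image lists are assumed to be Python lists of NumPy arrays.

```python
import numpy as np


def find_bad_shapes(images, expected_ndim=3, expected_channels=3):
    """Return (index, shape) for every array that is not (H, W, expected_channels).

    Hypothetical helper for illustration only.
    """
    bad = []
    for idx, img in enumerate(images):
        shape = np.shape(img)
        # A grayscale image loaded as 2-D has len(shape) == 2 and is reported here.
        if len(shape) != expected_ndim or shape[-1] != expected_channels:
            bad.append((idx, shape))
    return bad
```

Calling it on the in-memory image lists right before the dataset is built would list the index and shape of any array that is not (H, W, 3).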
---Original message--- From: "Hao" Sent: Monday, July 29, 2019, 11:11 PM Subject: Re: [tensorlayer/srgan] Further investigation and findings on issues #164 and #165: the problem is caused by tf.data.Dataset.from_generator; is there an alternative? (#167)
If your images are 3D, the APIs would not return 2D images ...
If it happens, I can't help ...
Error:
Epoch: [1003/10] step: [1003/1] G_init time: 0.30186986923217773s, mse: 0.029376816004514694
2019-07-27 20:02:30.701958: W tensorflow/core/framework/op_kernel.cc:1431] OP_REQUIRES failed at iterator_ops.cc:988 : Invalid argument: Input shape axis 0 must equal 4, got shape [3] [[{{node crop_to_bounding_box/unstack}}]]
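With a very small dataset the code runs normally, but I need a somewhat larger dataset, and then the error above appears. The results described below come from a run with 9999 images in the dataset.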
Today, after working through it step by step, I found that the problem is not batch_size but tf.data.Dataset.from_generator(generator_train, output_types=(tf.float32,tf.float32)). My reasoning is as follows:
1. Below is my generator_train(). Working backwards, I printed imglr.shape inside generator_train(), as follows:
    def generator_train():
        for imglr,imghr in zip(train_lr_imgs, train_hr_imgs):
            print(imglr.shape)
            yield imglr,imghr
Epoch: [0/20] step: [202/2] time: 0.293165922164917s, mse: 0.015779396519064903
(218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3)
Epoch: [0/20] step: [203/2] time: 0.3003373146057129s, mse: 0.01940624974668026
(218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178) (218, 178, 3)
Epoch: [0/20] step: [204/2] time: 0.31006765365600586s, mse: 0.012748796492815018
(218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3)
Epoch: [0/20] step: [205/2] time: 0.2964756488800049s, mse: 0.01310880295932293
(218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3)
2019-07-29 20:02:18.539915: W tensorflow/core/framework/op_kernel.cc:1431] OP_REQUIRES failed at iterator_ops.cc:988 : Invalid argument: Input shape axis 0 must equal 4, got shape [3] [[{{node crop_to_bounding_box/unstack}}]]
It can be seen that at step==204 (the shapes are printed first, then Epoch: [0/20] step: [204/2]), one of the images has shape (218,178) rather than (218,178,3). (The message Input shape axis 0 must equal 4, got shape [3] appears because I used tf.image.crop_to_bounding_box() in _map_fn_train(imglr,imghr); switching to the tf.image.random_crop used in the original source gives the error Incompatible shapes: [2] vs. [3]. So this shape must be what triggers the error.) I therefore checked every image in the dataset, printing the third dimension, and found that it is 3 for all of them, so my dataset is not the problem either.
Then I wondered whether my zip was changing the shape of the images. Since I could not verify that directly, I ran the original train.py (it does not use zip; I only changed the lr and hr sizes), and got the following error:
Epoch: [0/20] step: [185/2] time: 0.293s, mse: 0.003
(218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3)
Epoch: [0/20] step: [186/2] time: 0.301s, mse: 0.003
(218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3)
Epoch: [0/20] step: [187/2] time: 0.309s, mse: 0.003
(218, 178) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3) (218, 178, 3)
Epoch: [0/20] step: [188/2] time: 0.306s, mse: 0.005
(218, 178, 3)
2019-07-29 20:23:08.229313: W tensorflow/core/framework/op_kernel.cc:1431] OP_REQUIRES failed at iterator_ops.cc:988 : Invalid argument: Incompatible shapes: [2] vs. [3] [[{{node random_crop/GreaterEqual}}]]
Traceback (most recent call last):
File "new_train.py", line 204, in
(218, 178, 3)
train()
(218, 178, 3)
File "new_train.py", line 94, in train
for step, (lr_patchs, hr_patchs) in enumerate(train_ds):
(218, 178, 3)
File "/home/xyh/anaconda3/envs/tf2/lib/python3.7/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 556, in next
return self.next()
File "/home/xyh/anaconda3/envs/tf2/lib/python3.7/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 585, in next
return self._next_internal()
File "/home/xyh/anaconda3/envs/tf2/lib/python3.7/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 577, in _next_internal
(218, 178, 3)
output_shapes=self._flat_output_shapes)
File "/home/xyh/anaconda3/envs/tf2/lib/python3.7/site-packages/tensorflow/python/ops/gen_dataset_ops.py", line 1954, in iterator_get_next_sync
_six.raise_from(_core._status_to_exception(e.code, message), None)
File "", line 3, in raise_from
(218, 178, 3)
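It can be seen that at step==188, one of the images has shape (218,178) rather than (218,178,3). The error message also shows Incompatible shapes: [2] vs. [3] (the same as above). Therefore, zip is not the cause of this error either.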
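Because the shape is printed inside the generator, the (218, 178) array seems to exist before tf.data.Dataset.from_generator ever receives it. A minimal sketch of a generator that tolerates such an array is given here, assuming that a 2-D array is a grayscale image that can simply be repeated across three channels. `ensure_three_channels` and `generator_train_checked` are hypothetical names, not functions from the repository.

```python
import numpy as np


def ensure_three_channels(img):
    """Return img as an (H, W, 3) array; a 2-D array is repeated across 3 channels."""
    img = np.asarray(img)
    if img.ndim == 2:
        # Assumption: a 2-D array is a grayscale image.
        return np.stack([img, img, img], axis=-1)
    if img.ndim == 3 and img.shape[-1] == 3:
        return img
    raise ValueError("unexpected image shape: {}".format(img.shape))


def generator_train_checked(lr_imgs, hr_imgs):
    """Yield (lr, hr) pairs, reporting and repairing any pair that is not 3-D."""
    for idx, (imglr, imghr) in enumerate(zip(lr_imgs, hr_imgs)):
        if np.ndim(imglr) != 3 or np.ndim(imghr) != 3:
            # Report the index so the offending source image can be inspected.
            print("non-3D image at index", idx, np.shape(imglr), np.shape(imghr))
        yield ensure_three_channels(imglr), ensure_three_channels(imghr)
```

The printed index points back to the entry in the image list that should be inspected, which may be more informative than the shape alone.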
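In summary, the error can only be caused by tf.data.Dataset.from_generator(). (My guess is that once there are somewhat more images, the shape of some of them gets squeezed.) So I would like to ask the experts here: is there an alternative to tf.data.Dataset.from_generator()?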
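One possible alternative, sketched here under assumptions, is to build the dataset from file paths with tf.data.Dataset.from_tensor_slices and decode each file inside map with channels=3, which asks the decoder for three channels even when a file is stored as grayscale. The helper names `load_rgb` and `make_path_dataset` are hypothetical, the files are assumed to be JPEGs of identical size, and the scaling to [-1, 1] is an illustrative choice rather than the repository's preprocessing.

```python
import tensorflow as tf


def load_rgb(path):
    """Read one JPEG file and return a float32 tensor of shape (H, W, 3)."""
    data = tf.io.read_file(path)
    # channels=3 requests RGB output even if the file is stored as grayscale.
    img = tf.image.decode_jpeg(data, channels=3)
    # Illustrative scaling from [0, 255] to [-1, 1].
    return tf.cast(img, tf.float32) / 127.5 - 1.0


def make_path_dataset(paths, batch_size):
    """Build a batched dataset from a list of image file paths (hypothetical helper)."""
    ds = tf.data.Dataset.from_tensor_slices(list(paths))
    ds = ds.map(load_rgb)
    # Batching assumes every image has the same height and width.
    return ds.batch(batch_size)
```

With this layout the images are decoded inside the tf.data pipeline, so every element reaches the crop step with three channels regardless of how the file was saved.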