phexic closed this issue 4 years ago
Hi @phexic,
Based on the logs, it seems that your model is diverging, most likely because of the very small batch size. You can try to increase it by reducing the size of the training patches.
Thank you for your prompt reply! I tried a batch size of 24 using 2 V100s (32 GB each); however, the same problem occurred: the MSE (0.3079) did not change and the PSNR remained at 6.6111. Also, during training epoch 1, the MSE suddenly increases and the training output suddenly turns completely black (from around batch index 700+). By printing the hidden-layer gradients, it seems the gradients explode (NaN). I have also checked my custom training data: the MSE of every training pair is below 0.07, which is within the normal range. What do you think might be the cause? By the way, regarding line 58 in load_data.py:
dslr_image = np.float32(misc.imresize(dslr_image, self.scale / 2.0)) / 255.0
Is the `self.scale / 2.0` factor really necessary? The network in the paper "Replacing Mobile Camera ISP with a Single Deep Learning Model" has only 4 downsampling layers, which differs from "Bokeh..".
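As a side note on that line: `scipy.misc.imresize` was removed in SciPy 1.3, so anyone reproducing this repo on a recent SciPy needs a replacement (SciPy's docs suggest Pillow's `Image.resize`). Below is a minimal, hypothetical nearest-neighbor stand-in covering only the fractional-scale call pattern used here; note the original `imresize` also converted its result to uint8 in 0-255, which the surrounding `/ 255.0` relies on, and this sketch does not:

```python
import numpy as np

def imresize_nn(img, scale):
    # Hypothetical nearest-neighbor stand-in for the removed scipy.misc.imresize.
    # Unlike the original, it keeps the input dtype and intensity range,
    # so the `/ 255.0` normalization in load_data.py must be adjusted accordingly.
    h, w = img.shape[:2]
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    # Map each output pixel back to its nearest source pixel.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return img[rows][:, cols]
```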
Hi @phexic, did you fix this problem? I got similar results:
Epoch 0, mse: 0.2329, psnr: 6.3279, ms-ssim: 0.5204
Epoch 1, mse: 0.0000, psnr: 54.1944, ms-ssim: 0.9943
Epoch 2, mse: 0.0000, psnr: 54.1944, ms-ssim: 0.9943
Epoch 3, mse: 0.0000, psnr: 54.1944, ms-ssim: 0.9943
Epoch 4, mse: 0.0000, psnr: 54.1944, ms-ssim: 0.9943
Epoch 5, mse: 0.0000, psnr: 54.1944, ms-ssim: 0.9943
Epoch 6, mse: 0.0000, psnr: 54.1944, ms-ssim: 0.9943
I ran into the same problem: the PSNR and MSE did not change during level 3's training, and the visual results are all black. How did this happen?
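For the exploding/NaN gradients described above, a common mitigation (not something this repo ships) is to check gradients for non-finite values every step and clip them by global L2 norm before the optimizer update. A minimal NumPy sketch, where the function name and threshold are hypothetical and `grads` stands in for the per-parameter gradient tensors of whatever framework is in use:

```python
import numpy as np

def check_and_clip(grads, max_norm=1.0):
    # Abort the step on non-finite gradients (the point where training
    # would otherwise start producing black outputs), then clip by the
    # global L2 norm so a single bad batch cannot blow up the weights.
    flat = np.concatenate([g.ravel() for g in grads])
    if not np.isfinite(flat).all():
        raise FloatingPointError(
            "non-finite gradient: lower the learning rate or inspect the batch")
    total = float(np.linalg.norm(flat))
    if total > max_norm:
        grads = [g * (max_norm / total) for g in grads]
    return grads
```

In PyTorch the same effect is obtained with `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)` called between `loss.backward()` and `optimizer.step()`; lowering the learning rate for the first epochs is another standard remedy.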