zhenngbolun / Learnbale_Bandpass_Filter

Image Demoireing with Learnable Bandpass Filters. (CVPR, 2020)(Keras+TensorFlow)

What's the shape of IFS? #9

Open XinYu-Andy opened 3 years ago

XinYu-Andy commented 3 years ago

Hi, I am confused about the shape of the IFS. In the paper, you say that you use block-IDCT and that the block shape is 8×8. Does that mean you divide the 64×64 feature into 64 patches of 8×8 and apply the IDCT to each patch? And should the output then also be 64×64?

zhenngbolun commented 3 years ago

I think the DDCN (ECCV'16) paper might give a more intuitive explanation of introducing block-IDCT in a convolution layer.
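
To make the shapes in the question concrete, here is a minimal NumPy sketch of the block-wise reading: a 64×64 map is split into 64 non-overlapping 8×8 blocks, each block goes through a 2-D inverse DCT, and the result is again 64×64. The thread does not confirm that this matches the layer used in the paper. The function names `dct_matrix` and `block_idct` and the orthonormal DCT-II normalization are assumptions made for illustration.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that coeffs = C @ x @ C.T."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale
    return c

def block_idct(coeffs, block=8):
    """Apply a 2-D inverse DCT to each non-overlapping block of a 2-D map.

    Assumes both sides of `coeffs` are multiples of `block`.
    """
    h, w = coeffs.shape
    c = dct_matrix(block)
    # Reshape to (rows of blocks, cols of blocks, block, block).
    blocks = coeffs.reshape(h // block, block, w // block, block)
    blocks = blocks.transpose(0, 2, 1, 3)
    # Inverse of B = C @ x @ C.T is x = C.T @ B @ C (C is orthonormal).
    spatial = c.T @ blocks @ c
    # Put the blocks back in their original positions.
    return spatial.transpose(0, 2, 1, 3).reshape(h, w)
```

Under this reading the output keeps the input's spatial size, so `block_idct(np.zeros((64, 64))).shape` is `(64, 64)`.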

XinYu-Andy commented 3 years ago

Thank you very much! But I still have some questions about the implementation.

  1. You mentioned that you adopted a multi-scale strategy to design the loss. Which method did you use to downsample the ground truth?
  2. How many epochs did you train for in total? I have trained for 700 epochs so far, but I still get a low test PSNR (about 28-31) on the validation data of AIM19.
  3. You mentioned that you adopted an advanced Sobel loss (ASL) that covers four directions. I am not sure I understand it. If we denote the four filtered images by G_x, G_y, G_z, G_w, does that mean Sobel(Z) = \sqrt{G_x^2+G_y^2+G_z^2+G_w^2}, or something like Sobel(Z) = 0.5*\sqrt{G_x^2+G_y^2+G_z^2+G_w^2}?

Thank you very much!

zhenngbolun commented 3 years ago
  1. Bicubic downsampling is used to generate the corresponding GT.
  2. I suggest you follow the training settings provided in our paper. The whole training needs roughly 600K iterations.
  3. Well, I just checked Eq. 12 in the paper. It's actually a little misleading. The ASL is actually MAE(G_x(input)-G_x(gt))+MAE(G_y(input)-G_y(gt))+MAE(G_w(input)-G_w(gt))+MAE(G_z(input)-G_z(gt)). I suggest using tf.image.sobel_edges to implement this operation.
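
The formula in item 3 can be sketched in plain NumPy as a sum of four mean-absolute-error terms, one per filtered direction. This is only an illustration under stated assumptions, not the repository's code: `tf.image.sobel_edges` returns the vertical and horizontal responses only, so the two diagonal kernels below (`SOBEL_Z`, `SOBEL_W`) are an assumed choice, and the helper names `filter2d` and `asl` are hypothetical.

```python
import numpy as np

# Standard horizontal and vertical Sobel kernels.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T
# Diagonal kernels: an ASSUMED choice for the two extra directions.
SOBEL_Z = np.array([[0, 1, 2], [-1, 0, 1], [-2, -1, 0]], dtype=np.float64)
SOBEL_W = np.array([[-2, -1, 0], [-1, 0, 1], [0, 1, 2]], dtype=np.float64)

def filter2d(img, kernel):
    """Cross-correlate a 2-D image with a 3x3 kernel (valid region only)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float64)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * img[i:i + h - 2, j:j + w - 2]
    return out

def asl(pred, gt):
    """Sum over four directions of MAE(G_d(pred) - G_d(gt))."""
    total = 0.0
    for kernel in (SOBEL_X, SOBEL_Y, SOBEL_Z, SOBEL_W):
        diff = filter2d(pred, kernel) - filter2d(gt, kernel)
        total += np.mean(np.abs(diff))
    return total
```

Because every kernel sums to zero, a constant brightness offset between `pred` and `gt` contributes nothing to this loss; only differences in edge structure do.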
XinYu-Andy commented 3 years ago

Thank you very much for your reply!

XinYu-Andy commented 3 years ago

Hi, I am confused about the training settings. How do you define "converged" when you say: "When the 128 × 128 patch trained model converged, we re-grouped the training data into 256 × 256 patches for fine-tuning the model. This time, the learning rate was set to 10^-5, the batch size was set to 4"?