google / sky-optimization

Repository and website for Sky Optimization: Semantically aware image processing of skies in low-light photography
https://google.github.io/sky-optimization/
Apache License 2.0

Detail about the smooth upsample #7

Open Ianresearch opened 4 years ago

Ianresearch commented 4 years ago

Hello, we implemented the modified guided filter following the pseudocode. We do the smooth upsampling as follows:

  1. bilinear upsampling by s=4;
  2. smooth the upsampled result;
  3. bilinear upsampling by s=4;
  4. smooth the upsampled result;
  5. bilinear upsampling by s=4.

Is this the right procedure for the smooth upsampling? If so, how should we choose the kernel for the smoothing filter? Another question: how do we use the weighted downsample at inference time? Since the confidence map is low resolution, how do we use function (4), modified_guided_filter, in which C is high resolution? Thank you.
orlyliba commented 4 years ago

Not exactly. We don't smooth after the upsampling.

Smooth upsampling is achieved by applying tent-shaped convolution kernels consecutively (in our case, in 3 steps), which effectively changes the shape of the interpolation kernel to be smoother, and significantly reduces upsampling artifacts. For example, for a downsampling factor s = 64, instead of upsampling with a single tent kernel with a support of 64 × 64, we use tent kernels with a support of 4 × 4 three times, one after the other. We chose this method of linear upsampling rather than a more advanced method owing to its separability and efficient implementation.
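
For concreteness, here is a minimal sketch of that idea (an illustration only; the actual implementation is in Halide, and details such as kernel alignment may differ): three consecutive x4 bilinear upsamplings in place of a single x64 upsampling, using OpenCV's INTER_LINEAR as the tent interpolator.

```python
# Minimal sketch: three consecutive x4 bilinear (tent) upsamplings instead of
# a single x64 upsampling. Illustration only; not the Halide implementation.
import cv2
import numpy as np

def smooth_upsample(img, total_scale=64, step=4):
    """Upsample `img` by `total_scale` using repeated bilinear x`step` steps."""
    out = img.astype(np.float32)
    scale = total_scale
    while scale > 1:
        s = min(step, scale)
        out = cv2.resize(out, None, fx=s, fy=s, interpolation=cv2.INTER_LINEAR)
        scale //= s
    return out

# Example: a low-resolution map upsampled 64x (8x8 -> 512x512).
low_res = np.random.rand(8, 8).astype(np.float32)
high_res = smooth_upsample(low_res, total_scale=64, step=4)
```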

The confidence is at low resolution and is applied at low resolution. The guided filter is solved at low resolution, and then upsampled.
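
For illustration, a rough sketch of that flow, assuming a generic confidence-weighted guided filter (placeholder names such as box_filter and eps; this is not the repository's code and may differ from the paper's modified_guided_filter in equation (4)): the affine coefficients are solved at low resolution with the low-resolution confidence as weights, then smooth-upsampled and applied to the full-resolution guide.

```python
# Rough sketch (assumption: a generic confidence-weighted guided filter, not
# necessarily identical to the paper's equation (4)). Everything except the
# final apply step runs at low resolution; only a and b are upsampled.
import cv2
import numpy as np

def box_filter(x, r=4):
    return cv2.blur(x, (2 * r + 1, 2 * r + 1))

def solve_low_res(guide_lr, target_lr, conf_lr, eps=1e-4):
    """Solve per-pixel affine coefficients a, b at low resolution, weighting
    all local statistics by the low-resolution confidence."""
    w = box_filter(conf_lr) + 1e-8
    mean_I = box_filter(conf_lr * guide_lr) / w
    mean_p = box_filter(conf_lr * target_lr) / w
    corr_II = box_filter(conf_lr * guide_lr * guide_lr) / w
    corr_Ip = box_filter(conf_lr * guide_lr * target_lr) / w
    var_I = corr_II - mean_I * mean_I
    cov_Ip = corr_Ip - mean_I * mean_p
    a = cov_Ip / (var_I + eps)
    b = mean_p - a * mean_I
    return box_filter(a), box_filter(b)

def apply_full_res(a_lr, b_lr, guide_full, smooth_upsample):
    """Smooth-upsample a and b (e.g., with consecutive bilinear steps), then
    apply them to the full-resolution guide."""
    a = smooth_upsample(a_lr)
    b = smooth_upsample(b_lr)
    return a * guide_full + b
```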

Ianresearch commented 4 years ago

Thank you for the quick reply. However, we can't reproduce the effect shown in the paper.

  1. What is the tent kernel used during the upsampling? Is it just bilinear upsampling (cv2.resize(interpolation=cv2.INTER_LINEAR)), or convolution with a tent-shaped kernel such as [1, 2, 3, 2, 1], [2, 4, 6, 4, 2], [3, 6, 9, 6, 3], [2, 4, 6, 4, 2], [1, 2, 3, 2, 1]? What is the meaning of a support of 4×4?
  2. In our result, we can't recover the detail in tree branches, and some areas are reversed: regions that should be black in the mask (such as tree branches or other background) turn white. What do you suggest?
  3. When creating the ADE20K+DE+GF dataset with s = 16, how should we choose the three-step magnification?

marmarmarmar commented 4 years ago

I would like to follow up on this question:

  1. If you apply 3 filters with a support of 4x4 consecutively, you should get a filter with a support of 10x10. Do you use intermediate upsampling in between the convolutions?
  2. Is the tent filter the same as the one @Ianresearch mentioned?
orlyliba commented 4 years ago

Apologies: I missed the message from @Ianresearch.

We upsample with a bilinear (tent) kernel sequentially (3 times). We exploit the separability of bilinear upsampling and apply it in one dimension at a time. The upsampling is implemented as bilinear interpolation, not just convolution with a kernel. Upsampling by a scale of 4 effectively looks at only 2 pixel values in each dimension and interpolates between them (the kernel @Ianresearch suggested is not what we use, and looking back, what I wrote before was indeed confusing).
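
As a hypothetical sketch of that description (assuming half-pixel sample centers; the actual alignment convention isn't stated here): a x4 upsample applied separably, one dimension at a time, with each output sample interpolating between just 2 input samples.

```python
# Separable bilinear upsampling sketch (illustration of the description above,
# not the actual Halide code). Assumes half-pixel sample centers, as in
# cv2.resize; the real implementation's alignment may differ.
import numpy as np

def upsample_1d_x4(x):
    """Linear (tent) x4 upsampling of a 1D signal; each output sample is a
    weighted average of only 2 neighboring input samples."""
    n = len(x)
    pos = (np.arange(4 * n) + 0.5) / 4.0 - 0.5   # output centers in input coords
    left = np.clip(np.floor(pos).astype(int), 0, n - 1)
    right = np.clip(left + 1, 0, n - 1)
    frac = np.clip(pos - left, 0.0, 1.0)
    return (1.0 - frac) * x[left] + frac * x[right]

def upsample_2d_x4(img):
    """Apply the 1D upsample separably: along rows, then along columns."""
    rows_up = np.apply_along_axis(upsample_1d_x4, 1, img)
    return np.apply_along_axis(upsample_1d_x4, 0, rows_up)
```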

I hope this is clearer now. If you have other questions, please start them in separate threads. I didn't reply to the older questions (2 and 3 from @Ianresearch) because I need more details.

MateuszOlko commented 4 years ago

I am also struggling to reproduce the results. I have similar problems to those @Ianresearch described in question 2. @orlyliba, could you please provide some answers to my questions?

  1. Could you explain the difference between "upsampling with a bilinear (tent) kernel" and regular bilinear interpolation (as in, for example, cv2.resize)?
  2. In the paper (and above) you mention interpolation kernels of a certain support. What are these kernels? How does the kernel support size affect the bilinear interpolation result?

    Upsampling by a scale of 4 is effectively only looking at 2 pixel values in each dimension and interpolating between them.

  3. How are the kernels related to this procedure?
  4. Could you please be more specific about the procedure? How does it imply a scale of 4? One can interpolate any number of pixels between the chosen 2.
orlyliba commented 4 years ago

  1. Could you explain the difference between "upsampling with a bilinear (tent) kernel" and regular bilinear interpolation (as in, for example, cv2.resize)?

I think they are the same. Our implementation is in Halide, not OpenCV, so we implemented the upsampling ourselves.

  2. In the paper (and above) you mention interpolation kernels of a certain support. What are these kernels? How does the kernel support size affect the bilinear interpolation result?

Upsampling by a scale of 4 is effectively only looking at 2 pixel values in each dimension and interpolating between them.

  3. How are the kernels related to this procedure?

(answer to both 2 and 3) You can ignore the kernel interpretation if you'd like, and think of this as concatenated bilinear upsampling. Let's look at a 1D example. The first step of upsampling uses only 2 pixels in a linearly weighted average to create each pixel in the upsampled image. In the next upsampling stage, each new pixel is also created by linearly averaging 2 pixels, but now the weights are no longer linear (= tent) with respect to the original image pixels, which gives a smoother result. This process of consecutive bilinear upsampling is equivalent to upsampling by interpolation with a single kernel, and you can do the math and find the kernel coefficients (those are the weights in the weighted average).
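
A quick numerical check of that point (an illustration assuming half-pixel sample centers, not code from the repository): feed an impulse through one x16 linear upsample versus two consecutive x4 upsamples, and compare the effective kernels.

```python
# Impulse-response check (illustration only, assuming half-pixel sample
# centers): two consecutive x4 linear upsamplings act like a single, smoother
# interpolation kernel than one x16 linear upsampling.
import numpy as np

def up_linear(x, s):
    """1D linear (tent) upsampling by factor s, half-pixel sample centers."""
    n = len(x)
    pos = (np.arange(n * s) + 0.5) / s - 0.5
    left = np.clip(np.floor(pos).astype(int), 0, n - 1)
    right = np.clip(left + 1, 0, n - 1)
    frac = np.clip(pos - left, 0.0, 1.0)
    return (1.0 - frac) * x[left] + frac * x[right]

impulse = np.zeros(9)
impulse[4] = 1.0

single = up_linear(impulse, 16)               # one tent kernel
double = up_linear(up_linear(impulse, 4), 4)  # two x4 steps composed

# The composed kernel has a lower, rounded peak and slightly wider support.
print(np.count_nonzero(single), round(single.max(), 3))  # 32, 0.969
print(np.count_nonzero(double), round(double.max(), 3))  # 36, 0.875
```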

  4. Could you please be more specific about the procedure? How does it imply a scale of 4? One can interpolate any number of pixels between the chosen 2.

The procedure is consecutive bilinear upsampling. Instead of bilinear upsampling by a scale of x64, we upsample x4 three times and get to the same final resolution.

I hope this is clearer now.

orlyliba commented 3 years ago

Hi! Please see this code submission for the guided filter: https://github.com/google/sky-optimization/pull/14/commits/8a35938afe8dc0da931d960bfe05d0c99a9b40e3