Closed. XiaoqiangZhou closed this issue 4 years ago.
The sampling kernel has no learnable parameters; it is a Gaussian kernel with a fixed variance and kernel size. We tried training the variance of the Gaussian kernel as a learnable parameter, but we found that training became unstable. We use the sampling correctness loss to train our model. This loss determines whether the currently sampled regions are "good" choices, which helps the flow fields find the correct sampling regions.
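To make the "fixed variance and kernel size" point concrete, here is a minimal sketch of such a non-learnable Gaussian weighting. The window `size` and `sigma` values are illustrative placeholders, not the exact hyper-parameters used in the paper:

```python
import numpy as np

def gaussian_kernel(size=3, sigma=1.0):
    # Fixed-variance Gaussian weights over a size x size window.
    # Nothing here is learned: size and sigma are constants.
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return k / k.sum()  # normalize so the weights sum to 1

w = gaussian_kernel(3, 1.0)
patch = np.arange(9, dtype=float).reshape(3, 3)
# Sampled value = weighted sum of the neighborhood around the sampling center.
value = (w * patch).sum()
```

The only trainable quantity is therefore *where* this window is placed (the flow offsets), not the weights inside it.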
Because the smoothing method is MATLAB code, calling it from Python every time is cumbersome. Therefore, for computational efficiency, our implementation directly uses `input_corrupted_structure_image = mask * (smooth(gt_image))` to calculate the input structures. We will try to port this code to Python once we finish some urgent tasks at hand.
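The two operation orders are not equivalent, which can be seen with a small numeric sketch. The `smooth` below is a hypothetical stand-in (a simple box blur) for the MATLAB smoothing code, used only to show where the outputs diverge:

```python
import numpy as np

def smooth(img, k=1):
    # Stand-in for the MATLAB structure-smoothing code: a box blur
    # over a (2k+1) x (2k+1) window with edge padding.
    out = np.empty_like(img, dtype=float)
    pad = np.pad(img.astype(float), k, mode='edge')
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = pad[i:i + 2 * k + 1, j:j + 2 * k + 1].mean()
    return out

gt = np.arange(16, dtype=float).reshape(4, 4)
mask = np.ones((4, 4))
mask[:, 2:] = 0.0  # right half of the image is "corrupted"

a = mask * smooth(gt)   # the implementation's order
b = smooth(mask * gt)   # the order the question proposes
```

Near the mask boundary the results differ: in `a`, valid pixels keep structure values smoothed from the full ground truth, while in `b` the zeroed-out region bleeds into them during smoothing.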
@RenYurui Thanks for your quick and detailed reply!
Your explanation answers my second question.
But I'm still a little confused by "It helps the flow fields find the correct sampling regions". Isn't the sampling region fixed for a given sampling center? In my opinion, the correctness loss guides the generated feature to resemble the gt_image's feature semantically. But I don't understand how this loss changes the preference of the Gaussian sampling regions, which is determined by hyper-parameters, i.e., delta_h, delta_v, and sigma.
By the way, what does "The sampling process calculates the gradients according to the input pixels (features)" mean in your paper? I don't follow the formulation of the gradient computation in the sampling process.
Best regards.
delta_h and delta_v are the offsets of a specific point: (delta_h, delta_v) is one point of the flow field. When (delta_h, delta_v) changes, the sampling region changes. Our sampling correctness loss helps the network find the correct sampling regions (i.e., obtain a reasonable delta_h and delta_v for each point).
The gradient of the sampling operation can be derived directly from the forward process. For example, bilinear sampling takes 4 local pixels as inputs: output = h*x1 + v*x2 + (1-h)*x3 + (1-v)*x4. The gradient with respect to h is x1 - x3, which is zero when x1 = x3; likewise, the gradient with respect to v is x2 - x4, which is zero when x2 = x4. Unfortunately, the pixel values within a local patch are often similar, so these gradients vanish. Therefore, we use Gaussian sampling to extend the receptive field of the sampling operation.
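The vanishing-gradient argument above can be checked numerically. This is a sketch of the simplified bilinear form quoted in the comment, not the full sampling implementation from the repository:

```python
# Forward pass of the simplified bilinear form:
#   output = h*x1 + v*x2 + (1-h)*x3 + (1-v)*x4
def bilinear(h, v, x1, x2, x3, x4):
    return h * x1 + v * x2 + (1 - h) * x3 + (1 - v) * x4

def offset_grads(x1, x2, x3, x4):
    # Analytic gradients of the output w.r.t. the offsets h and v.
    return x1 - x3, x2 - x4

# A locally flat patch in the horizontal direction: x1 == x3.
dh, dv = offset_grads(x1=0.5, x2=0.9, x3=0.5, x4=0.2)
# dh is exactly zero: the horizontal offset receives no gradient signal,
# which is why a wider Gaussian sampling window is used instead.
```

With more distant pixels entering the weighted sum, equal nearest neighbors no longer force the offset gradient to zero.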
@RenYurui Got it! Thanks~
Dear researcher, I have a few questions about the appearance flow mentioned in the paper. Could you please help me?
I understand equation (8) as a weighted sum of neighbor features, where the weight is correlated with the spatial distance to the sampling center. So, are there any learnable parameters in this "weight map"? If not, how does this method achieve an attention-like effect in your visualization of the appearance flow?
In your implementation, the corrupted image structure is `input_corrupted_structure_image = mask * (smooth(gt_image))`. Shouldn't it be `input_corrupted_structure_image = smooth(mask * gt_image)`? Thanks!