microsoft / MaskFlownet

[CVPR 2020, Oral] MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask
https://arxiv.org/abs/2003.10955
MIT License

Question regarding Occlusion-Aware Pyramid #41

Open BailiangJ opened 1 year ago

BailiangJ commented 1 year ago

Hi,

I have a question regarding the Occlusion-Aware Pyramid.

The paper states:

[image: excerpt from the paper]

In the code, the corresponding part is:

mask0 = Upsample(4)(mask2)  
mask0 = F.sigmoid(mask0) - 0.5  
c30 = c10  
c40 = self.warp(c20, Upsample(4)(flow2)*self.scale)  
# concat image 1 with zero mask
c30 = F.concat(c30, F.zeros_like(mask0), dim=1)  
# concat warped image 2 with occlusion mask
c40 = F.concat(c40, mask0, dim=1)  

As I understand it, the occlusion mask is a probability map in which 1 stands for occlusion and 0 for non-occlusion. After subtracting 0.5, its range becomes [-0.5, 0.5], so a value of 0 would mean "don't know whether there is occlusion or not".
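To make that range concrete, here is a minimal sketch in plain Python. It is not the repository's MXNet code; `centered_mask` is a hypothetical helper that only mirrors the `F.sigmoid(mask0) - 0.5` step, and the logit values are made up for illustration.

```python
import math


def centered_mask(logit):
    """Sigmoid followed by a 0.5 shift, mirroring `F.sigmoid(mask0) - 0.5`."""
    return 1.0 / (1.0 + math.exp(-logit)) - 0.5


# Illustrative logits only: strongly negative, neutral, strongly positive.
for logit in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"logit={logit:+.1f} -> centered mask={centered_mask(logit):+.4f}")
```

A logit of 0 (sigmoid output 0.5) maps to exactly 0, while large negative and positive logits approach -0.5 and 0.5 respectively.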

My question is: why is image 1 (I1) concatenated with a zero mask rather than a -0.5 mask, or the same occlusion mask as image 2 (I2)? Since the subsequent conv layers are shared between c30 and c40, shouldn't the concatenated mask channel have the same meaning for both I1 and I2?
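To illustrate the three options being compared, the sketch below applies one shared set of weights at a single pixel, standing in for a shared 1x1 convolution. Everything here is an assumption made for illustration: `shared_pointwise` and `with_mask` are hypothetical helpers, and the weights and pixel values are invented, not taken from the MaskFlownet code.

```python
def shared_pointwise(channels, weights, bias=0.0):
    """Weighted sum over channels at one pixel (a stand-in for a 1x1 conv).

    The same weights are used for every input, as with shared conv layers.
    """
    return sum(w * c for w, c in zip(weights, channels)) + bias


def with_mask(pixel, mask_value):
    """Append a mask value as an extra channel (concatenation along channels)."""
    return pixel + [mask_value]


# Hypothetical values: 3 image channels plus 1 mask channel.
weights = [0.2, -0.1, 0.4, 0.8]  # the last weight multiplies the mask channel
image1_pixel = [0.5, 0.5, 0.5]
example_mask_value = 0.3  # an example sigmoid(x) - 0.5 value

out_zero = shared_pointwise(with_mask(image1_pixel, 0.0), weights)
out_neg_half = shared_pointwise(with_mask(image1_pixel, -0.5), weights)
out_same = shared_pointwise(with_mask(image1_pixel, example_mask_value), weights)

print(out_zero, out_neg_half, out_same)
```

With a zero mask, the mask channel contributes nothing to the shared layer's output for I1; with -0.5 or a copied mask, it adds a term proportional to the mask weight. Which of these is the intended behaviour is exactly what I am asking about.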

Thanks a lot!