amazon-science / self-supervised-amodal-video-object-segmentation


Input to the self.mask_decoder #9

Open tanghaotommy opened 1 year ago

tanghaotommy commented 1 year ago

Dear authors,

Thanks for sharing the code for this great work! I have a question about the input to the mask_decoder in the code:

According to the paper and the comments in the code, the input to the mask_decoder should be the embedding and the visible mask. However, here the input to the mask_decoder is the embedding and the 5th channel of obj_patches_foward, which appears to be the optical flow x component, not the visible mask (from the dataset here). The visible mask should be obj_patches_foward[..., 3], since obj_patches_foward[..., [0, 1, 2]] is the RGB image. I am wondering whether this makes a difference, or did I miss anything?
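For concreteness, here is a minimal sketch of the channel layout I am assuming when I index obj_patches_foward; the shapes and the exact ordering are my reading of the dataset code, not something confirmed by the authors:

```python
import torch

# Assumed layout (not confirmed): obj_patches_foward has shape (B, T, H, W, C)
# with C = 5 channels ordered as [R, G, B, visible mask, optical flow x].
B, T, H, W, C = 2, 4, 64, 64, 5
obj_patches_foward = torch.rand(B, T, H, W, C)  # random stand-in data

rgb          = obj_patches_foward[..., :3]  # channels 0-2: RGB image
visible_mask = obj_patches_foward[..., 3]   # channel 3: visible (modal) mask
flow_x       = obj_patches_foward[..., 4]   # channel 4: optical flow, x component

# The question is whether the mask_decoder should receive `visible_mask`
# (channel 3) together with the embedding, rather than `flow_x` (channel 4).
```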

I attach plots of two channels of obj_patches_foward.

obj_patches_foward[0, 0, ..., 3]:

[image: obj_patches_foward_3]

obj_patches_foward[0, 0, ..., 4]:

[image: obj_patches_foward_4]
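The plots were generated along these lines (a hypothetical snippet with random stand-in data; in practice the tensor comes from the dataloader):

```python
import torch
import matplotlib.pyplot as plt

# Stand-in tensor with the assumed (B, T, H, W, C=5) layout described above.
obj_patches_foward = torch.rand(2, 4, 64, 64, 5)

fig, axes = plt.subplots(1, 2, figsize=(8, 4))
axes[0].imshow(obj_patches_foward[0, 0, ..., 3].numpy(), cmap="gray")
axes[0].set_title("channel 3 (visible mask?)")
axes[1].imshow(obj_patches_foward[0, 0, ..., 4].numpy(), cmap="gray")
axes[1].set_title("channel 4 (flow x?)")
plt.tight_layout()
plt.show()
```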

Thank you in advance and looking forward to your reply!

nigelyaoj commented 1 year ago

Good catch! We had actually noticed this before, and in our experiments we found that the results from using the visible mask and from using the optical flow x channel are similar.