bo-miao / MAMP

[ICME 2022] Self-Supervised Video Object Segmentation by Motion-Aware Mask Propagation.
BSD 3-Clause "New" or "Revised" License

about the pair-wise training performance #4

Closed qianduoduolr closed 2 years ago

qianduoduolr commented 2 years ago

Hi, thanks for your great work and code. It seems the code is based on MAST. However, I found that your implementation gets a much better result with pair-wise training. In Table 5, you get 66.7 without your matching and alignment module, which is better than the 59 reported in MAST. Are there some training strategies or something special in your implementation, in either the training or test stage, that I missed?

bo-miao commented 2 years ago

Hi,

Yes, the code is based on MAST. Please refer to the up-to-date version of our paper, since I cannot find the Table 5 you mentioned: https://arxiv.org/pdf/2107.12569.pdf.

I would guess what you want to know is the code below:

```python
corr = torch.cat(corrs, 1) / torch.sqrt(torch.tensor(c).float())
```

We did not adopt any special training strategies in MAMP. During training, we noticed that MAST cannot converge well without rescaling the affinity (at least in newer versions of PyTorch), so I rescale the affinity to solve this problem, which yields that performance.
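To illustrate why dividing the affinity by the square root of the channel dimension helps, here is a minimal NumPy sketch (variable names like `ref`, `qry`, and `c` are illustrative, not from the MAMP code). With many channels, raw dot-product correlations have large magnitude, so the softmax over reference pixels saturates; rescaling by `1/sqrt(c)` keeps the logits in a range where the softmax stays smooth and gradients remain usable:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# toy setup: c feature channels, n reference pixels, m query pixels
rng = np.random.default_rng(0)
c, n, m = 64, 5, 3
ref = rng.standard_normal((c, n))  # reference-frame features
qry = rng.standard_normal((c, m))  # query-frame features

corr = qry.T @ ref                 # (m, n) raw dot-product affinity
scaled = corr / np.sqrt(c)         # rescaled affinity, as in the fix above

aff_raw = softmax(corr, axis=1)    # tends to saturate toward one-hot
aff_scaled = softmax(scaled, axis=1)  # flatter, better-behaved distribution
```

This is the same scaled dot-product trick used in attention mechanisms; the affinity is then used as soft weights to propagate reference-frame labels to query pixels.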

qianduoduolr commented 2 years ago

Nice, thanks for your reply. I will give it a try later. By the way, the performance mentioned above is 65.8 in your up-to-date version (in Table 3). Thanks again.

bo-miao commented 2 years ago

Yes, previously I used the evaluation method from the MAST repo but found that it was not consistent with the official evaluation tool, so I corrected this in the current version.

qianduoduolr commented 2 years ago

ok, nice job.