z-x-yang / CFBI

The official implementation of CFBI(+): Collaborative Video Object Segmentation by (Multi-scale) Foreground-Background Integration.
BSD 3-Clause "New" or "Revised" License
322 stars 43 forks

lower results of evaluation on youtube-vos and davis2017 #45

Closed zhouweii234 closed 3 years ago

zhouweii234 commented 3 years ago

I trained CFBI and evaluated it on YouTube-VOS and DAVIS 2017, but my results are lower than yours. My best result on YouTube-VOS is score 0.796, J_seen 0.789, J_unseen 0.740, F_seen 0.833, F_unseen 0.81 — about 0.03 lower than the numbers on GitHub. My best result on DAVIS 2017 is J&F-Mean 0.774, J-Mean 0.751, F-Mean 0.796 — about 0.04 lower than the numbers on GitHub.

I set batch_size to 4 when training on YouTube-VOS and global_chunks to 20 when evaluating it, and set batch_size to 2 when training on DAVIS 2017. I didn't download the datasets from the links in the README.md but downloaded them from the official websites. I set self.TRAIN_TBLOG to True in resnet101_cfbi_davis_finetune.py when training on DAVIS 2017, and made the code changes according to your answer in https://github.com/z-x-yang/CFBI/issues/44#issue-899027176.

Do you know why the results fall short? Did I do something wrong?

z-x-yang commented 3 years ago

You used only half of the default batch size. Did you adjust the number of training steps and the learning rate to match the batch size?

The number of training steps should be doubled and the learning rate halved, which should give results similar to ours.
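The adjustment follows the usual linear-scaling rule: scale the learning rate by the batch-size ratio and the step count by its inverse, so the total number of examples seen stays constant. A minimal sketch (the default values below are illustrative assumptions, not CFBI's exact config):

```python
# Linear-scaling adjustment when shrinking the batch size.
# default_bs=8, default_lr=0.01, default_steps=100000 are assumed
# placeholder values, not CFBI's actual configuration.
def scale_hyperparams(default_bs, new_bs, default_lr, default_steps):
    ratio = new_bs / default_bs
    # Halving the batch halves the LR and doubles the step count.
    return default_lr * ratio, int(default_steps / ratio)

lr, steps = scale_hyperparams(8, 4, 0.01, 100000)
# lr == 0.005, steps == 200000
```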

If you want to reproduce the exact results with only half of the default batch size, you can accumulate the gradients of two batches before each parameter update, which simulates one default iteration.
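The equivalence behind gradient accumulation is that averaging the gradients of two equal half-batches reproduces the gradient of one full batch, so each parameter update matches the default schedule. A toy numpy demonstration with a linear least-squares model (the data and shapes are illustrative, not CFBI's):

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(8, 3)), rng.normal(size=8)  # one "default" batch of 8
w = rng.normal(size=3)                              # model parameters

def grad(Xb, yb, w):
    # Gradient of the mean squared error over a (sub-)batch.
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

full = grad(X, y, w)                     # gradient of the full batch of 8
accum = 0.5 * (grad(X[:4], y[:4], w)     # accumulate two chunks of 4,
               + grad(X[4:], y[4:], w))  # then average before the update
assert np.allclose(full, accum)
```

In a PyTorch training loop the same effect comes from calling `backward()` on two consecutive mini-batches (with the loss scaled by 0.5) before a single `optimizer.step()` and `optimizer.zero_grad()`.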