Closed by sydney0zq 3 years ago
Hello,
We have contacted you via email and provided our recent training log. Our advice is as follows:
As for your question about our paper, we suggest you go to https://davischallenge.org/challenge2017/index.html and read the related papers. During training, our method is supervised by the mask annotations of all frames in all videos. The problem is called semi-supervised because, at inference time, only the annotation of the first frame is given. All papers in the semi-supervised VOS literature follow this setting.
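To make the distinction concrete, here is a minimal sketch of the setting described above. It is not the code from our repository: the function names and the `propagate` callback are hypothetical, and the loss is a plain per-pixel cross-entropy written in NumPy for illustration only.

```python
import numpy as np

def cross_entropy(probs, labels, eps=1e-8):
    """Mean per-pixel cross-entropy; probs is (H, W, C), labels is (H, W) ints."""
    h, w = labels.shape
    # Pick the predicted probability of the ground-truth class at every pixel.
    picked = probs[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    return float(-np.log(picked + eps).mean())

def training_loss(pred_probs, gt_masks):
    """Training: every frame has an annotated mask, so every frame is supervised."""
    return float(np.mean([cross_entropy(p, m) for p, m in zip(pred_probs, gt_masks)]))

def inference(frames, first_mask, propagate):
    """Inference: only the first-frame mask is given; the rest are predicted.

    `propagate` is a hypothetical callback (frame, masks_so_far) -> mask.
    """
    masks = [first_mask]
    for frame in frames[1:]:
        masks.append(propagate(frame, masks))
    return masks
```

In this sketch, "semi-supervised" describes only what `inference` receives (a single annotated frame), while `training_loss` uses the masks of all frames.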
Thank you for your interest.
Hi, thanks for releasing the code. I ran it end to end, but I cannot reproduce the results reported in the paper.
I use DAVIS 2017 as my training set and evaluate the checkpoints on the DAVIS 2017 validation set.
However, my results are:
G/J/F: 67.2/65.2/69.2
But in your paper:
G/J/F: 72.3/69.9/74.7
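For reference, the size of the gap can be checked with a short sketch. It assumes the usual DAVIS 2017 convention that G is the average of the mean J and the mean F. The helper name `global_score` is made up for illustration.

```python
def global_score(j_mean, f_mean):
    """DAVIS 2017 global score G: the average of mean J and mean F."""
    return (j_mean + f_mean) / 2.0

# Numbers copied from the two result lines above.
reproduced = {"J": 65.2, "F": 69.2}
paper = {"J": 69.9, "F": 74.7}

g_reproduced = global_score(reproduced["J"], reproduced["F"])
g_paper = global_score(paper["J"], paper["F"])
gap = g_paper - g_reproduced

print(f"G reproduced: {g_reproduced:.1f}, G paper: {g_paper:.1f}, gap: {gap:.1f}")
```

Both reported G values are consistent with this average, so the difference is about 5 points in G and is present in both J and F.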
I notice you use 4 GPUs with 16GB of memory each, whereas I only have 4 GPUs with 11GB of memory each. I don't think the hardware difference should cause such a significant gap. Could you please explain this? Several other users have reported the same problem in the issues below.
Thanks in advance.
What's more, your paper says it is a transductive method. However, your code is TOTALLY different from the equations in your paper, and the training is not semi-supervised: the masks are fully supervised with a cross-entropy loss.
Please explain this, as I think it is an essential problem in your paper.