Questions on reproducing the results

chrisjuniorli commented 3 years ago

Hi, thanks for sharing this amazing work on video matting!

I'm trying to reproduce the numbers in Table 1 of the paper and have a few questions:

1. In Table 1, are all the results obtained after training stages 1, 2, 3, and 4? I trained the model for stages 1, 2, and 3 and got 12.16 / 3.08 (MAD / MSE) on VM, while the paper reports 6.08 / 1.47 (MAD / MSE) on VM.
2. How important are the 8k image backgrounds for reproducing the numbers in Table 1? I used the 200 image backgrounds from the test set that you released to train stages 1, 2, and 3.
3. Overlap between the training and test sets of video backgrounds: in the paper, you mention that 3118 clips are selected for training (dvm_background_train_clips.txt has 3117 lines), but dvm_background_test_clips.txt contains some clips that overlap with the training set (such as 0245/0246). Does that mean we need to remove them manually during training? By the way, generate_videomatte_with_background_video.py also selects 0245/0246 when compositing the test set.

Could you elaborate on these points? Thanks.
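
For reference, the name overlap between the two clip lists can be checked with a short script. This is only a minimal sketch: it assumes each list file holds one clip name per line, and the helper names `read_clip_names` and `overlapping_clips` are made up for illustration (they are not part of the repository).

```python
from pathlib import Path


def read_clip_names(list_path):
    """Return the set of non-empty, whitespace-stripped lines in a clip list file."""
    text = Path(list_path).read_text(encoding="utf-8")
    return {line.strip() for line in text.splitlines() if line.strip()}


def overlapping_clips(train_list, test_list):
    """Return the sorted clip names that appear in both list files."""
    return sorted(read_clip_names(train_list) & read_clip_names(test_list))


if __name__ == "__main__":
    # Assumption: both list files sit in the current working directory.
    train = Path("dvm_background_train_clips.txt")
    test = Path("dvm_background_test_clips.txt")
    if train.exists() and test.exists():
        for name in overlapping_clips(train, test):
            print(name)
```

Note that this only compares names. A shared name does not by itself show that the underlying videos are the same.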

PeterL1n commented 3 years ago

1. Yes, the models are evaluated after training all 4 stages.
2. This could potentially be a problem. You want the backgrounds to be more diverse to avoid overfitting.
3. I don't think the background videos are the same even though their names are the same. They are different videos.

chrisjuniorli commented 3 years ago

Thanks for the quick response. I just checked the video background files and they are indeed different videos. I'll close this issue. Thanks!
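
One way to run this kind of check is to compare file hashes. The snippet below is a rough sketch rather than the exact procedure used here: `file_sha256` and `same_content` are hypothetical helpers, and it assumes each clip (or each frame of a clip) is available as a regular file on disk.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Return True if the two files are byte-for-byte identical."""
    return file_sha256(path_a) == file_sha256(path_b)
```

This only detects byte-identical files. Two re-encoded copies of the same video would still hash differently, so a mismatch is best confirmed by looking at a few frames.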

PeterL1n / RobustVideoMatting