Hi, thanks for sharing this amazing work on video matting!
I'm trying to reproduce the numbers in Table 1 of the paper and have a few questions:
In Table 1, were all the results obtained after training stages 1, 2, 3, and 4? I trained the model for stages 1, 2, and 3 only and got 12.16 / 3.08 (MAD / MSE) on VM, whereas the paper reports 6.08 / 1.47 (MAD / MSE) on VM.
How important are the 8k image backgrounds for reproducing the numbers in Table 1? For training stages 1, 2, and 3, I used only the 200 image backgrounds from the test set that you released.
There seems to be an overlap between the training and test sets of video backgrounds. The paper says 3118 clips are selected for training (although dvm_background_train_clips.txt has 3117 lines), yet dvm_background_test_clips.txt contains some clips that are also in the training set (such as 0245/0246). Does this mean we need to remove them manually during training? By the way, generate_videomatte_with_background_video.py also selects 0245/0246 when compositing the test set.
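In case it helps, here is a minimal sketch of how the overlap between the two lists could be checked. It assumes each file holds one clip ID per line, which is my guess at the format. The helper names read_clip_ids and find_overlap are my own and are not part of the released code.

```python
from pathlib import Path


def read_clip_ids(path):
    """Return the set of non-empty, whitespace-stripped lines in a clip list file.

    Assumption: the file lists one clip ID per line.
    """
    lines = Path(path).read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def find_overlap(train_path, test_path):
    """Return the sorted clip IDs that appear in both the train and test lists."""
    return sorted(read_clip_ids(train_path) & read_clip_ids(test_path))


# Hypothetical usage, with both list files in the current directory:
# overlap = find_overlap("dvm_background_train_clips.txt",
#                        "dvm_background_test_clips.txt")
# print(len(overlap), "clips appear in both lists:", overlap)
```

Because read_clip_ids returns a set, comparing its size with the number of lines in the file would also show whether a duplicated entry explains the 3117 vs. 3118 difference.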
Could you help clarify these points? Thanks.