Open tb2-sy opened 1 year ago
Hi, I think Mask R-CNN + epipolar distance could work for long videos if the estimated optical flow is accurate enough. However, it can fail when the estimated flow is not accurate (somehow this issue always happens for the dynamic scenes datasets).
Indeed, we recently found that for in-the-wild monocular videos, initializing our motion segmentation module with those masks can provide more robust estimates, especially for degenerate cases such as co-linear camera-object motion, where photometric inconsistency might not be sufficient to explain away the motion.
Thanks for your reply. I am now experimenting on a scene with dynamic elements (Family) from the Tanks and Temples dataset. What confuses me is that the resulting motion mask is completely white, that is, the whole image is treated as dynamic, even though only a few elements in it are actually dynamic.
Are you referring to the method in DynIBaR or to "Mask R-CNN + epipolar distance"?
I am referring to the "Mask R-CNN + epipolar distance" method, which is the one that failed. I am not sure whether this is a limitation of the method itself or something I did wrong. Thank you.
You should check whether all the moving regions are covered by the Mask R-CNN part or by the epipolar thresholding part. In my experience, this method should not fail completely on the Tanks and Temples dataset.
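One way to run that check is to look at the two parts separately. The following is a minimal, dependency-free sketch and not the code from either repository: it assumes you already have a fundamental matrix `F` between two frames, a dense flow field stored as `flow[y][x] = (dx, dy)`, and a boolean instance mask from Mask R-CNN. The function names, the 1-pixel threshold, and the choice to combine the two parts with a union are all assumptions made for illustration.

```python
import math


def epipolar_distance(F, p1, p2):
    """Pixel distance from p2 (frame 2) to the epipolar line of p1 (frame 1)."""
    x1, y1 = p1
    x2, y2 = p2
    # Epipolar line l = F @ (x1, y1, 1), written as a*x + b*y + c = 0.
    a = F[0][0] * x1 + F[0][1] * y1 + F[0][2]
    b = F[1][0] * x1 + F[1][1] * y1 + F[1][2]
    c = F[2][0] * x1 + F[2][1] * y1 + F[2][2]
    norm = math.hypot(a, b)
    if norm < 1e-12:
        return 0.0  # degenerate line: treat the pixel as uninformative
    return abs(a * x2 + b * y2 + c) / norm


def motion_masks(F, flow, instance_mask, thresh=1.0):
    """Return (epipolar_mask, combined_mask) as nested lists of bools.

    thresh is a hypothetical threshold in pixels; tune it per dataset.
    The combined mask is the union of both parts (an assumption here).
    """
    h, w = len(flow), len(flow[0])
    epi = [[False] * w for _ in range(h)]
    combined = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # A static pixel should land on (or near) its epipolar line.
            d = epipolar_distance(F, (x, y), (x + dx, y + dy))
            epi[y][x] = d > thresh
            combined[y][x] = epi[y][x] or bool(instance_mask[y][x])
    return epi, combined


def coverage(mask):
    """Fraction of pixels flagged as dynamic, between 0.0 and 1.0."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total
```

Comparing `coverage(epi)` with the coverage of the instance mask shows which part is responsible for an all-white result. If the epipolar part alone flags nearly every pixel, the likely suspects are the flow estimate, the fundamental matrix, or a threshold that is too small for the image resolution, rather than the Mask R-CNN detections.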
Thanks, I got it!
Thanks for your great work! The motion segmentation method you proposed is very novel. I would like to ask whether the earlier Mask R-CNN + epipolar distance method from Dynamic NeRF would produce failed motion masks under the long-video, complex-camera-trajectory setting of this work.