Implementation details on epipolar error map

kcheng1021 commented 3 months ago

Hi, thanks for your excellent work MoSca!

I am currently trying to follow some techniques in MoSca to handle monocular dynamic scenes. I have run into a small problem and would really appreciate your help. Here is my question:

In MoSca, you use the epipolar error map to segment the dynamic region, which is really elegant. I found that this part follows the code of Robust Dynamic Radiance Fields (CVPR 2023). However, when I use it to segment the dynamic parts of the NVIDIA dynamic dataset and the iPhone dataset, the results are poor or unstable. Here are some examples from the apple scene (iPhone) and the skating scene (NVIDIA). I guess some parameters need to be adjusted to adapt to these scenes. It would be great if you could share some experience.

[Two attached images showing the example results]

JiahuiLei commented 3 months ago

Thanks for your interest. The epipolar error computation is based on RoDyNeRF: https://github.com/facebookresearch/robust-dynrf/blob/c56be705940c488cdfeac73d51c332d75101d60c/scripts/generate_mask.py
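For readers who want a feel for what an epipolar error measures, here is a minimal NumPy sketch. It is not copied from the linked script: it assumes a fundamental matrix `F` is already available (for example, estimated from optical-flow correspondences) and uses the squared Sampson distance as the error, which is one common choice. The function name `sampson_error` and the toy values are hypothetical.

```python
import numpy as np

def sampson_error(F, pts1, pts2):
    """Squared Sampson distance for correspondences pts1 -> pts2, each of shape (N, 2).

    Assumption: F maps points in image 1 to epipolar lines in image 2,
    i.e. x2^T F x1 = 0 for a static point.
    """
    ones = np.ones((pts1.shape[0], 1))
    x1 = np.hstack([pts1, ones])   # homogeneous points in image 1, (N, 3)
    x2 = np.hstack([pts2, ones])   # homogeneous points in image 2, (N, 3)
    Fx1 = x1 @ F.T                 # each row is F @ x1
    Ftx2 = x2 @ F                  # each row is F^T @ x2
    num = np.sum(x2 * Fx1, axis=1) ** 2
    den = Fx1[:, 0] ** 2 + Fx1[:, 1] ** 2 + Ftx2[:, 0] ** 2 + Ftx2[:, 1] ** 2
    return num / np.maximum(den, 1e-12)

# Toy setup: pure horizontal camera translation with identity intrinsics,
# so static points must keep the same y coordinate.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
pts1 = np.array([[10.0, 20.0], [30.0, 40.0]])
pts2 = np.array([[15.0, 20.0], [33.0, 44.0]])  # second point also moves vertically
err = sampson_error(F, pts1, pts2)
```

In this toy case the first correspondence satisfies the epipolar constraint and gets an error of zero, while the second one violates it and gets a clearly positive error, which is the signal used to flag dynamic pixels.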

We compute the threshold as:

epi_th = (H * W) / (epi_error_th_factor**2)

Note that the epipolar error threshold may depend on the image size and the FPS of the video. For the NVIDIA dataset with the RoDyNeRF setup we set epi_error_th_factor=100.0, and for iPhone DyCheck we set epi_error_th_factor=400.0.
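As a quick illustration of how the threshold scales, here is a small sketch of the formula above. The image size of 270 x 480 is only an example value, not the resolution of either dataset, and the helper names `epi_threshold` and `dynamic_mask` are hypothetical.

```python
import numpy as np

def epi_threshold(height, width, factor):
    """epi_th = (H * W) / (epi_error_th_factor ** 2), as in the formula above."""
    return (height * width) / (factor ** 2)

def dynamic_mask(epi_error, factor):
    """Mark pixels whose epipolar error exceeds the size-dependent threshold."""
    h, w = epi_error.shape
    return epi_error > epi_threshold(h, w, factor)

# Example image size (an assumption for illustration only).
H, W = 270, 480
th_nvidia = epi_threshold(H, W, 100.0)   # factor reported for the NVIDIA setup
th_iphone = epi_threshold(H, W, 400.0)   # factor reported for iPhone DyCheck

# Toy error map: one pixel with a large error, everything else zero.
epi_error = np.zeros((H, W))
epi_error[10, 20] = 50.0
mask = dynamic_mask(epi_error, 100.0)
```

For the same image size, a larger factor gives a smaller threshold, so factor=400.0 flags more pixels as dynamic than factor=100.0.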

We also use the connected dense flow and dilated long pixel tracks to enhance the foreground segmentation. The purpose is to be conservative: when solving for the background, the regions masked as background must really be static.
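A rough sketch of the "dilated track" idea, under my own assumptions: it takes an epipolar-error mask and a set of 2D track locations already classified as dynamic, rasterizes the tracks, dilates them with a square window, and takes the union. This is not the MoSca implementation; `dilate`, `conservative_foreground`, and the `radius` parameter are hypothetical, and the connected dense flow step is not covered here.

```python
import numpy as np

def dilate(mask, radius):
    """Binary dilation with a (2r+1) x (2r+1) square window, using padded shifts."""
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros((h, w), dtype=bool)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def conservative_foreground(epi_mask, track_xy, radius=2):
    """Union of the epipolar mask and dilated dynamic track points.

    track_xy: (N, 2) array of (x, y) pixel positions assumed to be dynamic.
    """
    h, w = epi_mask.shape
    xs = np.clip(np.round(track_xy[:, 0]).astype(int), 0, w - 1)
    ys = np.clip(np.round(track_xy[:, 1]).astype(int), 0, h - 1)
    track_mask = np.zeros((h, w), dtype=bool)
    track_mask[ys, xs] = True
    return epi_mask.astype(bool) | dilate(track_mask, radius)

# Toy example: one pixel flagged by the epipolar error, one dynamic track point.
epi_mask = np.zeros((8, 8), dtype=bool)
epi_mask[1, 1] = True
tracks = np.array([[5.0, 5.0]])
fg = conservative_foreground(epi_mask, tracks, radius=1)
bg = ~fg  # only these pixels would be treated as static background
```

Growing the foreground this way shrinks the background mask, which trades some background coverage for more confidence that what remains is static.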

kcheng1021 commented 2 months ago

Thanks for your detailed answer. The masks are better after following your instructions.

JiahuiLei / MoSca
