fabiotosi92 / NeRF-Supervised-Deep-Stereo

A novel paradigm for collecting and generating stereo training data using neural rendering
https://nerfstereo.github.io/
MIT License
348 stars 19 forks

Regarding Depth Range Consistency in Different Scenes #44

Closed wtyuan96 closed 9 months ago

wtyuan96 commented 9 months ago

Hi there,

Firstly, I want to express my appreciation for the excellent work you've been doing.

I've been following the discussions around reconstruction scales in COLMAP. I've noticed that the reconstruction in Instant-NGP, along with the rendered depth, may involve an arbitrary scale, leading to potential variations in depth scale across scenes. This becomes particularly pronounced for scenes of similar physical size that are reconstructed at different scales.

My main question concerns the choice of three virtual baselines (b = 0.5, 0.3, 0.1 units) for data generation across all scenes, as mentioned in your paper. Since two scenes, say A and B, may have distinct reconstruction scales in COLMAP and therefore different depth ranges, I'm curious about the reasoning behind using the same baselines for all scenes. Given the potential difference in depth range caused by different reconstruction scales, how does the uniform application of baselines account for this variability?
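For reference, the concern can be written with the standard rectified-stereo relation. This is only a sketch: the symbols f (focal length in pixels), z (rendered depth), d (disparity in pixels) and s (an arbitrary per-scene scale factor) are assumptions introduced here and do not appear in the thread.

```latex
d = \frac{f\,b}{z},
\qquad
z' = s\,z \;\Rightarrow\; d' = \frac{f\,b}{s\,z} = \frac{d}{s}
```

With the baseline b fixed in reconstruction units, a scene reconstructed at s times the scale would yield disparities smaller by the same factor s.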

I appreciate your time and insights into this matter.

Thank you in advance!

fabiotosi92 commented 9 months ago

Hi!

Thank you for your kind words! We really appreciate your interest in our work.

The reason we opted for the three virtual baselines (b = 0.5, 0.3, 0.1) across all scenes is that we wanted to make sure our generated depth maps cover the entire disparity range (from 0 to the desired D_max). We briefly discuss this aspect in Section 4 of the paper, and Figure 5 in particular highlights it.

So, in our work, we're not particularly concerned about scenes like A and B having different reconstruction scales in COLMAP and therefore different depth ranges. Instead, our main goal is to create disparity maps that span a wide range of disparities. We achieve this by varying the baseline.
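To make the effect of the baseline concrete, here is a minimal sketch using the standard relation d = f * b / z. It is not the repository's code: the focal length, the depth values, and the helper names `disparity` and `disparity_ranges` are hypothetical, and only the three baselines come from the thread.

```python
import numpy as np

# Virtual baselines quoted in the thread, in reconstruction units.
BASELINES = (0.5, 0.3, 0.1)

def disparity(depth, baseline, focal_px):
    """Rectified-stereo disparity in pixels: d = f * b / z."""
    return focal_px * baseline / np.asarray(depth, dtype=float)

def disparity_ranges(depth, focal_px, baselines=BASELINES):
    """Return {baseline: (min_disparity, max_disparity)} for one depth map."""
    ranges = {}
    for b in baselines:
        d = disparity(depth, b, focal_px)
        ranges[b] = (float(d.min()), float(d.max()))
    return ranges

# Hypothetical values, chosen only for illustration.
focal_px = 500.0
depth_a = np.array([2.0, 4.0, 8.0])   # scene A, arbitrary reconstruction units
depth_b = 2.0 * depth_a               # scene B: same geometry at twice the scale

ranges_a = disparity_ranges(depth_a, focal_px)
ranges_b = disparity_ranges(depth_b, focal_px)

for b in BASELINES:
    print(f"b={b}: scene A {ranges_a[b]}, scene B {ranges_b[b]}")
```

Under these assumed numbers, doubling the reconstruction scale halves every disparity for a given baseline, while moving between the three baselines shifts the whole range up or down. This is consistent with the point above that the baselines are chosen to populate a wide disparity range, not to match any particular scene scale.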

Moreover, a key point is that our approach offers the flexibility to achieve extensive disparity ranges even with images from outdoor scenarios, where depth variations differ greatly from those observed in our indoor scene dataset.

I hope this clarifies your doubts on the baseline selection. Feel free to ask if you have more questions.

Thanks again for your interest!