NVlabs / neuralrgbd

Neural RGB→D Sensing: Per-pixel depth and its uncertainty estimation from a monocular RGB video

Some confusion on inference on 7scenes #13

Closed flamehaze1115 closed 4 years ago

flamehaze1115 commented 4 years ago

Hello! Thank you very much for your work. I followed your instructions for inference on the 7-Scenes dataset and ran the command given in TE.md. I ran inference on the entire test split of 7-Scenes, on the images with id % 3 == 0, since that is what your code runs on. I then evaluated all the inferred results against the ground truth: I resized the ground truth to (384, 256) and computed the metrics over the depth range (0.1, 5.0). Strangely, my scores are worse than the ones in your report:

mean_abs_relative: 0.2446 (reported: 0.1758)
mean_rmse: 0.5504 (reported: 0.4408)
mean_scale_invariant: 0.1931 (reported: 0.1899)

Maybe I missed some steps.
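For readers who want to reproduce this kind of evaluation, here is a minimal sketch of the three metrics restricted to a depth range. It is not the repository's evaluation code: the function name `depth_metrics`, its arguments, and the exact form of the scale-invariant error (the square root of the variance of the log-depth difference, one common definition) are assumptions, and the repository's definitions may differ. It works on flattened depth values so that it needs only the standard library.

```python
import math


def depth_metrics(pred, gt, d_min=0.1, d_max=5.0):
    """Hypothetical helper: error metrics over pixels whose ground truth lies in (d_min, d_max).

    pred, gt: flat sequences of depths in metres, same length and pixel order.
    """
    # Keep only pixels with valid ground truth and a positive prediction.
    pairs = [(p, g) for p, g in zip(pred, gt) if d_min < g < d_max and p > 0]
    if not pairs:
        raise ValueError("no valid pixels inside the depth range")
    n = len(pairs)

    # Mean absolute relative error: |pred - gt| / gt.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n

    # Root mean squared error in metres.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)

    # Scale-invariant log error (assumed definition): a global scale factor
    # shifts every log difference by the same constant, so it cancels out.
    log_diff = [math.log(p) - math.log(g) for p, g in pairs]
    mean_sq = sum(d * d for d in log_diff) / n
    mean = sum(log_diff) / n
    scale_inv = math.sqrt(max(mean_sq - mean * mean, 0.0))

    return {"abs_rel": abs_rel, "rmse": rmse, "scale_inv": scale_inv}
```

Per-frame values from a helper like this would then be averaged over the evaluated frames. Which frames are included in that average affects the result, so the frame selection has to match the one used for the reported numbers.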

cxlcl commented 4 years ago

For the reported results, we ran inference on every single input frame (id % 1 == 0). In this case, the baselines between adjacent frames need to be adjusted as well (three times the default). This is very similar to the setup in test_KV_LBA.py, where we want the baselines to be as small as possible.
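As an illustration of how the frame stride and the frame interval interact, here is a small sketch. The names `select_frames` and `source_gap` are hypothetical and do not come from the repository, and the reading of "three times the default" as a frame-interval multiplier is an assumption.

```python
def select_frames(num_frames, step):
    """Return the frame ids with id % step == 0 (hypothetical helper)."""
    if step < 1:
        raise ValueError("step must be at least 1")
    return list(range(0, num_frames, step))


def source_gap(step, interval):
    """Gap, in source frames, between two neighbours that are
    `interval` positions apart in the selected sequence."""
    return step * interval


# Stride 3 with neighbours 1 apart spans the same number of source frames
# as stride 1 with neighbours 3 apart.
print(select_frames(10, 3), source_gap(3, 1))
print(select_frames(4, 1), source_gap(1, 3))
```

Under this reading, switching from id % 3 == 0 to every frame while tripling the interval keeps the span between the frames used together unchanged, while still producing a depth estimate for every frame.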

BTW, the scores are worse here because of the larger baselines: propagating the depth probability volume becomes harder as the baseline increases, since the overlapping FOV is smaller.

Hope this helps.