yaochih / GCVD-release

MIT License
Reproducing depth estimation metrics #3

Open Mixanik-43 opened 1 year ago

Mixanik-43 commented 1 year ago

Hi, thank you for the great work! I've tried to reproduce the results you published for the TUM and 7 Scenes datasets. However, the depth estimation metric (abs rel) values I obtained are significantly worse than the ones reported in the paper. For the 7_scenes/chess/seq-01 sequence I got abs rel = 0.83 with the command you provided in the README: python3 main.py test-dataset/chess/seq-01.mkv --name test --pose_graph. I also tried --post_filter, but it did not change the result significantly. You reported abs rel = 0.124 on 7 Scenes, so I suppose your relative error on chess/seq-01 was far below 83%.

For TUM/freiburg1_desk I got 0.2718 relative error, while you reported 0.0940 for this sequence.
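
For reference, abs rel in this thread means the mean of |predicted depth - ground-truth depth| / ground-truth depth over pixels with valid ground truth. The following is a minimal NumPy sketch of that metric, not the evaluation code used for the paper. The function name `abs_rel` and the `min_depth` / `max_depth` validity thresholds are assumptions made for illustration.

```python
import numpy as np

def abs_rel(pred, gt, min_depth=1e-3, max_depth=10.0):
    """Mean absolute relative depth error over valid ground-truth pixels.

    min_depth and max_depth are illustrative thresholds (assumed here),
    used only to mask out missing or implausible ground-truth values.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = (gt > min_depth) & (gt < max_depth) & np.isfinite(pred)
    return float(np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid]))

# Toy example: 0.0 in the ground truth marks a missing measurement.
gt = np.array([[1.0, 2.0], [0.0, 4.0]])
pred = np.array([[1.1, 1.8], [3.0, 4.4]])
print(abs_rel(pred, gt))  # each valid pixel is off by 10%
```

With this definition, a value of 0.83 means the prediction is off by 83% of the true depth on average.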

I tried to find the reason for the high depth estimation errors. It seems that GCVD depth estimates often have the wrong scale. If I rescale the estimated depth maps to match the ground-truth depth (an individual constant scale for each frame), I get 0.0897 relative error on the TUM/freiburg1_desk sequence, which seems reasonable. Correcting the scale could therefore potentially lead to the results reported in the paper. Here is an example of a frame from TUM/freiburg1_desk where the GCVD output has a high relative error (0.507). In the original TUM dataset, this frame is rgbd_dataset_freiburg1_desk/rgb/1305031464.759740.png:

[image]

As you can see, GCVD output has a reasonable structure, but the scale is wrong, which affects abs rel metric.
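
A per-frame constant scale correction of this kind can be sketched as follows. The thread does not say how the scale was computed, so the ratio of medians used here is an assumption (a least-squares scale would be another reasonable choice). The name `align_scale_median` and the `min_depth` threshold are hypothetical.

```python
import numpy as np

def align_scale_median(pred, gt, min_depth=1e-3):
    """Rescale one predicted depth map by a single constant so that its
    median matches the ground-truth median over valid pixels.

    Median-ratio scaling is an assumed choice, not necessarily what was
    used in the thread or in the paper.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = (gt > min_depth) & (pred > min_depth)
    scale = np.median(gt[valid]) / np.median(pred[valid])
    return pred * scale, float(scale)

def mean_abs_rel(pred, gt):
    """Mean of |pred - gt| / gt over pixels with positive ground truth."""
    valid = gt > 0
    return float(np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid]))

# Toy frame: correct structure, but every depth is half the true value.
gt = np.array([[1.0, 2.0], [3.0, 4.0]])
pred = 0.5 * gt
aligned, scale = align_scale_median(pred, gt)
print(scale)                      # estimated per-frame scale factor
print(mean_abs_rel(pred, gt))     # error before alignment
print(mean_abs_rel(aligned, gt))  # error after alignment
```

A prediction with the right structure but the wrong global scale gets a large abs rel before alignment and a small one after, which matches the behaviour described above.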

I used the files that the main.py script generated in the outputs/test/depths/final directory (or outputs/test/depths/filtered for runs with the --post_filter option) as the final output of the GCVD algorithm. Is that right, or should I additionally scale these depth maps somehow? Should the command python3 main.py video.mkv --name test --pose_graph --post_filter reproduce the result reported in the paper, or should I use some additional options? Are there any other tips for reproducing these results?

Mixanik-43 commented 1 year ago

Adding --mesh_deformation slightly improved the result (0.25 abs rel on the TUM/freiburg1_desk sequence). Adding more options from the GCVD paper (python3 main.py video.mkv --name test --pose_graph --post_filter --mesh_deformation --num_epoch_edge 100 --keyframe_scales 4 --loss_smooth 0.1) increased abs rel on this sequence to 0.34, which is far from the reported 0.094. So @yaochih, could you please provide the correct command to reproduce the GCVD paper results, or share your metric evaluation script? MiDaS originally evaluates abs rel after aligning the scale and shift of the predicted inverse depth to the ground truth. I assumed that, since GCVD estimates scales, you do not use such an alignment in metric evaluation. Is that right?
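
To make the comparison concrete, here is a minimal sketch of a scale-and-shift alignment in inverse depth, in the spirit of the MiDaS-style evaluation mentioned above. It is not the MiDaS or GCVD evaluation code. It fits s and t by least squares so that s * pred_inv + t approximates 1 / gt over valid pixels. The function name, the `min_depth` threshold, and the `eps` floor are assumptions.

```python
import numpy as np

def align_scale_shift_inverse(pred_inv, gt_depth, min_depth=1e-3, eps=1e-6):
    """Fit s, t minimising ||s * pred_inv + t - 1/gt||^2 over valid pixels,
    then return the aligned prediction converted back to depth.

    min_depth and eps are illustrative values (assumed), not taken from
    any published evaluation script.
    """
    pred_inv = np.asarray(pred_inv, dtype=np.float64)
    gt_depth = np.asarray(gt_depth, dtype=np.float64)
    valid = gt_depth > min_depth
    x = pred_inv[valid]
    target = 1.0 / gt_depth[valid]
    # Design matrix [x, 1] for the linear model target = s * x + t.
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, target, rcond=None)
    # Floor the aligned inverse depth to avoid division by zero.
    aligned_inv = np.maximum(s * pred_inv + t, eps)
    return 1.0 / aligned_inv, float(s), float(t)

# Toy example: the prediction equals the true inverse depth up to an
# unknown affine transform (true s = 2, t = 0.1).
gt_depth = np.array([[1.0, 2.0], [4.0, 5.0]])
pred_inv = (1.0 / gt_depth - 0.1) / 2.0
depth, s, t = align_scale_shift_inverse(pred_inv, gt_depth)
print(s, t)
print(np.allclose(depth, gt_depth))
```

Unlike a scale-only correction, this alignment also absorbs an additive offset in inverse depth, so metrics computed after it are not directly comparable to metrics computed on raw or scale-only aligned depth.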

Mixanik-43 commented 1 year ago

It seems that the depth scale is adjusted to match the ground truth. Here is the corresponding part of the arXiv paper: [image] Now I obtain metrics comparable to the reported ones! However, I still have not managed to find options that reproduce the reported metrics exactly.

mathidot commented 1 year ago

Could you please share your test code with me? Thank you.