Open Mixanik-43 opened 1 year ago
Adding `--mesh_deformation` slightly improved the result (0.25 abs rel on the TUM/freiburg1_desk sequence).
Adding more options from the GCVD paper (`python3 main.py video.mkv --name test --pose_graph --post_filter --mesh_deformation --num_epoch_edge 100 --keyframe_scales 4 --loss_smooth 0.1`) increased abs rel on this sequence to 0.34, which is far from the reported 0.094 abs rel.
So @yaochih, could you please provide the correct command to reproduce GCVD paper results, or could you share your metric evaluation script? Originally MiDaS evaluates abs rel after adjusting inverse depth shift and scale. I thought, since GCVD estimates scales, you do not use such adjustment in metric evaluation. Is that right?
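For reference, the MiDaS-style adjustment mentioned above can be sketched as a least-squares fit of a scale and a shift in inverse-depth space. This is only an illustration of that kind of alignment, not the GCVD or MiDaS evaluation script: the function name `align_scale_shift_inverse`, the convention that zero marks missing ground truth, and the clamping constant are all assumptions.

```python
import numpy as np


def align_scale_shift_inverse(pred_inv, gt, mask=None):
    """Fit s, t so that s * pred_inv + t matches 1 / gt in the least-squares sense.

    pred_inv: predicted inverse depth (any scale and shift).
    gt:       ground-truth depth; zeros are treated as missing (assumption).
    Returns the aligned prediction converted back to depth.
    """
    if mask is None:
        mask = gt > 0
    gt_inv = 1.0 / gt[mask]
    # Design matrix [pred_inv, 1] for the two unknowns (scale, shift).
    A = np.stack([pred_inv[mask], np.ones(int(mask.sum()))], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, gt_inv, rcond=None)
    aligned_inv = s * pred_inv + t
    # Avoid division by zero or negative depth after alignment (assumed floor).
    aligned_inv = np.maximum(aligned_inv, 1e-6)
    return 1.0 / aligned_inv


# Synthetic check: a prediction that differs from 1/gt by a scale and a shift.
rng = np.random.default_rng(0)
gt = rng.uniform(0.5, 3.0, size=(48, 64))
pred_inv = (1.0 / gt - 0.2) / 1.5
aligned_depth = align_scale_shift_inverse(pred_inv, gt)
```

If the evaluation used in the paper applies only a scale (no shift) or works in depth rather than inverse depth, the numbers will differ, which is why the exact protocol matters for reproduction.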
It seems like depth scale is adjusted to match ground truth. Here is the corresponding part in the arXiv paper:
Now I obtained metrics comparable to the reported ones! However, I still did not manage to find options that would reproduce the reported metrics exactly.
Could you please share your test code with me? Thank you.
Hi, thank you for the great work! I've tried to reproduce the results you've published for the TUM and 7 scenes datasets. However, the depth estimation metric (abs rel) values I obtained are significantly worse than the ones reported in the paper. For the 7_scenes/chess/seq-01 sequence I got abs rel = 0.83 with the command you provided in the Readme: `python3 main.py test-dataset/chess/seq-01.mkv --name test --pose_graph`. I tried using `--post_filter` as well, but it did not change the result significantly. You reported abs rel = 0.124 on 7 scenes, so I suppose you had far less than 83% relative error on chess/seq-01. For TUM/freiburg1_desk I got 0.2718 relative error, while you reported 0.0940 for this sequence.
I tried to find the reason for the high depth estimation errors. It seems that GCVD depth estimates often have the wrong scale. If I rescale the estimated depth maps to match ground truth depth (an individual constant scale for each frame), I get 0.0897 relative error for the TUM/freiburg1_desk sequence, which seems to be a reasonable result. Therefore, correcting the wrong scale could potentially lead to the results reported in the paper. Here is an example of a frame from TUM/freiburg1_desk where GCVD output has high relative error (0.507). In the original TUM dataset its name is rgbd_dataset_freiburg1_desk/rgb/1305031464.759740.png.
As you can see, GCVD output has a reasonable structure, but the scale is wrong, which affects abs rel metric.
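To make the per-frame scale adjustment concrete, here is a minimal sketch of abs rel with and without a single constant scale per frame. The post does not say how the scale was fitted, so the median-ratio fit below is an assumption, as are the helper names `abs_rel` and `align_scale_median` and the convention that zero marks missing ground-truth depth.

```python
import numpy as np


def abs_rel(pred, gt, mask=None):
    """Mean absolute relative error |pred - gt| / gt over valid pixels."""
    if mask is None:
        mask = gt > 0  # assumption: zero means no ground-truth depth
    return float(np.mean(np.abs(pred[mask] - gt[mask]) / gt[mask]))


def align_scale_median(pred, gt, mask=None):
    """Rescale one predicted depth map by a single constant (median ratio)."""
    if mask is None:
        mask = gt > 0
    scale = np.median(gt[mask]) / np.median(pred[mask])
    return pred * scale


# Synthetic frame: correct structure, wrong global scale.
rng = np.random.default_rng(0)
gt = rng.uniform(0.5, 3.0, size=(48, 64))
pred = 0.5 * gt

raw_error = abs_rel(pred, gt)                              # error without alignment
aligned_error = abs_rel(align_scale_median(pred, gt), gt)  # error after per-frame scaling
```

On this synthetic frame the raw error is large purely because of the global scale factor, and it vanishes after alignment, which mirrors the gap between 0.2718 and 0.0897 described above. A least-squares scale would be an equally plausible choice and may give slightly different numbers on real data.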
I used the files that the `main.py` script generated in the `outputs/test/depths/final` directory (or `outputs/test/depths/filtered` for runs with the `--post_filter` option) as the final output of the GCVD algorithm. Is that right, or should I additionally scale these depth maps somehow? Should the command `python3 main.py video.mkv --name test --pose_graph --post_filter` reproduce the result reported in the paper, or should I use some additional options? Are there any other tips for reproducing these results?