Junyi42 / monst3r

Official Implementation of paper "MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion"
https://monst3r-project.github.io/

Scale-only evaluation #7

Open Davidyao99 opened 23 hours ago

Davidyao99 commented 23 hours ago

Amazing work! May I ask how you performed the scale-only alignment for evaluation? I assume you are using the DepthCrafter evaluation script? I am running some evaluations and can reproduce the DepthCrafter results using shift and scale. However, using scale only, I seem to get much better results for DepthCrafter than those in the table.

For my evaluation, I simply changed the line found here to the following:

A = pred_disp_masked
X = np.linalg.lstsq(A, gt_disp_maksed, rcond=None)[0]

scale = X[0] # scale-only fit: gt ≈ scale * pred (no shift term)
aligned_pred = scale * pred_disp

With this change, I get an abs_rel error of 0.324 and a d1<1.25 of 0.4734 for DepthCrafter. This is strange, since the abs_rel error looks fairly good while d1 is poor. Could you provide some guidance on how the scale-only evaluation is performed? Thank you!!
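For reference, the two metrics discussed here are commonly defined as below. This is a minimal sketch using the standard definitions, not the exact code of either evaluation script; the function name `depth_metrics` is hypothetical, and it assumes both inputs are already aligned, masked to valid pixels, and strictly positive.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Return (abs_rel, delta1) for aligned, strictly positive depth values.

    Assumption: invalid pixels have already been masked out by the caller.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)

    # Absolute relative error: mean of |pred - gt| / gt.
    abs_rel = np.mean(np.abs(pred - gt) / gt)

    # delta1: fraction of pixels whose ratio max(pred/gt, gt/pred) is below 1.25.
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = np.mean(ratio < 1.25)

    return abs_rel, delta1
```

Because abs_rel averages error magnitudes while d1 counts pixels inside a fixed ratio band, the two can disagree about which alignment is better.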

Junyi42 commented 6 hours ago

Hi @Davidyao99,

Thank you for raising this point!

In our evaluation, we follow the convention used in works like Robust-CVD, CasualSAM, and DUSt3R, where we apply "median-scaling" for scale-only alignment. We also experimented with other methods, such as "mean-scaling" and optimization-based alignment, but none gave better results than median-scaling on both the Abs Rel and d1<1.25 metrics.
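A minimal sketch of what median-scaling usually looks like, assuming the common "ratio of medians" form. The function name `median_scale_align`, the default validity mask, and the choice of applying it directly to the input arrays (depth rather than disparity) are assumptions for illustration, not details confirmed in this thread.

```python
import numpy as np

def median_scale_align(pred, gt, valid_mask=None):
    """Scale pred so that its median matches the median of gt on valid pixels.

    Assumption: 'valid' means both values are positive unless a mask is given.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if valid_mask is None:
        valid_mask = (gt > 0) & (pred > 0)

    # A single scalar, estimated from medians, so a few extreme pixels
    # have little influence on it.
    scale = np.median(gt[valid_mask]) / np.median(pred[valid_mask])
    return scale * pred, scale
```

For example, `median_scale_align([1, 2, 3], [2, 4, 6])` yields a scale of 2 and an aligned prediction equal to the ground truth.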

Regarding your approach using lstsq: least squares minimizes the squared residuals, so it is more sensitive to outliers than median-scaling. This sensitivity may explain why you are seeing a better Abs Rel but a worse d1<1.25.
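The outlier sensitivity can be illustrated on synthetic data. The sketch below uses made-up values (a true scale of 2 and five corrupted predictions) and a hypothetical helper `compare_scales`. It only shows how far each scale estimate moves under outliers and does not attempt to reproduce the numbers reported above.

```python
import numpy as np

def compare_scales():
    """Estimate a scale-only alignment with least squares and with medians."""
    gt = np.linspace(1.0, 10.0, 100)
    pred = gt / 2.0            # clean predictions: true scale is 2
    pred[-5:] = 50.0           # five gross outliers (synthetic)

    # Scale-only least squares: one column, no shift term.
    lstsq_scale = np.linalg.lstsq(pred[:, None], gt, rcond=None)[0][0]

    # Median-scaling: ratio of medians.
    median_scale = np.median(gt) / np.median(pred)

    return lstsq_scale, median_scale

lstsq_scale, median_scale = compare_scales()
print(f"lstsq scale:  {lstsq_scale:.3f}")
print(f"median scale: {median_scale:.3f}")
```

Here the five outliers dominate the squared-error objective and pull the least-squares scale far below 2, while the median-based estimate stays at the true value.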

Best.