hufu6371 / DORN


Kitti Benchmarking #8

Closed nicolasrosa closed 6 years ago

nicolasrosa commented 6 years ago

Hello,

First, congratulations on your results. I'm also working with Monocular Depth Estimation and I have some questions about the metrics used in the Kitti Depth Prediction Benchmark.

1) Does your network predict depth in meters (m)?

2) If yes, did you change anything when applying the following metrics?

SILog: Scale-invariant logarithmic error [log(m)*100]
iRMSE: Root mean squared error of the inverse depth [1/km]

I'm asking because they use these different units: [log(m)*100] and [1/km].

hufu6371 commented 6 years ago

@nicolasrosa Hi, yes, our network predicts depth in meters. For Tab. 3, we trained and evaluated our model following Eigen et al. [12]. Those evaluation metrics are different from the ones suggested by the KITTI evaluation server. Best

rauldiaz commented 6 years ago

Hi,

Regarding these metrics, how do you compare your results using the devkit provided on the page? I'm working on a monocular depth prediction approach. My solution predicts depth in meters, but when I evaluate my results with the evaluate_depth tool from the devkit, I get very small numbers; for instance, SILog is always smaller than 1. Is this simply a fractional value, so the fix is to multiply whatever the devkit returns by 100, or do I need to change something else to bring these metrics to the same scale as the KITTI results table?

Thanks

nicolasrosa commented 6 years ago

Hello @rdiazgar, I haven't evaluated my method on KITTI's Depth Prediction Evaluation Benchmark yet. For now, I suggest you recheck the units in your code and make sure the ground truth, the predictions, and the devkit code all use matching units. We are having a similar discussion at the following link.

evaluation Eigen split

I hope it helps.

rauldiaz commented 6 years ago

Hi @nicolasrosa ,

Thanks for the quick response and the help. I don't think the link you provided fully resolves my doubt, though. I am certain that my approach returns depth in meters, and the basic metrics I observe, like MAE and RMSE, are within the ballpark of other methods, give or take.

However, for the specific metrics used on KITTI's ranking page (SILog, iRMSE, etc.), I get very small figures from KITTI's devkit. Going back to your own question at the beginning of this issue, I guess my question is how you compute SILog or iRMSE. KITTI's page lists [log(m) * 100] and [1/km] as the units for these metrics. For example, do you compute SILog to evaluate your method, and if so, do you get values in the ballpark of KITTI's page? Is it simply a matter of computing SILog in meters and then multiplying the number by 100 for readability?
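For what it's worth, my reading of the units on the ranking page is that both metrics are computed on depths in meters and then rescaled for readability. A minimal numpy sketch under that assumption (the sqrt form, the *100, and the *1000 factors are my interpretation of [log(m)*100] and [1/km], not taken from the devkit source):

```python
import numpy as np

def silog(gt, pred):
    # Scale-invariant log error: variance of the log residuals,
    # square-rooted, then *100 -- my reading of the [log(m)*100] unit.
    d = np.log(pred) - np.log(gt)
    return np.sqrt(np.mean(d ** 2) - np.mean(d) ** 2) * 100

def irmse(gt, pred):
    # RMSE of inverse depth: meters give 1/m, so *1000
    # to express the result in 1/km as on the KITTI table.
    return np.sqrt(np.mean((1.0 / gt - 1.0 / pred) ** 2)) * 1000
```

If this matches what the devkit computes internally, a raw output without the *100 factor would explain seeing SILog values smaller than 1 instead of the ~10-15 range on the leaderboard.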

Thanks

nicolasrosa commented 6 years ago

No, I didn't. Sorry, I'm also confused about that. For the state-of-the-art comparison, I only used the metrics provided by Monodepth's evaluation code.

monodepth/utils/evaluate_kitti.py
monodepth/utils/evaluation_utils.py

import numpy as np

def compute_errors(gt, pred):
    # threshold accuracies: fraction of pixels with
    # max(gt/pred, pred/gt) below 1.25^k
    thresh = np.maximum((gt / pred), (pred / gt))
    a1 = (thresh < 1.25).mean()
    a2 = (thresh < 1.25 ** 2).mean()
    a3 = (thresh < 1.25 ** 3).mean()

    # root mean squared error in meters
    rmse = np.sqrt(np.mean((gt - pred) ** 2))

    # RMSE in log space
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))

    # absolute and squared relative errors
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean(((gt - pred) ** 2) / gt)

    return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3
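As a quick sanity check, the function can be exercised on toy data (hypothetical values, chosen only to illustrate the expected scale of each metric):

```python
import numpy as np

# Repeated from the monodepth snippet above so this runs standalone.
def compute_errors(gt, pred):
    thresh = np.maximum((gt / pred), (pred / gt))
    a1 = (thresh < 1.25).mean()
    a2 = (thresh < 1.25 ** 2).mean()
    a3 = (thresh < 1.25 ** 3).mean()
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean(((gt - pred) ** 2) / gt)
    return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3

# Every prediction here is within 10% of the ground truth, so all
# three threshold accuracies are 1.0 and abs_rel is exactly 0.1.
gt = np.array([2.0, 4.0, 10.0])
pred = np.array([2.2, 3.6, 11.0])
abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3 = compute_errors(gt, pred)
```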
xyp8023 commented 5 years ago

Hi, I also have some doubts about SILog. The author seems to point out that RMSE in log scale is scale-variant, and therefore proposes SILog, which is scale-invariant. My question is: how is RMSE in log scale scale-variant? Apparently rmse_log(y, y_hat) = rmse_log(c*y, c*y_hat), isn't it? I'm confused about why we need SILog...

Thanks

adizhol commented 4 years ago

@xyp8023 https://stats.stackexchange.com/questions/190868/rmse-is-scale-dependent-is-rmse
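Concretely: rmse_log is indeed unchanged when ground truth and prediction are scaled by the same factor, but it does penalize a global scale error in the prediction alone; the extra variance term in SILog removes exactly that penalty, which is the invariance Eigen et al. [12] are after. A small numpy check (illustrative values only):

```python
import numpy as np

def rmse_log(gt, pred):
    return np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))

def silog_var(gt, pred):
    # Scale-invariant variance of log residuals (Eigen et al. [12]),
    # before any sqrt/percentage rescaling.
    d = np.log(pred) - np.log(gt)
    return np.mean(d ** 2) - np.mean(d) ** 2

gt = np.array([1.0, 2.0, 5.0])
pred = np.array([1.1, 1.8, 5.5])

# Same factor on BOTH arrays: rmse_log unchanged (xyp8023's point).
# Factor on the prediction ONLY (a global scale error in the model):
# rmse_log grows, silog_var stays the same.
```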