zju3dv / disprcnn

Code release for Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation (CVPR 2020, TPAMI 2021)
Apache License 2.0

Could you tell me how you compute the depth RMSE? #38

Closed jichaofeng closed 2 years ago

jichaofeng commented 2 years ago

Tab. 3 in the article shows the depth RMSE for iDispNet. I would like to know how you compute it. When I compute the depth RMSE over object regions, the result is much larger than the values given in Tab. 3.

ootts commented 2 years ago

It depends on how you define the "object regions". Computing the error on all pixels predicted as foreground is not reasonable, since some of them may actually lie on the background and have very large depth values, which can inflate the depth RMSE considerably. Using the intersection of the predicted mask and the pseudo-ground-truth mask as the foreground region, and ignoring pixels with very large depth error, will give you a similar result.
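The pixel selection described above could be sketched roughly as follows. This is a minimal illustration, not the code used for the paper: the function name `instance_depth_rmse`, its argument names, and the `max_error` parameter are hypothetical, and the unit of the error cutoff is an assumption.

```python
import numpy as np

def instance_depth_rmse(pred_depth, gt_depth, pred_mask, gt_mask, max_error=1000.0):
    """Depth RMSE for one instance (hypothetical helper, not from the repo).

    Only pixels inside BOTH the predicted mask and the pseudo-ground-truth
    mask are used, and pixels whose absolute error exceeds `max_error`
    are ignored.
    """
    # Foreground = intersection of the two masks, restricted to valid GT depth.
    valid = pred_mask & gt_mask & np.isfinite(gt_depth) & (gt_depth > 0)
    err = pred_depth[valid] - gt_depth[valid]
    # Drop outliers; NaN or infinite errors also fail this comparison.
    err = err[np.abs(err) <= max_error]
    if err.size == 0:
        return float("nan")  # no usable pixel for this instance
    return float(np.sqrt(np.mean(err ** 2)))
```

Returning NaN for an instance with no usable pixels leaves it to the caller to decide whether such instances are skipped or counted when averaging.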

jichaofeng commented 2 years ago

Thank you for your reply. I use a depth prior to guide the instance depth estimation. A reviewer asked me to compare the depth error with Disp R-CNN and ZoomNet, and I obtain a large depth RMSE with my method. Do you remember what threshold value you used when computing the depth RMSE?

ootts commented 2 years ago

I use 1000 as the threshold empirically, but you could use a different one. It may be a good idea to use the same set of pixels when comparing your method with the baselines. For example, a pixel could be selected when its corresponding (predicted) 3D point is within a certain distance (maybe 5 m) of the ground-truth bounding box, since points too far away from the GT bounding box may have little effect on the bounding box estimation.
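One way to realize the distance-based selection suggested above is sketched below. It assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy` and, for simplicity, an axis-aligned ground-truth box in the camera frame; real KITTI boxes are rotated about the vertical axis, so points would first need to be transformed into the box frame. Both function names are hypothetical.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Lift an HxW depth map to HxWx3 camera-frame points (pinhole model)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

def near_box_mask(points, box_center, box_size, margin=5.0):
    """True where a 3D point lies within `margin` of an axis-aligned box.

    Assumption: the box is axis-aligned in the same frame as `points`.
    """
    center = np.asarray(box_center, dtype=float)
    half = np.asarray(box_size, dtype=float) / 2.0
    # Per-axis distance outside the box; zero for coordinates inside it.
    outside = np.maximum(np.abs(points - center) - half, 0.0)
    return np.linalg.norm(outside, axis=-1) <= margin
```

Applying the same mask to every method being compared keeps the evaluated pixel set identical across methods.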

jichaofeng commented 2 years ago

We compute the depth RMSE using the following formula: [image] We apply the formula to each instance, where N is the number of pixels that belong to the instance and have valid ground truth. Then we take the average of the depth RMSE over all instances. Is there a problem with this? Generally, the depth RMSE for each instance is within 10, so I do not know how to use a threshold of 1000.
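The formula image is not preserved in this thread. Assuming it is the standard RMSE definition, which matches the description of N given here, the per-instance error and its average over instances would read as below. The symbols $d_i$, $d_i^{*}$, $N_k$, and $M$ are notation chosen for this sketch, not taken from the original image.

```latex
% Per-instance depth RMSE over the N_k pixels of instance k with valid ground truth
\mathrm{RMSE}_k = \sqrt{\frac{1}{N_k} \sum_{i=1}^{N_k} \left( d_i - d_i^{*} \right)^2}

% Average over all M instances
\overline{\mathrm{RMSE}} = \frac{1}{M} \sum_{k=1}^{M} \mathrm{RMSE}_k
```

Here $d_i$ is the predicted depth at pixel $i$ and $d_i^{*}$ is the corresponding ground-truth depth.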

ootts commented 2 years ago

I use the same formula. Your numbers do not have to match those in my paper exactly, since the set of pixels used for evaluation may not be the same.

jichaofeng commented 2 years ago

Thanks very much for your reply. Is the threshold of 1000 used to exclude pixels with large error when computing the depth RMSE? Isn't that threshold rather large?

ootts commented 2 years ago

You could use a smaller one as long as the results can prove the quality of your method.

jichaofeng commented 2 years ago

OK! Thank you very much again.