lzccccc / SMOKE

SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation
MIT License

Why does SMOKE predict box3d so well? #84

Open Kerry678231 opened 1 year ago

Kerry678231 commented 1 year ago

Author, I would like to ask a few questions:

1. pred_box3d_locs is calculated from pred_depths_offset * depth_ref, pred_proj_offsets, and K.
2. depth_ref holds the denormalization values, am I right? How do you get these values? Can they also affect box3d accuracy?
3. pred_depths_offset has no target values of its own; its loss is calculated through pred_box3d. Does the accuracy of pred_box3d then depend on pred_depths?
4. Recently I wrote a network that predicts only the 3D orientation, not box3d. I would very much appreciate guidance and help.

Thanks a lot.

Kerry678231 commented 1 year ago

I read the paper about it, but I still have the questions above. Please give me some guidance. Thanks!

image

HassanHotait commented 1 year ago

image

HassanHotait commented 1 year ago

It's hard to predict absolute depth in meters with DNNs. Instead, the network predicts a relative depth (a unitless scalar), similar to the normalized depth maps obtained from stereo vision. This relative depth is then converted to absolute depth in meters by applying an appropriate scale and offset.
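A minimal sketch of that conversion, assuming depth_ref is stored as an (offset, scale) pair and that the decode is a plain affine map. The numbers below are what I believe the repository's KITTI config uses; treat them as an assumption and check your own config. The function name `decode_depth` is hypothetical.

```python
# Assumed (offset, scale) in meters for KITTI; verify against your config.
DEPTH_REF = (28.01, 16.32)

def decode_depth(depth_offset, depth_ref=DEPTH_REF):
    """Map the unitless network output to absolute depth in meters."""
    offset, scale = depth_ref
    return depth_offset * scale + offset

# A network output of 0 decodes to the offset itself (the mean depth),
# and each unit of output moves the depth by one scale (one std).
print(decode_depth(0.0))   # depth at the dataset mean
print(decode_depth(-1.0))  # one std closer to the camera
```

With this form, an output of 0 corresponds to an object at the mean training depth, so the network only has to learn deviations from it.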

In my opinion, this is the limitation of this algorithm: the scale and offset are calculated from the official KITTI dataset. If you compute the mean and standard deviation of the object depths over the complete training set, you will get the values mentioned in the paper.
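As a rough sketch of how such statistics could be gathered for your own data, assuming KITTI-format label lines (where the 14th field is the object's z location in camera coordinates) and a population standard deviation. The helper names and the sample label line are made up for illustration; I have not checked whether the authors used population or sample std.

```python
import statistics

def depth_from_label_line(line):
    """Return the camera-frame z (depth, meters) from one KITTI label line."""
    fields = line.split()
    # KITTI layout: type, truncated, occluded, alpha, bbox(4),
    # dimensions(3), location(3: x, y, z), rotation_y
    return float(fields[13])

def depth_reference(object_depths):
    """Return (mean, std) of object depths, usable as (offset, scale)."""
    mu = statistics.fmean(object_depths)
    sigma = statistics.pstdev(object_depths)  # population std (assumption)
    return mu, sigma

# Illustrative label line, not taken from the real dataset.
sample = "Car 0.00 0 -1.58 587.01 173.33 614.12 200.12 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59"
print(depth_from_label_line(sample))

# Toy depths standing in for all objects in a training set.
print(depth_reference([10.0, 20.0, 30.0, 40.0]))
```

Running this over every object in your own training labels would give a dataset-specific pair to use in place of the KITTI values.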

The algorithm uses a fixed scale and offset for all images, so naturally it will not generalize to other scenarios with good enough accuracy.

Yes, these values also affect the 3D and 2D box locations.
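To see why, here is a simplified back-projection sketch. It assumes an ideal pinhole camera with intrinsics fx, fy, cx, cy taken from K, and that (u, v) is the projected 3D center already in original image pixels (keypoint plus predicted offset). It ignores the down-sampling ratio and any augmentation transform that the actual code has to undo, and the function name `backproject` is hypothetical.

```python
def backproject(u, v, z, fx, fy, cx, cy):
    """Recover the camera-frame 3D point from pixel (u, v) and depth z."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return x, y, z

# Toy intrinsics and pixel, chosen only to keep the arithmetic simple.
fx = fy = 1000.0
cx, cy = 500.0, 200.0

print(backproject(600.0, 250.0, 20.0, fx, fy, cx, cy))
# Same pixel with a 10% larger depth: x and y grow by 10% as well.
print(backproject(600.0, 250.0, 22.0, fx, fy, cx, cy))
```

Because x and y are both multiplied by z, any error in the decoded depth (including one caused by a badly chosen scale or offset) shifts the whole 3D location, and with it the projected 2D box.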

Anyone, including the author or people who have experimented with this, can correct me if I'm wrong; this is my observation after a couple of months investigating SMOKE.