Owen-Liuyuxuan / visualDet3D

Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving / YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection
https://owen-liuyuxuan.github.io/papers_reading_sharing.github.io/3dDetection/GroundAwareConvultion/
Apache License 2.0

Why divided by 16 when loading disparity? #63

Closed leeyegy closed 2 years ago

leeyegy commented 2 years ago

https://github.com/Owen-Liuyuxuan/visualDet3D/blob/master/visualDet3D/data/kitti/dataset/stereo_dataset.py#L121


I have no idea why the disparity has to be divided by 16. After the division, does the disparity map correspond to the original input size or to 1/16 of the input size?

Owen-Liuyuxuan commented 2 years ago

That is related to how the disparity is computed and stored by OpenCV.

The disparity is computed here. If you check the documentation and read some tutorials, you will find that OpenCV returns int16(disparity * 16). That is where the "16" comes from.
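A minimal NumPy sketch of this fixed-point convention (using a synthetic disparity map, not the repo's actual data; OpenCV's `StereoSGBM`/`StereoBM` return disparity this way):

```python
import numpy as np

# Synthetic "true" sub-pixel disparity values, in pixels.
true_disp = np.array([[10.25, 32.5], [0.0, 64.75]], dtype=np.float32)

# OpenCV stores disparity as int16 fixed-point, scaled by 16,
# which preserves 1/16-pixel precision in an integer type.
disp_int16 = (true_disp * 16).astype(np.int16)

# Dividing by 16.0 when loading recovers the sub-pixel disparity in pixels.
recovered = disp_int16.astype(np.float32) / 16.0
assert np.allclose(recovered, true_disp)
```

So the division by 16 only undoes the fixed-point encoding; it does not change the spatial resolution of the map.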

If you use lidar points, the computation happens here. The lidar-based computation also multiplies by 16, to stay consistent with OpenCV's convention.

BTW, in YOLOStereo3D the disparity loss is computed at 1/4 scale (hence we spatially downsample the disparity ground truth by 4 during precomputation), but the disparity values themselves stay at the original image scale.
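To illustrate the distinction between spatial downsampling and the scale of the values, here is a hypothetical sketch (the variable names and the strided-sampling scheme are illustrative, not the repo's actual precomputation code):

```python
import numpy as np

# Hypothetical full-resolution disparity ground truth; values are
# disparities in pixels of the ORIGINAL image.
full_res_disp = np.arange(64, dtype=np.float32).reshape(8, 8)

# Spatially downsample by 4 (simple strided sampling here) so the
# ground truth matches a 1/4-scale feature map used by the loss.
quarter_res_disp = full_res_disp[::4, ::4]

# The spatial resolution shrinks from 8x8 to 2x2, but the disparity
# VALUES are left at the original scale -- they are not divided by 4.
assert quarter_res_disp.shape == (2, 2)
```

The point is that only the grid is shrunk; a disparity of, say, 36 pixels in the original image is still stored as 36 in the 1/4-scale ground truth.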

leeyegy commented 2 years ago

Thanks a lot for your quick reply~