Owen-Liuyuxuan / visualDet3D

Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving / YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection
https://owen-liuyuxuan.github.io/papers_reading_sharing.github.io/3dDetection/GroundAwareConvultion/
Apache License 2.0
361 stars, 76 forks

Support for mono 3D random resize augmentation #72

Open ZCMax opened 1 year ago

ZCMax commented 1 year ago

Have you ever tried random resize augmentation for monocular 3D object detection?

Owen-Liuyuxuan commented 1 year ago

Do you mean "RandomWarpAffine", which resizes and crops the image to a random size and then pads it to the target shape?

https://github.com/Owen-Liuyuxuan/visualDet3D/blob/15d4c376bd84eb33c003a59e4bb52bf1aa7fdf08/visualDet3D/data/pipeline/stereo_augmentator.py#L440
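
To make the resize, crop, and pad idea concrete, here is a minimal NumPy sketch. It is not the repository's RandomWarpAffine implementation: the function name `random_resize_pad`, the `scale_range` default, the nearest-neighbour resize, and the top-left crop/pad convention are all assumptions chosen to keep the example short.

```python
import numpy as np

def random_resize_pad(image, target_hw, scale_range=(0.6, 1.4), rng=None):
    """Hypothetical helper: resize by a random scale, then crop or zero-pad to target_hw."""
    rng = np.random.default_rng() if rng is None else rng
    scale = float(rng.uniform(*scale_range))
    h, w = image.shape[:2]
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))

    # Nearest-neighbour resize by index lookup (a real pipeline would interpolate).
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = image[rows][:, cols]

    # Keep the top-left corner fixed: crop what overflows, zero-pad what is missing.
    th, tw = target_hw
    out = np.zeros((th, tw) + image.shape[2:], dtype=image.dtype)
    ch, cw = min(th, new_h), min(tw, new_w)
    out[:ch, :cw] = resized[:ch, :cw]
    return out, scale

image = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)
augmented, scale = random_resize_pad(image, (4, 6), rng=np.random.default_rng(0))
```

The returned `scale` matters because every geometric label (2D boxes, calibration, depth maps) has to be updated with the same factor.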

This augmentation requires the method to be robust to varying camera intrinsics, and it is very important for CenterNet-based methods.
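
As a rough illustration of why the intrinsics vary: resizing an image by a factor s is equivalent to scaling the first two rows of the 3x4 projection matrix by s. The sketch below checks this numerically. The helper names and the matrix values are illustrative assumptions (KITTI-like magnitudes, not taken from a real calibration file), and the half-pixel offset of pixel centres is ignored.

```python
import numpy as np

def scale_intrinsics(P, sx, sy):
    """Projection matrix that matches an image resized by (sx, sy)."""
    return np.diag([sx, sy, 1.0]) @ P

def project(P, pts):
    """Project Nx3 camera-frame points to pixel coordinates with a 3x4 matrix."""
    homo = np.hstack([pts, np.ones((len(pts), 1))])
    uvw = homo @ P.T
    return uvw[:, :2] / uvw[:, 2:3]

# Illustrative projection matrix and 3D points (assumed values).
P2 = np.array([[720.0, 0.0, 610.0, 45.0],
               [0.0, 720.0, 175.0, 0.2],
               [0.0, 0.0, 1.0, 0.003]])
pts = np.array([[1.0, 1.5, 20.0],
                [-3.0, 1.6, 35.0]])

s = 0.8
uv = project(P2, pts)
uv_resized = project(scale_intrinsics(P2, s, s), pts)
```

Here `uv_resized` equals `s * uv`: the 3D points are unchanged, and only the camera model differs. A network that never sees the calibration has to cope with this change implicitly.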

ZCMax commented 1 year ago

Yes. What confuses me is that when an image is resized, not only the 2D bounding boxes but also the 3D labels, such as depth, seem to need adjusting. I have tried multi-scale training on several mono3D models, and it resulted in a performance drop.

Owen-Liuyuxuan commented 1 year ago

Yes. If you read the code, you will see that the depth images are resized along with the main images.

However, the main problem is that many mono3D models predict depth directly, or, like GAC, use a fixed prior for anchors. Depth prediction will fail for these methods when the image scale changes.
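
A small pinhole-camera calculation shows the ambiguity. Under the assumption of a fixed focal length, an object of height H at depth Z spans f * H / Z pixels, so after resizing by s it looks exactly like the same object at depth Z / s. The function names and numbers below are illustrative assumptions, not code from the repository.

```python
import math

def apparent_height_px(focal_px, height_m, depth_m):
    """Pixel height of an object under a simple pinhole model."""
    return focal_px * height_m / depth_m

def equivalent_depth(depth_m, scale):
    """Depth at which the object would look the same under the original focal length."""
    return depth_m / scale

focal = 720.0   # assumed focal length in pixels
height = 1.5    # assumed object height in metres
depth = 30.0    # true depth in metres
scale = 1.25    # image resize factor

h_resized = scale * apparent_height_px(focal, height, depth)
z_equiv = equivalent_depth(depth, scale)
assert math.isclose(apparent_height_px(focal, height, z_equiv), h_resized)
```

A model that regresses depth from appearance alone, or relies on depth priors tied to one focal length, would therefore need either labels rescaled to Z / s or an explicit calibration input to remain consistent under this augmentation.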

ZCMax commented 1 year ago

Yes, that's the issue we need to consider.