SamsungLabs / imvoxelnet

[WACV2022] ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection
MIT License

About gt coordinate in training #47

Closed JunjieLiuSWU closed 2 years ago

JunjieLiuSWU commented 2 years ago

Hello, I found that the GT box coordinates are given in the LiDAR coordinate system, which means the model takes an image as input but predicts objects in LiDAR coordinates. I find this confusing.

filaPro commented 2 years ago

Hi @JunjieLiuSWU ,

I don't quite understand your question. This design choice lets us

1) use the KITTI, nuScenes, ScanNet and SUN RGB-D evaluators from mmdetection3d without changes; 2) use AnchorHead from mmdetection3d for outdoor datasets without changes.

Overall, monocular detection was not supported in mmdetection3d at the time we worked on this paper.
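
For anyone who needs the boxes in the camera frame: the two frames differ only by a rigid transform, so box centers can be moved between them with the calibration extrinsics. Below is a minimal sketch, not code from this repository. The matrix `LIDAR_TO_CAM` and the helper names are hypothetical; the matrix encodes only the usual KITTI-style axis convention (LiDAR: x forward, y left, z up; camera: x right, y down, z forward) with zero translation. In practice you would load the real extrinsics from the dataset's calibration files.

```python
import numpy as np

# Hypothetical 4x4 homogeneous LiDAR-to-camera extrinsic (assumption, not
# taken from the repository). Rotation only: it remaps KITTI-style axes
# (LiDAR x fwd, y left, z up -> camera x right, y down, z fwd).
LIDAR_TO_CAM = np.array([
    [0.0, -1.0,  0.0, 0.0],
    [0.0,  0.0, -1.0, 0.0],
    [1.0,  0.0,  0.0, 0.0],
    [0.0,  0.0,  0.0, 1.0],
])


def lidar_to_camera(points_lidar, lidar_to_cam=LIDAR_TO_CAM):
    """Map (N, 3) points from the LiDAR frame to the camera frame."""
    pts = np.asarray(points_lidar, dtype=float).reshape(-1, 3)
    # Append a column of ones so the 4x4 matrix can apply translation too.
    homo = np.hstack([pts, np.ones((len(pts), 1))])
    return (homo @ lidar_to_cam.T)[:, :3]


def camera_to_lidar(points_cam, lidar_to_cam=LIDAR_TO_CAM):
    """Inverse mapping: camera frame back to the LiDAR frame."""
    return lidar_to_camera(points_cam, np.linalg.inv(lidar_to_cam))


# A box center 10 m ahead, 2 m to the left, 1 m below the sensor.
center_lidar = np.array([[10.0, 2.0, -1.0]])
center_cam = lidar_to_camera(center_lidar)
print(center_cam)  # [[-2.  1. 10.]]
```

Note that this sketch only transforms box centers; box dimensions and the yaw angle also follow frame-specific conventions and need their own conversion.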