rockywind closed this issue 2 years ago.
Hello,
You can take a look at the pointers I gave to run on the Waymo dataset (https://github.com/TRAILab/CaDDN/issues/80), the steps should be the same:
However, the nuScenes dataset only has a 30 beam LiDAR, so I anticipate the generated depth map labels to be quite poor (due to low point density). The method still should somewhat work however.
The nuScenes dataset provides six cameras, but the voxel feature range of CaDDN is 2 m to 46.2 m. Data from the rear camera cannot be projected onto the front voxel feature grid. How can I convert the data from the rear camera to the front camera?
I would recommend sticking with one camera and just evaluating on the front-view camera.
Alternatively, you can generate six frustum features and then form a single voxel grid covering the full 360° range of nuScenes. Just project each voxel center into all frustums and extract any features that exist within the frustum FOV. In most cases, each voxel will fall within only one frustum, but where frustums overlap you can simply average the features from each camera.
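The projection-and-average scheme described above can be sketched roughly as follows. This is illustrative only, not CaDDN's actual implementation: the function name, tensor shapes, and the nearest-neighbor feature sampling are all assumptions.

```python
import numpy as np

def fuse_frustum_features(voxel_centers, cam_intrinsics, cam_extrinsics,
                          frustum_feats, img_size):
    """Average per-camera frustum features for each voxel center (sketch).

    voxel_centers:  (N, 3) points in a shared frame (e.g. ego/LiDAR).
    cam_intrinsics: list of (3, 3) K matrices, one per camera.
    cam_extrinsics: list of (4, 4) shared-frame -> camera transforms.
    frustum_feats:  list of (C, H, W) per-camera feature maps.
    img_size:       (W, H) of the feature maps.
    """
    N = voxel_centers.shape[0]
    C = frustum_feats[0].shape[0]
    fused = np.zeros((N, C), dtype=np.float32)
    hits = np.zeros(N, dtype=np.int32)  # how many frustums each voxel falls in

    homo = np.concatenate([voxel_centers, np.ones((N, 1))], axis=1)  # (N, 4)
    for K, T, feats in zip(cam_intrinsics, cam_extrinsics, frustum_feats):
        cam_pts = (T @ homo.T).T[:, :3]        # voxel centers in camera frame
        in_front = cam_pts[:, 2] > 0           # keep points ahead of the camera
        uv = (K @ cam_pts.T).T
        uv = uv[:, :2] / np.clip(uv[:, 2:3], 1e-6, None)  # perspective divide
        u = np.round(uv[:, 0]).astype(int)
        v = np.round(uv[:, 1]).astype(int)
        valid = (in_front & (u >= 0) & (u < img_size[0])
                 & (v >= 0) & (v < img_size[1]))
        # Nearest-neighbor sample the feature map at each projected voxel center
        fused[valid] += feats[:, v[valid], u[valid]].T
        hits[valid] += 1

    # Average across cameras wherever frustums overlap
    fused[hits > 0] /= hits[hits > 0, None]
    return fused, hits
```

In practice you would likely use bilinear sampling (e.g. `torch.nn.functional.grid_sample`) rather than nearest-neighbor, but the overlap-averaging logic is the same.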
Thank you for your help! I want to transform the points into the camera coordinate system, so that the generated voxel features are based on the camera plane instead of the LiDAR coordinate frame. Then there is no need to expand the range of the voxel RoI.
Sure. You might want to adjust the point-cloud (PC) range to get the best results, but 2 m to 46.2 m should work as a starting point.
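For intuition, the point-cloud range and the voxel size together determine the voxel grid dimensions. A minimal sketch, using the 2 m to 46.2 m depth range from above but with assumed (not CaDDN's actual) lateral/vertical bounds and voxel size:

```python
import numpy as np

# [x_min, y_min, z_min, x_max, y_max, z_max]; y/z bounds and the 0.2 m
# voxel size are illustrative assumptions, not CaDDN's config values.
pc_range = np.array([2.0, -30.0, -3.0, 46.2, 30.0, 1.0])
voxel_size = np.array([0.2, 0.2, 0.2])

# Number of voxels along each axis = extent / voxel size
grid_size = np.round((pc_range[3:] - pc_range[:3]) / voxel_size).astype(int)
print(grid_size)  # → [221 300 20]
```

Changing the PC range for a different camera setup therefore also changes the grid size (and memory cost), which is worth keeping in mind when tuning.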
Thank you for your help!
@rockywind Hello, have you succeeded in your plans?
Hello, I'm trying to use CaDDN to train a nuScenes model, but it seems it cannot be applied directly. Did you manage to implement a nuScenes model with CaDDN? If not, could you please give me some advice on how to train a nuScenes model with CaDDN? As far as I know, the nuScenes dataset has six cameras, but CaDDN only runs on the front camera. Thank you.