Cc-Hy / CMKD

Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection (ECCV 2022 Oral)
Apache License 2.0

Question about inference (Lidar camera calibration) #11

Open alright1117 opened 1 year ago

alright1117 commented 1 year ago

Hi, thanks for sharing the nice source code.

Since the bounding boxes and image voxel features generated by the model are in LiDAR coordinates, should the camera pose relative to the LiDAR be fixed during inference? In other words, should the LiDAR-camera extrinsic parameters used in inference be the same as in training?

Cc-Hy commented 1 year ago

@alright1117 Hello, the calibration matrices (intrinsic and extrinsic) are sample-specific and provided by these datasets. That is, each sample, in both training and testing, has its own calibration matrices, which are loaded together with the sample data. Taking KITTI as an example, the calibration files are stored in data/kitti/testing(training)/calib. So the calibration matrices are not fixed; we load them from files during both training and inference.
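For reference, a minimal sketch of how such a per-sample calibration file could be parsed (this follows the standard KITTI calib layout; `parse_calib` is my own illustrative helper, not part of this repo):

```python
import numpy as np

def parse_calib(calib_path):
    """Parse a KITTI-style calib file into named matrices.

    Each line looks like 'P2: v0 v1 ... v11', with values in row-major order.
    """
    mats = {}
    with open(calib_path, 'r') as f:
        for line in f:
            if ':' not in line:
                continue
            key, vals = line.split(':', 1)
            mats[key.strip()] = np.array(vals.split(), dtype=np.float32)
    P2 = mats['P2'].reshape(3, 4)                           # camera projection (intrinsic)
    Tr_velo_to_cam = mats['Tr_velo_to_cam'].reshape(3, 4)   # LiDAR-to-camera extrinsic
    return P2, Tr_velo_to_cam

# e.g. P2, Tr = parse_calib('data/kitti/training/calib/000000.txt')
```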

alright1117 commented 1 year ago

@Cc-Hy Thanks for your reply. Could the models, DeepLabV3 and SECOND, be trained in camera coordinates? According to the source code and your paper, the BEV feature maps and 3D bounding boxes are transformed into LiDAR coordinates. This is not suitable for my application, since the camera pose relative to the LiDAR can change at inference time and the extrinsic parameters cannot be estimated. Maybe training the models in camera coordinates could solve this problem?

Cc-Hy commented 1 year ago

@alright1117 I think the answer is yes, but some modifications are needed. In the training stage, the extrinsic is still required, but you would now use it to transform the LiDAR data into camera coordinates, so that the BEV features of both the image branch and the LiDAR branch are in camera coordinates. It also seems you would need to re-train the LiDAR-based teacher model in this setting. In the inference stage, the extrinsic is not needed, but you still need the intrinsic.

The above is a theoretical analysis; in practice some other things may need to be changed, such as the anchor settings in the detection head, because these were originally defined in LiDAR coordinates and may need to be adjusted as well.
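To make the transform concrete, here is a hedged numpy sketch of the coordinate change described above, assuming a KITTI-style 3x4 `Tr_velo_to_cam` extrinsic (the actual pipeline may additionally apply the rectification matrix `R0_rect`, omitted here):

```python
import numpy as np

def lidar_to_camera(points, Tr_velo_to_cam):
    """Map LiDAR points of shape (N, 3) into camera coordinates.

    Tr_velo_to_cam is the 3x4 LiDAR-to-camera extrinsic as stored in
    KITTI calib files; rectification is omitted for brevity.
    """
    n = points.shape[0]
    pts_h = np.hstack([points, np.ones((n, 1), dtype=points.dtype)])  # (N, 4) homogeneous
    return pts_h @ Tr_velo_to_cam.T  # (N, 3) in camera coordinates
```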

alright1117 commented 1 year ago

@Cc-Hy Thanks for your detailed answer.

Cc-Hy commented 1 year ago

@alright1117 It seems there is a simple way to use it directly during inference: manually set the extrinsic to

$$
\begin{pmatrix}
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
1 & 0 & 0 & 0
\end{pmatrix}
$$

which simply permutes the camera axes into the LiDAR convention, so you effectively work in camera coordinates only and do not need a real extrinsic matrix.
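As a quick sanity check (my own numpy illustration, not code from the repo): this fixed matrix is exactly the KITTI-style LiDAR-to-camera axis permutation with zero translation, i.e. it defines a virtual LiDAR frame that sits at the camera itself:

```python
import numpy as np

# Pure axis permutation, zero translation: a virtual LiDAR frame at the camera.
# LiDAR convention: x forward, y left, z up; camera: x right, y down, z forward.
T_fixed = np.array([[0., -1.,  0., 0.],
                    [0.,  0., -1., 0.],
                    [1.,  0.,  0., 0.]])

# A point 10 m straight ahead in the virtual LiDAR frame lands 10 m along
# the camera's forward (z) axis, confirming the axes line up as intended.
p_lidar = np.array([10., 0., 0., 1.])
print(T_fixed @ p_lidar)  # [ 0.  0. 10.]
```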

alright1117 commented 1 year ago

@Cc-Hy Thank you! Could you share the script for training the teacher model?

Cc-Hy commented 1 year ago

@alright1117 Hi, to train a LiDAR-based teacher model, use

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 train.py --launcher pytorch --cfg ${CONFIG_FILE} --tcp_port 16677 

Also, this repo is compatible with OpenPCDet from MMLab, and you can follow the official instructions to find out more.

alright1117 commented 1 year ago

@Cc-Hy Hi, thanks for your reply.

I got an error when training the teacher model with config second_teacher.yaml:

```
Traceback (most recent call last):
  File "train.py", line 200, in <module>
    main()
  File "train.py", line 152, in main
    train_model(
  File "/home/alright/CMKD/tools/train_utils/train_utils.py", line 111, in train_model
    accumulated_iter = train_one_epoch(
  File "/home/alright/CMKD/tools/train_utils/train_utils.py", line 47, in train_one_epoch
    loss, tb_dict, disp_dict = model_func(model, batch)
  File "/home/alright/CMKD/tools/../pcdet/models/__init__.py", line 42, in model_func
    ret_dict, tb_dict, disp_dict = model(batch_dict)
  File "/home/alright/miniconda3/envs/CMKDK/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/alright/CMKD/tools/../pcdet/models/detectors/second_net.py", line 14, in forward
    loss, tb_dict, disp_dict = self.get_training_loss()
  File "/home/alright/CMKD/tools/../pcdet/models/detectors/second_net.py", line 27, in get_training_loss
    loss_rpn, tb_dict = self.dense_head.get_loss()
  File "/home/alright/CMKD/tools/../pcdet/models/dense_heads/anchor_head_single_qfl.py", line 143, in get_loss
    cls_loss, tb_dict = self.get_cls_layer_loss()
  File "/home/alright/CMKD/tools/../pcdet/models/dense_heads/anchor_head_single_qfl.py", line 97, in get_cls_layer_loss
    'rpn_loss_cls': cls_loss.item()
RuntimeError: CUDA error: device-side assert triggered
```

When I replace QualityFocalLoss with QualityFocalLoss_no_reduction in anchor_head_single_qfl.py, the script works fine. I'm not sure whether this is the correct way to fix the bug?

Cc-Hy commented 1 year ago

This error occurs because the input is not passed through the sigmoid function, which is a small bug here. Using QualityFocalLoss_no_reduction or applying torch.sigmoid() to the input can fix this problem.
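For illustration, a self-contained sketch of the sigmoid fix (the names here are mine, not the repo's; the point is only that a loss built on plain BCE expects probabilities in [0, 1], and feeding it raw logits triggers the device-side assert above):

```python
import torch
import torch.nn.functional as F

def qfl_style_loss_safe(logits, targets):
    """Squash raw logits with sigmoid before a BCE-based quality focal loss."""
    probs = torch.sigmoid(logits)  # map logits into [0, 1]
    return F.binary_cross_entropy(probs, targets)

# Raw logits far outside [0, 1] are fine once sigmoid is applied first.
logits = torch.randn(4, 3) * 5
targets = torch.rand(4, 3)
print(qfl_style_loss_safe(logits, targets))
```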

xiaoxusanheyi commented 1 year ago

> Also, this repo is compatible with OpenPCDet from MMLab, and you can ...

Hi author, as long as a LiDAR detector is based on pcdet, can I train it to completion and use it as the pre-trained teacher for CMKD?