antoniskef opened 1 year ago
Hi, we pre-trained PointPillars and CenterPoint individually.
May I ask how long it takes to train PointPillars and CenterPoint scene flow backbone? Thanks!
Hi, it takes around one day for PointPillars on one 3090; CenterPoint will take longer.
Thank you so much! One more question regarding training the CenterPoint scene flow backbone on nuScenes: I don't see any config file ending with 'flow' here: https://github.com/emecercelik/ssl-3d-detection/tree/master/mmdetection3d/mmdetection3d/configs/centerpoint. Am I missing something? Thanks!
Hi, we did not provide code for CenterPoint training, sorry about that. However, you can adapt the code following the PointPillars setup.
I see. Thanks for your reply! Can you elaborate a bit more on which files I need to modify? From my understanding, I need to replace pillar_encoder.py with the scene flow backbone. Is there anything else I need to modify? Thanks.
In addition, if I want to train the PointPillars-based scene flow backbone, do I need to change max_epochs from 24 to 4 here: https://github.com/emecercelik/ssl-3d-detection/blob/e605ad616278cdd7cb0c6cd5b8479c8c3921c158/mmdetection3d/mmdetection3d/configs/_base_/schedules/schedule_2x.py#L14? I ask because, in the paper, the scene flow auxiliary task is trained for 4 epochs. Thanks again!
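For reference, the change I have in mind would look roughly like the following override config (illustrative only; it assumes mmdetection3d's `_base_` inheritance mechanism, and the file path is hypothetical):

```python
# Hypothetical override config for the scene-flow pre-training stage.
# Inherit the default 2x schedule, then shrink the epoch budget.
_base_ = ['../_base_/schedules/schedule_2x.py']

# The paper trains the scene-flow auxiliary task for 4 epochs,
# so override the max_epochs=24 inherited from schedule_2x.py.
runner = dict(type='EpochBasedRunner', max_epochs=4)
```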
You need to modify pillar_encoder.py and use the SSL cycle loss to calculate the loss.

Thanks for your comments! I will try that.
@MingyuLiu1 I have one question regarding the initialization of 3D detectors (i.e., PointPillars) from the scene flow training: the subsampled point features used for the cycle_loss calculation during scene flow training come from pts_voxel_encoder, as here: https://github.com/emecercelik/ssl-3d-detection/blob/e605ad616278cdd7cb0c6cd5b8479c8c3921c158/mmdetection3d/mmdetection3d/mmdet3d/models/detectors/mvx_two_stage.py#L204
So are only the pts_voxel_encoder weights of PointPillars (i.e., HardVFE) initialized with the pre-trained weights from the self-supervised scene flow training? In other words, the weights of middle_encoder/pts_backbone are not initialized from scene flow training? Please correct me if I am wrong. Thanks for your clarification.
Yes, exactly. Because after the middle_encoder/pts_backbone, the features are not point-wise features anymore. In other words, it is not suitable to combine these features with the coordinates of points to calculate scene flow. According to our experiments, utilizing the features after middle_encoder/pts_backbone hurts the final detector performance. However, for the PointGNN model, because it always focuses on the point features, the whole backbone can be used for SSL scene flow training. Hope the explanation is clear :))
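Concretely, that initialization pattern can be sketched as follows. The key names follow mmdetection3d's naming conventions, but the checkpoint contents below are invented for illustration: only keys under the `pts_voxel_encoder.` prefix are taken from the scene-flow checkpoint, while middle_encoder/pts_backbone keep their random init.

```python
# Sketch: initialize only the detector's voxel (pillar) feature encoder
# from a scene-flow pre-training checkpoint.

def filter_voxel_encoder_weights(flow_state_dict, prefix='pts_voxel_encoder.'):
    """Keep only the weights belonging to the voxel feature encoder."""
    return {k: v for k, v in flow_state_dict.items() if k.startswith(prefix)}

# Invented stand-in for a scene-flow checkpoint's state_dict.
flow_ckpt = {
    'pts_voxel_encoder.vfe_layers.0.linear.weight': 'W_vfe',
    'pts_backbone.blocks.0.0.weight': 'W_backbone',  # deliberately dropped
}
init_weights = filter_voxel_encoder_weights(flow_ckpt)
print(sorted(init_weights))  # only the pts_voxel_encoder key survives
```

In practice the filtered dict would then be loaded into the detector with something like `model.load_state_dict(init_weights, strict=False)`, so the remaining modules stay at their default initialization.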
Hi, I would like to ask whether you use the same weights (from the PointPillars scene flow pre-training) in both the PointPillars and CenterPoint object detection downstream tasks. If yes, how is that possible, since they use different voxel encoders (HardVFE vs. PillarFeatureNet) and different necks (FPN vs. SECONDFPN)? Thank you very much.
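For context, my understanding is that a pre-trained tensor can only be copied directly when the same key exists in both models with the same shape, which is what makes the HardVFE vs. PillarFeatureNet mismatch a problem. A small sketch (module names and shapes below are invented examples, not the actual checkpoints):

```python
# Sketch: find which weights are directly transferable between two
# detectors by matching state_dict keys and tensor shapes.

def transferable_keys(src, dst):
    """Keys present in both state dicts with identical shapes."""
    return [k for k, shape in src.items() if dst.get(k) == shape]

# Invented example: the two encoders name their layers differently,
# so no key matches and nothing can be reused without remapping.
hardvfe = {'pts_voxel_encoder.vfe_layers.0.linear.weight': (64, 10)}
pillarnet = {'pts_voxel_encoder.pfn_layers.0.linear.weight': (64, 10)}
print(transferable_keys(hardvfe, pillarnet))  # prints []
```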