Question about the parameters for train and eval

happinesslz / EPNetV2

EPNet++: Cascade Bi-directional Fusion for Multi-Modal 3D Object Detection (TPAMI-2022)

MIT License

Closed Rolandxx7 closed 1 year ago

Rolandxx7 commented 1 year ago

Hi, @happinesslz

Thanks for your great job!

I have two questions for your work:

1. I found that the parameter "MC_MASK_THRES" in your pretrained model differs from the one in run_train_and_eval_epnet_plus_plus_car.sh. Could you please explain the reason, and which value is more suitable?
2. You said you would transfer EPNet++ to the Waymo dataset, but Waymo differs from KITTI in the number of RGB images: it provides multiple RGB images as input. How are you going to deal with this?

happinesslz commented 1 year ago

@Rolandxx7 Thanks for your attention. 1.) MC_MASK_THRES is used in the MC loss; please refer to formula (4) in Sec. 3.3 of the original paper. When the probabilities of belonging to the foreground at a position are lower than the specified threshold on both modalities, we treat it as a background point, and we do not compute the MC loss for these background points. In general, it is appropriate to set MC_MASK_THRES to 0.2 in a dense scene (64-beam). For a sparser scene, it is necessary to reduce the value of MC_MASK_THRES (e.g., 0.1 or 0.5).
2.) Waymo provides 360-degree LiDAR points and multiple RGB images. The point-wise correspondence between the two modalities can be obtained from the multiple projection matrices provided.
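
The masking rule described in 1.) can be sketched in plain Python. This is not the repository's code: the function names mc_foreground_mask and masked_mean and the list-based inputs are hypothetical, and the per-point loss values are placeholders rather than the MC loss of formula (4). The sketch assumes only the rule stated above, namely that a point is skipped when its foreground probability is below the threshold on both modalities.

```python
def mc_foreground_mask(p_lidar, p_image, thres=0.2):
    """Return one boolean per point: True if the point is kept for the loss.

    A point is treated as background (False) only when BOTH modalities give
    a foreground probability strictly below `thres`.
    """
    return [not (pl < thres and pi < thres) for pl, pi in zip(p_lidar, p_image)]


def masked_mean(per_point_loss, mask):
    """Average a per-point loss over the kept points only (hypothetical helper)."""
    kept = [loss for loss, keep in zip(per_point_loss, mask) if keep]
    return sum(kept) / len(kept) if kept else 0.0


# Illustrative foreground probabilities for four points (made-up numbers).
p_lidar = [0.90, 0.10, 0.15, 0.05]
p_image = [0.80, 0.30, 0.10, 0.60]

mask_dense = mc_foreground_mask(p_lidar, p_image, thres=0.2)
mask_sparse = mc_foreground_mask(p_lidar, p_image, thres=0.1)

print(mask_dense)   # [True, True, False, True]
print(mask_sparse)  # [True, True, True, True]
```

With the lower threshold, the third point is no longer masked out, so more low-confidence points contribute to the loss. This matches the advice to lower the threshold when the point cloud is sparser.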

Rolandxx7 commented 1 year ago

@happinesslz Thanks for your detailed explanation! I have one more question to ask.

"PC_AREA_SCOPE" and "RPN_POST_NMS_TOP_N" defined in PED_EPNet_plus_plus.yaml are different from those in CYC_EPNet_plus_plus.yaml and CAR_EPNet_plus_plus.yaml. Could you please explain the reason for me?

Also, I find that if we use [[-40, 40], [-0.5, 2.5], [0, 70.4]] as the PC_AREA_SCOPE in PED_EPNet_plus_plus.yaml, the number of objects increases, so I would like to know why you chose a different range.
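
The effect of the scope on the object count can be illustrated with a small sketch. It assumes that PC_AREA_SCOPE lists [min, max] ranges for the x, y and z axes and that an object counts when its center lies inside all three ranges. Both are simplifying assumptions, not taken from the repository. The helper names, the object centers, and the narrower scope are made up for illustration; the narrower scope is not the value used in PED_EPNet_plus_plus.yaml.

```python
def in_scope(center, scope):
    """True if a 3D center lies inside every [min, max] range of the scope."""
    return all(lo <= c <= hi for c, (lo, hi) in zip(center, scope))


def count_in_scope(centers, scope):
    """Count how many object centers fall inside the scope."""
    return sum(in_scope(c, scope) for c in centers)


# Scope quoted in the question above.
wide_scope = [[-40, 40], [-0.5, 2.5], [0, 70.4]]
# Hypothetical narrower scope, for comparison only.
narrow_scope = [[-20, 20], [-0.5, 2.5], [0, 48]]

# Made-up object centers (x, y, z).
centers = [
    (3.0, 1.6, 12.0),
    (-25.0, 1.7, 30.0),
    (5.0, 1.5, 60.0),
    (1.0, 1.6, 80.0),
]

print(count_in_scope(centers, wide_scope))    # 3
print(count_in_scope(centers, narrow_scope))  # 1
```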

happinesslz commented 1 year ago

@Rolandxx7 For Pedestrian, we find that setting a small range, as in the official PointPillars, gives satisfactory performance. Besides, increasing the value of RPN_POST_NMS_TOP_N can improve the recall, which is very important for detecting hard Pedestrians.
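
Why a larger RPN_POST_NMS_TOP_N can raise recall can be seen in a toy sketch. It is an illustration under stated assumptions, not the repository's RPN: the function recall_at_top_n is hypothetical, the proposals are assumed to have already survived NMS, and each proposal carries a made-up score plus the id of the ground-truth object it matches (or None for a false positive).

```python
def recall_at_top_n(proposals, num_gt, top_n):
    """Fraction of ground-truth objects covered by the top_n highest-scoring proposals.

    proposals: list of (score, matched_gt_id or None), assumed to be post-NMS.
    """
    kept = sorted(proposals, key=lambda p: p[0], reverse=True)[:top_n]
    covered = {gt_id for _, gt_id in kept if gt_id is not None}
    return len(covered) / num_gt


# Made-up proposals: hard objects (ids 1 and 2) only get low-scoring proposals.
proposals = [
    (0.95, 0),
    (0.90, 0),
    (0.85, None),
    (0.60, 1),
    (0.30, 2),
]

for top_n in (2, 4, 5):
    print(top_n, recall_at_top_n(proposals, num_gt=3, top_n=top_n))
```

In this toy data the hard objects are matched only by low-scoring proposals, so they are recovered only when the cap is large enough to let those proposals through to the next stage.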