tianweiy / CenterPoint

MIT License

Any guidance for the usage of the second stage? #97

Closed Son-Goku-gpu closed 3 years ago

Son-Goku-gpu commented 3 years ago

Hi @tianweiy, great work. For nuScenes, I found the two_stage.py file for the second-stage backbone, but I couldn't find the corresponding network config file, data processing, evaluation code, or calling function, nor any introduction in the readme. Could you show how to run second-stage training and evaluation with your codebase? Thanks!

tianweiy commented 3 years ago

two_stage basically doesn't work for nuScenes; this is mentioned in our paper:

Two-stage refinement does not bring an improvement over the single-stage CenterPoint model on nuScenes in our experiments. We think the reason is that the nuScenes dataset uses a 32-lane lidar, which produces about 30k lidar points per frame, about 1/6 of the number of points in the Waymo dataset, which limits the potential improvements of two-stage refinement. Similar results have been observed in previous two-stage methods like PointRCNN [45] and PV-RCNN [44].

I don't know the exact reason, but to my knowledge I haven't seen any paper that achieves an improvement with sampling-based two-stage refinement on nuScenes.

tianweiy commented 3 years ago

We mainly evaluate two-stage approaches on Waymo. Are you interested in this part?

Son-Goku-gpu commented 3 years ago

@tianweiy I noticed the second stage is similar to another work (BorderDet, ECCV 2020, Megvii); I'm not sure, but I guess BorderDet may have inspired the further improvement. Since the second stage doesn't work on nuScenes: (1) do you think this is related to the complex backbone network? I saw you used a ResNet-like SparseConvNet structure and a very deep neck compared with SECOND in Det3D, so it may be hard to extract the low-level geometric features needed for refinement. (2) How are the results with SECOND's SparseConvNet and neck? (3) May I ask how much time and how many GPUs a complete training of CenterPoint on the train split takes? (4) With so many samples, do you build a subset (like PV-RCNN subsampling 20% of the data on Waymo) to speed up validating an idea? How well does that work? Thanks!

tianweiy commented 3 years ago

do you think this is related to the complex backbone network? I saw you used a ResNet-like SparseConvNet structure and a very deep neck compared with SECOND in Det3D, so it may be hard to extract the low-level geometric features needed for refinement

Yeah, but PointPillars also doesn't work on nuScenes and it is a super simple pipeline, so...

how are the results with SECOND's SparseConvNet and neck

We just use CBGS's backbone. I haven't tried SECOND's backbone, but according to the CBGS paper it is a few points lower.

may I ask how much time and how many GPUs a complete training of CenterPoint on the train split takes

nuScenes or Waymo? For nuScenes it is 4 GPUs for one day for PP, two days for VoxelNet. Waymo is quite long: maybe 3 days for PP and 4 days or so for VoxelNet. We only tried full-dataset training before the submission. As PP on Waymo is drastically worse than VoxelNet, I mainly use VoxelNet after the submission.

With so many samples, do you build a subset (like PV-RCNN subsampling 20% of the data on Waymo) to speed up validating an idea?

As I said, we only tried full-dataset training before the submission. After the deadline, I tried fewer training epochs: https://github.com/tianweiy/CenterPoint/tree/master/configs/waymo#ablations-for-training-schedule

It takes 28 hours for 12 epochs and is 0.8 mAPH worse than the single-stage baseline with 36 epochs. The second stage is currently quite slow to train, but I guess if you precompute and save those features and then only train the few FC layers, it will take an hour or so. I will try to support this in the near future.
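The precompute-and-cache idea can be sketched like this (illustrative numpy only; the shapes, targets, and training loop here are made up for the sketch, not CenterPoint's actual code): the frozen first stage would write pooled per-proposal features to disk once, after which only a tiny FC head is optimized.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for what the frozen first stage would precompute and save:
# pooled per-proposal features (N, C) and IoU-style confidence targets (N,).
cached_feats = rng.normal(size=(512, 64))
iou_targets = rng.uniform(size=512)

# Tiny FC head trained alone -- the expensive backbone never runs again.
W1 = rng.normal(scale=0.1, size=(64, 32)); b1 = np.zeros(32)
W2 = rng.normal(scale=0.1, size=(32, 1));  b2 = np.zeros(1)

lr, losses = 1e-2, []
for _ in range(200):
    h = np.maximum(cached_feats @ W1 + b1, 0.0)   # ReLU hidden layer
    pred = (h @ W2 + b2).squeeze(-1)              # predicted confidence
    err = pred - iou_targets
    losses.append(float(np.mean(err ** 2)))
    # manual backprop of the L2 loss through the two layers
    g_pred = (2.0 / len(err)) * err[:, None]
    gW2, gb2 = h.T @ g_pred, g_pred.sum(0)
    g_h = (g_pred @ W2.T) * (h > 0)
    gW1, gb1 = cached_feats.T @ g_h, g_h.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2
```

Because the backbone forward pass is skipped entirely, each epoch here touches only a few thousand small matrix multiplies, which is where the "an hour or so" estimate would come from.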

Someone told me that a 1/5 subsample works better. I will explore more training-schedule details after the ICCV submission. We have plans to improve the codebase, figure out good schedules, and make Waymo doable for the general public.

tianweiy commented 3 years ago

the second stage is similar to another work (BorderDet, ECCV 2020, Megvii); I'm not sure, but I guess BorderDet may have inspired the further improvement

I know this paper, but I think the 3D setting is a bit different. The message we want to convey is that features are actually not that important (dense sampling, BorderDet, PV-RCNN, etc.). It seems that, at least on Waymo, a few BEV features (like 5 points in our case) can already bring significant improvements, and extra features don't help much. The main improvement comes from positive/negative sampling in two-stage training and the IoU prediction branch.
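For intuition, here is a minimal numpy sketch of that kind of five-point BEV sampling (the helper names, box encoding, and pixel-coordinate convention are assumptions for illustration, not CenterPoint's actual implementation): features are bilinearly interpolated from the BEV map at the box center and the centers of the four faces.

```python
import numpy as np

def bilinear_sample(feat, x, y):
    """Bilinearly interpolate feat (C, H, W) at continuous pixel coords (x, y)."""
    _, H, W = feat.shape
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    x0, y0 = max(x0, 0), max(y0, 0)
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * feat[:, y0, x0]
            + wx * (1 - wy) * feat[:, y0, x1]
            + (1 - wx) * wy * feat[:, y1, x0]
            + wx * wy * feat[:, y1, x1])

def five_point_features(feat, cx, cy, l, w, yaw):
    """Sample BEV features at the box center and the four face centers."""
    c, s = np.cos(yaw), np.sin(yaw)
    # offsets in the box frame: center, front, back, left, right faces
    offsets = np.array([[0, 0], [l / 2, 0], [-l / 2, 0], [0, w / 2], [0, -w / 2]])
    pts = offsets @ np.array([[c, s], [-s, c]]) + np.array([cx, cy])
    return np.concatenate([bilinear_sample(feat, px, py) for px, py in pts])
```

With C channels this yields a 5*C feature per proposal, which is the small fixed-size input the refinement FC layers would consume.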

Son-Goku-gpu commented 3 years ago

@tianweiy That's great! Hope to discuss more after ICCV deadline. Thanks for your time and detailed explanation!

tianweiy commented 3 years ago

Closing for now. Feel free to create another issue or send me an email to discuss the second stage.

XuyangBai commented 3 years ago

the second stage is similar to another work (BorderDet, ECCV 2020, Megvii); I'm not sure, but I guess BorderDet may have inspired the further improvement

I know this paper, but I think the 3D setting is a bit different. The message we want to convey is that features are actually not that important (dense sampling, BorderDet, PV-RCNN, etc.). It seems that, at least on Waymo, a few BEV features (like 5 points in our case) can already bring significant improvements, and extra features don't help much. The main improvement comes from positive/negative sampling in two-stage training and the IoU prediction branch.

Hi @tianweiy, as you said the main improvement comes from pos/neg sampling and the IoU prediction branch, have you tried strategies other than randomly sampling 128 boxes with a 1:1 ratio? Would more complicated label-assignment strategies such as ATSS or AutoAssign help?

tianweiy commented 3 years ago

Actually, I changed my mind a little bit. I recently played around with TuSimple's LiDAR R-CNN paper and it can still give 1 mAP on top of my two-stage result. So both the features and what I mentioned above matter. I have not tried other assignment schemes (to my knowledge ATSS/AutoAssign mainly deal with first-stage assignment?).
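LiDAR R-CNN refines each proposal from the raw points inside a (slightly enlarged) box rather than from pooled BEV features. A rough numpy sketch of that cropping-and-canonicalization step, with an illustrative box encoding and function name, might look like:

```python
import numpy as np

def crop_and_canonicalize(points, box, enlarge=1.0):
    """Gather lidar points inside an (optionally enlarged) 3D proposal box
    and express them in the box's canonical frame.

    points: (N, 3) xyz in world coordinates.
    box: (cx, cy, cz, l, w, h, yaw) -- an illustrative encoding."""
    cx, cy, cz, l, w, h, yaw = box
    c, s = np.cos(yaw), np.sin(yaw)
    # rotate by -yaw around z so the box's length axis aligns with local x
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    local = (points - np.array([cx, cy, cz])) @ rot
    mask = ((np.abs(local[:, 0]) <= l * enlarge / 2)
            & (np.abs(local[:, 1]) <= w * enlarge / 2)
            & (np.abs(local[:, 2]) <= h * enlarge / 2))
    return local[mask]
```

The canonicalized point set would then be fed to a small PointNet-style network per proposal, which is why this style of refinement can stack on top of BEV-feature refinement.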

XuyangBai commented 3 years ago

Thanks a lot, I will try LiDAR R-CNN later. I think I mixed up label assignment and the sampling strategy for your second stage, since they look similar (choosing pos and neg IoU thresholds to define positive and negative proposals). Such manually chosen IoU thresholds are not needed in one-stage CenterNet or CenterPoint, which is one of CenterPoint's advantages. So why did you decide to pick pos/neg proposals by IoU instead of a simpler strategy, such as deciding by the first stage's heatmap? Is the 1:1 ratio of pos and neg proposals necessary?

tianweiy commented 3 years ago

I just follow PV-RCNN's strategy and code for this part. I haven't had time to tune it yet.

XuyangBai commented 3 years ago

Thanks a lot for the discussion!

hughlee815 commented 3 years ago

We mainly evaluate two-stage approaches on Waymo. Are you interested in this part?

Yes!! I was trying to train the two-stage model with `python -m torch.distributed.launch --nproc_per_node=8 configs/waymo/voxelnet/two_stage/waymo_centerpoint_voxelnet_two_sweep_two_stage_bev_5point_ft_6epoch_freeze_with_vel.py`, but nothing happens except `Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.` and then the program exits?

tianweiy commented 3 years ago

"then nothing happens" <- not sure what this is; I've never seen it.

please paste the full log from start to finish.

turboxin commented 3 years ago

Actually, I changed my mind a little bit. I recently played around with TuSimple's LiDAR R-CNN paper and it can still give 1 mAP on top of my two-stage result. So both the features and what I mentioned above matter. I have not tried other assignment schemes (to my knowledge ATSS/AutoAssign mainly deal with first-stage assignment?).

Hi @tianweiy, I'm wondering which setting from LiDAR R-CNN you are using to get the 1 mAP gain. Thanks a lot!

zehuichen123 commented 3 years ago

Actually, I changed my mind a little bit. I recently played around with TuSimple's LiDAR R-CNN paper and it can still give 1 mAP on top of my two-stage result. So both the features and what I mentioned above matter. I have not tried other assignment schemes (to my knowledge ATSS/AutoAssign mainly deal with first-stage assignment?).

Hi @tianweiy, I'm also wondering: do you use LiDAR R-CNN to regress box attributes such as velocity, or simply to refine the box coordinates?