Result reproduction in update version.

skyhehe123 / SA-SSD

SA-SSD: Structure Aware Single-stage 3D Object Detection from Point Cloud (CVPR 2020)

Result reproduction in update version. #41

Open SlingHe opened 4 years ago

SlingHe commented 4 years ago

Hi, I noticed that the training process and optimization code in this repo have been updated. Using a previous commit, I can reproduce 84.3 for Car AP40@0.7. However, performance on the master branch is poor when using MMDistributedDataParallel with 2 GPUs. Environment: PyTorch 1.1, CUDA 10.0.

The only change I made was adding key_rename = 'module.' + key at line 166 of train_utils/__init__.py to update the key names in model_state_disk, because the original code is model = MMDataParallel(model, device_ids=[0]). Do you know what happened?
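For readers following along, the key renaming described above can be sketched roughly as follows. Wrappers in the DataParallel family keep the wrapped network under a `module` attribute, so their state-dict keys carry a "module." prefix. The helper name `add_module_prefix` and the sample keys are hypothetical, and this is not the repo's actual loading code:

```python
from collections import OrderedDict


def add_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict whose keys all start with prefix."""
    renamed = OrderedDict()
    for key, value in state_dict.items():
        # Leave keys that already carry the prefix alone, to avoid "module.module.".
        key_rename = key if key.startswith(prefix) else prefix + key
        renamed[key_rename] = value
    return renamed


# Hypothetical checkpoint keys; in a real checkpoint the values would be tensors.
model_state_disk = OrderedDict(
    [("backbone.conv1.weight", 1), ("module.head.bias", 2)]
)
renamed_state = add_module_prefix(model_state_disk)
print(list(renamed_state.keys()))
```

Guarding against an existing prefix keeps the same helper usable for checkpoints saved from either a wrapped or an unwrapped model.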

SlingHe commented 4 years ago

@skyhehe123 Could you please offer some suggestions? Checking out earlier commits works fine, while the latest commit seems to give poor performance.

Divadi commented 4 years ago

I'm getting the following results with the latest commit:

Car AP@0.70, 0.70, 0.70:
bbox AP:95.91, 80.27, 77.69
bev AP:93.43, 82.12, 79.92
3d AP:89.14, 70.87, 66.54
aos AP:95.89, 80.17, 77.48

I had changed https://github.com/skyhehe123/SA-SSD/blob/9bb2ef4aecc7206ea935977d45781267b8a15001/mmdet/models/single_stage_heads/ssd_rotate_head.py#L322 to opp_labels = (box_preds[..., -1] > 0) ^ dir_labels.bool() because I was getting an error.
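To make the changed expression easier to read, here is a small sketch of what it computes, using NumPy arrays as stand-ins for the tensors (`astype(bool)` plays the role of `.bool()`). The sample values are made up, and the assumption that the last box component is a yaw angle is mine rather than something stated in the thread:

```python
import numpy as np

# Hypothetical predictions: one row per box, last column assumed to be the yaw.
box_preds = np.array(
    [
        [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.5],
        [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, -0.3],
        [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.2],
    ]
)
# Hypothetical direction-classifier outputs (0 or 1 per box).
dir_labels = np.array([1, 1, 0])

# XOR of two boolean arrays: True where the sign of the last box component
# disagrees with the predicted direction label.
opp_labels = (box_preds[..., -1] > 0) ^ dir_labels.astype(bool)
print(opp_labels)
```

Casting both operands to the same boolean type before applying `^` is the point of the change; the comparison already yields booleans, so the labels are cast to match.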

@SlingHe do you know which previous commit yielded good performance?

SlingHe commented 4 years ago

@Divadi Hi, I tried commit 24c9149, and I believe any version from before the optimizer and scheduler were rewritten is OK. The poor results only occur after the scheduler was modified. I further tested torch.optim.lr_scheduler.OneCycleLR in PyTorch 1.5 and got the same poor results as with the latest commit. Still confused about this ...
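For context on the scheduler being discussed, a one-cycle policy raises the learning rate to a peak and then anneals it to a very small value. The sketch below is a simplified pure-Python approximation with cosine annealing, not the exact PyTorch or repo implementation. The function name `one_cycle_lr`, the `max_lr` value, and the step count are hypothetical, and the default factors are assumed to mirror commonly used settings:

```python
import math


def one_cycle_lr(step, total_steps, max_lr, pct_start=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Simplified one-cycle schedule with cosine annealing in both phases."""
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_steps = max(1, int(pct_start * total_steps))
    if step < warmup_steps:
        # Phase 1: rise from initial_lr to max_lr.
        progress = step / warmup_steps
        start, end = initial_lr, max_lr
    else:
        # Phase 2: decay from max_lr down to min_lr at the last step.
        progress = (step - warmup_steps) / max(1, total_steps - 1 - warmup_steps)
        start, end = max_lr, min_lr
    # Cosine interpolation: equals start at progress 0 and end at progress 1.
    return end + (start - end) / 2.0 * (math.cos(math.pi * progress) + 1.0)


# Hypothetical settings, for illustration only.
schedule = [one_cycle_lr(s, total_steps=100, max_lr=0.003) for s in range(100)]
print(schedule[0], max(schedule), schedule[-1])
```

Printing or plotting the schedule produced by each commit is one way to check whether a scheduler rewrite changed the effective learning rate.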

Divadi commented 4 years ago

@SlingHe Huh, I tried that exact commit and ended up with:

Car AP@0.70, 0.70, 0.70:
bbox AP:93.71, 80.44, 75.87
bev AP:93.64, 80.62, 78.19
3d AP:86.67, 69.11, 66.18
aos AP:93.64, 80.29, 75.61

I tested on PyTorch 1.3, with spconv 1.0 and mmcv 0.4.3; do you have any idea what might cause a gap like this?

Edit: I have reproduced the results by using another commit of spconv.
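Since the gap here came down to a dependency version, recording the installed versions alongside each result can help when comparing runs. A minimal sketch using the standard library follows; the distribution names passed in are assumptions, and a package built from source may be registered under a different name or not at all:

```python
from importlib import metadata


def collect_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed distribution names for the libraries mentioned in this thread.
for name, version in collect_versions(["torch", "spconv", "mmcv"]).items():
    print(f"{name}: {version or 'not found'}")
```

A version string does not identify a specific commit, so for libraries built from a git checkout it is worth noting the commit hash as well.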

vehxianfish commented 3 years ago

Hi, which version did you use to reproduce the results? Did you do anything else? Thank you for your reply!