facebookresearch / DepthContrast

DepthContrast self-supervised learning for 3D

Questions about KITTI fine-tuning experiments #33

Open · YurongYou opened this issue 2 years ago

YurongYou commented 2 years ago

Hi! First, thank you for this great work; it is very interesting and I like it a lot. I have two questions about the KITTI fine-tuning experiments:

  1. I found that there are multiple random splits for the 5-50% KITTI data (https://github.com/zaiweizhang/OpenPCDet/tree/master/data/kitti/split_infos) (this issue). Are the numbers reported in the paper the mean AP across these random splits?
  2. From the fine-tuning instructions in https://github.com/zaiweizhang/OpenPCDet, if I understand correctly, the 5-50% KITTI fine-tuning experiments all use the same kitti_dbinfos_train.pkl, generated from the 100% KITTI train set, for GT-sampling augmentation. Should each split instead use its own kitti_dbinfos_train.pkl generated from the corresponding subset (a minimal sketch of what I mean follows this list)?
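
For concreteness, here is a minimal sketch of the per-split db_infos I have in mind: it simply filters the full kitti_dbinfos_train.pkl down to the frames listed in one split file. The paths and the 'image_idx' field name are my assumptions based on the standard OpenPCDet ground-truth database layout, not something taken from this repo.

```python
import pickle
from pathlib import Path

# Hypothetical paths: adjust to the actual split file and db_infos location.
split_file = Path("data/kitti/split_infos/train_0.05_1.txt")
full_dbinfos = Path("data/kitti/kitti_dbinfos_train.pkl")
out_dbinfos = Path("data/kitti/kitti_dbinfos_train_0.05_1.pkl")

# Frame ids kept in this split, assumed to be one per line (e.g. "000123").
split_ids = {line.strip() for line in split_file.read_text().splitlines() if line.strip()}

with open(full_dbinfos, "rb") as f:
    db_infos = pickle.load(f)  # dict: class name -> list of GT-sample dicts

# Keep only GT samples whose source frame is in the split; 'image_idx' is the
# field name used by the standard OpenPCDet ground-truth database.
filtered = {
    cls: [info for info in infos if str(info["image_idx"]).zfill(6) in split_ids]
    for cls, infos in db_infos.items()
}

with open(out_dbinfos, "wb") as f:
    pickle.dump(filtered, f)

for cls, infos in filtered.items():
    print(f"{cls}: kept {len(infos)} of {len(db_infos[cls])} GT samples")
```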

Thanks in advance!

barzanisar commented 1 year ago

Follow-up questions about fine-tuning PointRCNN on KITTI:

  1. When fine-tuning on the different splits (5%-50%), do you increase the number of epochs so that the number of iterations matches fine-tuning on 100% of the data (to ensure convergence)? For example, if you fine-tune on 100% of the data for 80 epochs, did you fine-tune on the 5% split for 1600 epochs?
  2. In your paper you mention using the AdamW optimizer for fine-tuning PointRCNN, but in your OpenPCDet repo the fine-tuning cfg file says adam_onecycle. Which one did you actually use? If you used adam_onecycle, then dropping the learning rate at 30 epochs will not do anything, since adam_onecycle has its own one-cycle scheduler.
  3. In your main.py you don't convert batch-norm layers to SyncBatchNorm, e.g. model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model) (see the sketch after this list). Is there a reason why you don't do this?
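
For reference, the conversion I mean is the standard PyTorch call shown below; the model here is only a stand-in, not your actual backbone from main.py.

```python
import torch
import torch.nn as nn

# Stand-in model with ordinary BatchNorm layers (the real 3D backbone differs).
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)

# Replace every BatchNorm*d layer with SyncBatchNorm so that batch statistics
# are synchronized across GPUs; the forward pass then requires an initialized
# torch.distributed process group.
model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)

# Typical next step in a DDP setup (needs an initialized process group):
# model = nn.parallel.DistributedDataParallel(model.cuda(), device_ids=[local_rank])
```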

Thanks in advance!

zaiweizhang commented 1 year ago

@YurongYou For your questions:

  1. Yes, those numbers are means across multiple random splits.
  2. Yes, I resampled the GT points from the sub-sampled point clouds when building the _db_info files, so each split has its own.

@barzanisar For your questions:

  1. I did increase the number of iterations, but I don't think I went as high as 1600 epochs for the 5% split. I tried somewhat fewer iterations than that, because I found you don't need that many epochs to get the best performance.
  2. I used adam_onecycle; the learning-rate drop is just the default from OpenPCDet.
  3. For SyncBatchNorm, I was following the shuffling-BN practice from the MoCo v2 codebase, so I didn't apply it. There was no particular reason for not using it. You are welcome to try!
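
To make the iteration bookkeeping from your first question concrete, here is a small back-of-the-envelope sketch. Only the 3712-frame size of the standard KITTI train split is a real number; the batch size and the 80-epoch baseline are placeholders rather than the settings actually used for the paper.

```python
import math

# Back-of-the-envelope iteration bookkeeping for subset fine-tuning.
full_train_frames = 3712   # size of the standard KITTI train split
batch_size = 8             # placeholder total batch size across GPUs
full_epochs = 80           # placeholder epoch count for 100% fine-tuning

iters_full = math.ceil(full_train_frames / batch_size) * full_epochs

for fraction in (0.05, 0.10, 0.20, 0.50):
    frames = int(full_train_frames * fraction)
    iters_per_epoch = math.ceil(frames / batch_size)
    epochs_to_match = math.ceil(iters_full / iters_per_epoch)
    print(f"{int(fraction * 100):>3}% split: ~{epochs_to_match} epochs "
          f"to match {iters_full} iterations of the 100% run")
```

In practice I found that fewer epochs than this matching count were enough for the smaller splits.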

barzanisar commented 1 year ago

Thanks, @zaiweizhang, for replying. How many epochs did you train for on the 5% split?

zaiweizhang commented 1 year ago

I think I tried 200 epochs, and the loss converged.