WangYueFt / detr3d

MIT License

Training detr3d_vovnet_train exceeds the memory of 4x RTX 3090 #21

Open synsin0 opened 2 years ago

synsin0 commented 2 years ago

Environment: 4x RTX 3090. Problem: training detr3d with the resnet101 backbone occupies 21 GB of memory on each card, and training detr3d with the vovnet backbone exceeds the memory limit. image_per_gpu is set to 1. I read in your paper that your experiments used 8x RTX 3090. How should I adjust my training setup so that it fits?

a1600012888 commented 2 years ago

Hi synsin0. The vovnet backbone is too large to fit on a 3090 as is. If you want to make it fit, you can try:

  1. fp16
  2. Memory checkpointing, as described in "Training Deep Nets with Sublinear Memory Cost". PyTorch provides a checkpoint implementation, torch.utils.checkpoint.checkpoint; see https://pytorch.org/docs/stable/checkpoint.html?highlight=checkpoint
  3. Freeze some layers of VoVNet, e.g. the first stage.

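A minimal sketch of how the three options above could fit together in plain PyTorch. This is not the detr3d or VoVNet code: `TinyBackbone`, `freeze_stages`, and `train_step` are hypothetical names made up for illustration, the toy network only stands in for a multi-stage backbone, and `use_reentrant=False` assumes a reasonably recent PyTorch release. In an mmdet-style project you would normally enable the equivalent behaviour through the config rather than by hand.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for a multi-stage image backbone (not VoVNet)."""

    def __init__(self, channels=8, num_stages=3, use_checkpoint=True):
        super().__init__()
        self.use_checkpoint = use_checkpoint
        self.stages = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(3 if i == 0 else channels, channels, 3, padding=1),
                nn.ReLU(),
            )
            for i in range(num_stages)
        ])

    def freeze_stages(self, n):
        # Option 3: parameters of the first n stages no longer receive gradients,
        # so autograd does not keep their activations for the backward pass.
        for stage in self.stages[:n]:
            for p in stage.parameters():
                p.requires_grad_(False)

    def forward(self, x):
        for stage in self.stages:
            trainable = any(p.requires_grad for p in stage.parameters())
            if self.use_checkpoint and self.training and trainable:
                # Option 2: drop this stage's intermediate activations and
                # recompute them during backward (less memory, more compute).
                x = checkpoint(stage, x, use_reentrant=False)
            else:
                x = stage(x)
        return x


def train_step(model, images, optimizer, scaler, use_amp):
    """One optimisation step; the mean of the features is a placeholder loss."""
    optimizer.zero_grad(set_to_none=True)
    # Option 1: run the forward pass in fp16 where it is safe to do so.
    # With use_amp=False both autocast and the scaler are no-ops.
    with torch.autocast(device_type="cuda", dtype=torch.float16, enabled=use_amp):
        loss = model(images).float().mean()
    scaler.scale(loss).backward()  # loss scaling guards against fp16 underflow
    scaler.step(optimizer)
    scaler.update()
    return loss.item()


if __name__ == "__main__":
    use_amp = torch.cuda.is_available()
    device = torch.device("cuda" if use_amp else "cpu")
    model = TinyBackbone().to(device)
    model.freeze_stages(1)
    model.train()
    optimizer = torch.optim.SGD(
        [p for p in model.parameters() if p.requires_grad], lr=0.01
    )
    scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
    images = torch.randn(1, 3, 64, 64, device=device)
    print(train_step(model, images, optimizer, scaler, use_amp))
```

Checkpointing trades extra forward computation for memory, so expect slower iterations. Freezing early stages changes what the model can learn, so results may differ from the numbers reported with the full training setup.
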
cgl-cell commented 1 year ago

Have you solved it?