Epiphqny / CondInst

Conditional Convolutions for Instance Segmentation, achieves 37.1 mAP on COCO val
https://arxiv.org/abs/2003.05664

About the gpu memory issue #9

Open NNNNAI opened 3 years ago

NNNNAI commented 3 years ago

Thank you for sharing the code!!!

NNNNAI commented 3 years ago

I ran into a problem during training: the GPU memory usage keeps rising. Here is the log output:

[01/07 16:42:57 d2.utils.events]: eta: 6:37:20 iter: 339 total_loss: 3.58 loss_fcos_cls: 1.22 loss_fcos_loc: 0.8122 loss_fcos_ctr: 0.6831 loss_mask: 0.8265 time: 0.3924 data_time: 0.0029 lr: 0.0033966 max_mem: 8132M
[01/07 16:43:04 d2.utils.events]: eta: 6:35:49 iter: 359 total_loss: 3.494 loss_fcos_cls: 1.177 loss_fcos_loc: 0.7754 loss_fcos_ctr: 0.7057 loss_mask: 0.8369 time: 0.3902 data_time: 0.0029 lr: 0.0035964 max_mem: 8132M
[01/07 16:43:09 d2.utils.events]: eta: 6:31:44 iter: 379 total_loss: 3.4 loss_fcos_cls: 1.115 loss_fcos_loc: 0.8066 loss_fcos_ctr: 0.6745 loss_mask: 0.7531 time: 0.3840 data_time: 0.0027 lr: 0.0037962 max_mem: 8132M
[01/07 16:43:17 d2.utils.events]: eta: 6:28:50 iter: 399 total_loss: 3.448 loss_fcos_cls: 1.152 loss_fcos_loc: 0.7724 loss_fcos_ctr: 0.664 loss_mask: 0.8222 time: 0.3850 data_time: 0.0035 lr: 0.003996 max_mem: 8132M
[01/07 16:43:26 d2.utils.events]: eta: 6:31:25 iter: 419 total_loss: 3.57 loss_fcos_cls: 1.316 loss_fcos_loc: 0.8286 loss_fcos_ctr: 0.6871 loss_mask: 0.8104 time: 0.3877 data_time: 0.0030 lr: 0.0041958 max_mem: 8132M
[01/07 16:43:41 d2.utils.events]: eta: 6:33:37 iter: 439 total_loss: 3.469 loss_fcos_cls: 1.116 loss_fcos_loc: 0.7849 loss_fcos_ctr: 0.6835 loss_mask: 0.8386 time: 0.4036 data_time: 0.0041 lr: 0.0043956 max_mem: 9343M
[01/07 16:43:50 d2.utils.events]: eta: 6:34:02 iter: 459 total_loss: 3.505 loss_fcos_cls: 1.16 loss_fcos_loc: 0.7706 loss_fcos_ctr: 0.6652 loss_mask: 0.8513 time: 0.4045 data_time: 0.0032 lr: 0.0045954 max_mem: 9343M

The max_mem value increased from 8132M to 9343M. Have you ever encountered this situation?
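
For anyone debugging something similar, here is a minimal sketch (assuming PyTorch's CUDA memory APIs; the function name and reporting interval are just placeholders, not part of CondInst or detectron2) that tracks whether the peak allocation keeps growing between logging intervals rather than being a one-off spike from a large batch:

```python
import torch

def report_peak_memory(iteration, device=0):
    """Print current and peak CUDA allocation in MB, then reset the peak.

    Resetting the peak after each report makes it easier to tell whether
    memory growth comes from a single unusually large batch (the peak
    drops back afterwards) or from a steady leak (the peak keeps climbing).
    """
    peak_mb = torch.cuda.max_memory_allocated(device) / (1024 ** 2)
    current_mb = torch.cuda.memory_allocated(device) / (1024 ** 2)
    print(f"iter {iteration}: current {current_mb:.0f}M, peak since last reset {peak_mb:.0f}M")
    # Reset so the next report reflects only the upcoming interval.
    torch.cuda.reset_peak_memory_stats(device)

# Example: call every 20 iterations inside the training loop,
# e.g. if iteration % 20 == 0: report_peak_memory(iteration)
```

Note that a cumulative peak counter (like the one behind max_mem) can only stay flat or increase over a run, so some growth is expected when an occasional batch contains larger images or more instances; a peak that rises continuously is the signal worth investigating.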