facebookresearch / maskrcnn-benchmark

Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.

Memory consumption buildup #606

Open shimen opened 5 years ago

shimen commented 5 years ago

❓ Questions and Help

When training R-50-FPN for Mask R-CNN, there is a memory buildup that I can't explain. Is this normal?

Screenshot from 2019-03-26 18-18-51

RutenburgIG commented 5 years ago

@shimen same for me, memory consumption keeps increasing during the training loop without any apparent reason (screenshot attached). I switched from PyTorch 1.0 to pytorch-nightly as discussed in #182 and added torch.cuda.empty_cache() after batch processing, as recommended here: https://discuss.pytorch.org/t/how-to-debug-causes-of-gpu-memory-leaks/6741/19, but it doesn't solve this issue.
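For reference, a minimal sketch of where the cache-clearing call can go in a generic training loop; the `model`, `optimizer`, and `data_loader` names below are placeholders, not the repo's actual trainer (which lives in maskrcnn_benchmark/engine/trainer.py and differs from this sketch):

```python
import torch

def train_loop(model, optimizer, data_loader):
    # Generic loop for illustration only.
    for images, targets in data_loader:
        loss_dict = model(images, targets)            # dict of losses in training mode
        losses = sum(loss for loss in loss_dict.values())

        optimizer.zero_grad()
        losses.backward()
        optimizer.step()

        # Return cached blocks to the driver after each batch. This lowers the number
        # reported by nvidia-smi but does not fix a genuine reference leak.
        torch.cuda.empty_cache()
```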

chengyangfu commented 5 years ago

This could be related to your dataset. How many images are in your training set, and what's the batch size? If this happens before the end of the first epoch, it is normal: your dataset may contain a few images whose aspect ratio is very different from the others, and the first time such an image appears PyTorch has to allocate a larger buffer.
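To check whether this applies to your data, here is a small sketch that scans a COCO-style annotation file for aspect-ratio outliers; the annotation path is a placeholder:

```python
from collections import Counter
from pycocotools.coco import COCO

# Placeholder path; point it at your training annotations.
coco = COCO("annotations/instances_train.json")

ratios = [img["width"] / img["height"] for img in coco.loadImgs(coco.getImgIds())]

# Bucket the ratios to spot images whose shape differs a lot from the rest; such
# images force the allocator to grow the first time they show up in a batch.
print(Counter(round(r, 1) for r in ratios))
print("min/max aspect ratio:", min(ratios), max(ratios))
```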

RutenburgIG commented 5 years ago

@chengyangfu, I appreciate your prompt response. As you described, it's happening before the end of the first epoch, which is 10k iterations for me. However, I cropped all inputs so they all have exactly the same height-to-width ratio (2.0). Or did you mean absolute input sizes?

shimen commented 5 years ago

@chengyangfu I use an absolute input size of 800x800 with a batch size of 2, so I don't think that is the cause. I'm pretty sure this didn't happen when I wasn't using FPN. I will run an experiment without FPN and update with the results.

qianyizhang commented 5 years ago

@RutenburgIG @shimen please keep in mind that your ground-truth boxes and masks are also copied to the GPU, and their counts may vary between images.

chengyangfu commented 5 years ago

There are several sources of memory variability in Mask R-CNN. First, the input image size: in contrast to YOLO and SSD, Faster R-CNN/Mask R-CNN preserve the image aspect ratio, so the shape of the input batch differs slightly from iteration to iteration. If you want to avoid this, you can pre-process the images or add data augmentation that produces a fixed input size.
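As a rough illustration of the "make the input fixed" option (this is not the repo's own transform pipeline, and it assumes a torchvision version whose functional resize accepts tensor images), one could resize every image to the same shape before batching and scale boxes/masks by the same factors:

```python
import torchvision.transforms.functional as F

def resize_fixed(image, target_size=(800, 800)):
    """Resize a CHW image tensor to a fixed size so every batch has the same shape.

    Returns the resized image and the (scale_x, scale_y) factors needed to adjust
    boxes and masks. Plain resizing distorts the aspect ratio; padding (letterboxing)
    is an alternative that preserves it.
    """
    _, h, w = image.shape
    resized = F.resize(image, list(target_size))
    return resized, (target_size[1] / w, target_size[0] / h)
```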

Another source is the second stage, which @qianyizhang mentioned. The box head takes (N + GT) proposals from the RPN layer. In the FPN architecture, N is set by MODEL.RPN.FPN_POST_NMS_TOP_N_TEST; N is a maximum, so in some cases the number of proposals is smaller. The number of proposals sent to the box head therefore changes during training and depends on how well the RPN is doing.

So, memory consumption is very dynamic during training, and it is entirely possible to see it increase even after 10k iterations.
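If you want to see which iterations cause the jumps, a small helper that logs PyTorch's allocator statistics can help (torch.cuda.memory_allocated and torch.cuda.max_memory_allocated are standard PyTorch calls):

```python
import torch

def log_gpu_memory(iteration, every=100):
    """Print current and peak GPU memory so growth can be tied to specific iterations."""
    if iteration % every == 0:
        cur = torch.cuda.memory_allocated() / 2**20
        peak = torch.cuda.max_memory_allocated() / 2**20
        print(f"iter {iteration}: allocated {cur:.0f} MiB, peak {peak:.0f} MiB")
```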

zimenglan-sysu-512 commented 5 years ago

hi @shimen does your dataset have a large number of gt bboxes? That requires more memory for NMS in the RPN and R-CNN stages.

shimen commented 5 years ago

hi @zimenglan-sysu-512 I just found that I have several images with a large number of gt bboxes (> 50). I'll try to remove them from training and will report back whether this was the problem.

shimen commented 5 years ago

So I made the change and removed the images with a large number of gt bboxes (> 50). The memory issue now seems resolved: there is still a buildup, but it is small, about 200 MB. What can be done to keep images with a large number of annotations? Can we do NMS and IoU in batches?
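For anyone wanting the same interim workaround, here is a sketch that filters a COCO-style annotation file down to images with at most 50 boxes; the paths and the threshold are placeholders:

```python
import json

MAX_BOXES = 50  # threshold used above; adjust as needed

# Placeholder paths for your own annotation files.
with open("annotations/instances_train.json") as f:
    data = json.load(f)

# Count annotations per image and keep only images under the threshold.
counts = {}
for ann in data["annotations"]:
    counts[ann["image_id"]] = counts.get(ann["image_id"], 0) + 1

keep = {img["id"] for img in data["images"] if counts.get(img["id"], 0) <= MAX_BOXES}
data["images"] = [img for img in data["images"] if img["id"] in keep]
data["annotations"] = [ann for ann in data["annotations"] if ann["image_id"] in keep]

with open("annotations/instances_train_filtered.json", "w") as f:
    json.dump(data, f)
```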

chengyangfu commented 5 years ago

Running NMS in batches does not make sense to me. NMS removes duplicate detections within an image, so it has to be done on each image independently.

zimenglan-sysu-512 commented 5 years ago

hi @shimen you can have a look at #120