Closed: heiyuxiaokai closed this issue 5 years ago
@heiyuxiaokai Did FCOS run out of memory?
@tianzhi0549 No. Maybe the IoU calculation for a particular image (one with many boxes) needs a lot of memory. FCOS doesn't have this step. Are the GPUs you trained this model on (4 GPUs, batch 8) 12 GB? The data I use is remote sensing imagery, which may contain many objects per image.
@heiyuxiaokai our GPUs are 32GB V100.
@tianzhi0549 So I should set the batch size to 2. You train with batch 8 on 4 GPUs (V100). Why don't you use a larger batch on 32 GB GPUs?
@heiyuxiaokai We use 16 images in a mini-batch for a fair comparison.
Too many GT boxes. It is explained here: https://github.com/facebookresearch/maskrcnn-benchmark/issues/18
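One general way to limit the peak memory of a pairwise IoU computation is to process one set of boxes in chunks, so that the [n, M, 2] intermediates stay small. The sketch below illustrates that idea only; it is not the fix from the linked issue and not the repository's code. It uses NumPy instead of PyTorch, the names `box_area` and `pairwise_iou_chunked` are hypothetical, and it assumes boxes in `[x1, y1, x2, y2]` format with positive area and no `TO_REMOVE` pixel offset.

```python
import numpy as np


def box_area(boxes: np.ndarray) -> np.ndarray:
    """Area of boxes given as rows of [x1, y1, x2, y2]."""
    return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])


def pairwise_iou_chunked(boxes_a, boxes_b, chunk_size=1024):
    """Return the [N, M] IoU matrix, processing chunk_size rows of boxes_a at a time.

    Only the [n, M, 2] intermediates are bounded by chunk_size;
    the [N, M] result itself is still allocated in full.
    """
    boxes_a = np.asarray(boxes_a, dtype=np.float32)
    boxes_b = np.asarray(boxes_b, dtype=np.float32)
    area_b = box_area(boxes_b)
    out = np.empty((boxes_a.shape[0], boxes_b.shape[0]), dtype=np.float32)

    for start in range(0, boxes_a.shape[0], chunk_size):
        a = boxes_a[start:start + chunk_size]
        # Top-left and bottom-right corners of each intersection: [n, M, 2]
        lt = np.maximum(a[:, None, :2], boxes_b[None, :, :2])
        rb = np.minimum(a[:, None, 2:], boxes_b[None, :, 2:])
        # Width and height of the intersection, zero when boxes do not overlap
        wh = np.clip(rb - lt, 0, None)
        inter = wh[..., 0] * wh[..., 1]
        union = box_area(a)[:, None] + area_b[None, :] - inter
        out[start:start + chunk_size] = inter / union
    return out


if __name__ == "__main__":
    a = np.array([[0, 0, 2, 2], [1, 1, 3, 3]], dtype=np.float32)
    b = np.array([[0, 0, 2, 2], [5, 5, 6, 6]], dtype=np.float32)
    print(pairwise_iou_chunked(a, b, chunk_size=1))
```

A smaller `chunk_size` lowers peak memory at the cost of more loop iterations. The same slicing pattern should carry over to tensors on a GPU, though that is untested here.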
Did you solve this? I also get an error when computing the loss, and it fails even with a batch size of 2.
@dreamhighchina See: https://github.com/facebookresearch/maskrcnn-benchmark/issues/884
File "/home/fw/Softwares/FCOS/maskrcnn_benchmark/structures/boxlist_ops.py", line 84, in boxlist_iou
    wh = (rb - lt + TO_REMOVE).clamp(min=0)  # [N,M,2]
RuntimeError: CUDA out of memory. Tried to allocate 1.56 GiB (GPU 1; 11.92 GiB total capacity; 7.99 GiB already allocated; 1.20 GiB free; 1.74 GiB cached)
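To see how a single [N, M, 2] tensor can reach 1.56 GiB, a rough back-of-the-envelope calculation helps. The sketch below assumes float32 elements (4 bytes each), which the traceback does not state. The helper names and the 200,000 x 1,000 example are hypothetical, not values from this issue.

```python
# Rough size of one dense [N, M, 2] intermediate in a pairwise IoU computation.
# Assumption: float32 elements (4 bytes each).

BYTES_PER_FLOAT32 = 4
GIB = 1024 ** 3


def pairwise_tensor_gib(num_a: int, num_b: int, last_dim: int = 2) -> float:
    """Size in GiB of a float32 tensor of shape [num_a, num_b, last_dim]."""
    return num_a * num_b * last_dim * BYTES_PER_FLOAT32 / GIB


def pairs_for_allocation(gib: float, last_dim: int = 2) -> int:
    """Number of box pairs whose [N, M, last_dim] float32 tensor fills `gib` GiB."""
    return int(gib * GIB / (last_dim * BYTES_PER_FLOAT32))


if __name__ == "__main__":
    # The failed allocation reported in the traceback was 1.56 GiB.
    print(pairs_for_allocation(1.56))
    # Hypothetical example: 200,000 boxes on one side, 1,000 on the other.
    print(round(pairwise_tensor_gib(200_000, 1_000), 2))
```

Under these assumptions, 1.56 GiB corresponds to roughly 209 million box pairs. Several intermediates of this shape can be alive at the same time in such a computation, so the peak may be a multiple of that figure.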
It seems to be a problem with the IoU calculation. I use RetinaNet with batch size 4 on 2 Titan X GPUs (12 GB). GPU usage at the start of training: [screenshot not included]. Should I set the batch size to 2?