zuokai / roialign

caffe

Memory consuming? #6

Closed. SingL3 closed this issue 6 years ago.

SingL3 commented 6 years ago

Hello, I am trying to replace ROI Pooling with ROI Align. In the R-CNN method there are several parameters to set for the RPN, i.e., RPN_PRE_NMS_TOP_N and RPN_POST_NMS_TOP_N. In the training phase they are set to 12,000 and 2,000, respectively, and the GPU memory is just enough for training. However, when I try to run testing, these two parameters are set to 6,000 and 1,000, with all other settings unchanged, and the memory runs out. I have to reduce the latter to 800 for testing. Does this implementation need more memory in testing than in training?
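
For reference, a minimal sketch of how these knobs are usually expressed in a py-faster-rcnn-style Python config; the attribute names below mirror that project's conventions and are assumptions, not taken from this repository, so adapt them to your own config file:

```python
# Hypothetical py-faster-rcnn-style RPN settings (names assumed, not from this repo).
from easydict import EasyDict as edict

cfg = edict()
cfg.TRAIN = edict()
cfg.TEST = edict()

# Training: keep 12,000 proposals before NMS, 2,000 after NMS.
cfg.TRAIN.RPN_PRE_NMS_TOP_N = 12000
cfg.TRAIN.RPN_POST_NMS_TOP_N = 2000

# Testing: 6,000 before NMS, 1,000 after NMS (reduced to 800 here to fit in memory).
cfg.TEST.RPN_PRE_NMS_TOP_N = 6000
cfg.TEST.RPN_POST_NMS_TOP_N = 1000
```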

zuokai commented 6 years ago

There is another layer that is only present when you train, named rpnproposaltargetlayer. This layer selects some positive and negative boxes for the R-CNN head, so during training the R-CNN head does not receive all 2,000 boxes; the number it receives is set by the 'batch_size' parameter of rpnproposaltargetlayer, which is much smaller than 2,000, typically 128, 256, or 512.

In some implementations of Faster R-CNN, the number of boxes passed to the R-CNN head during training can be less than 'batch_size' when there are not enough positive boxes (IoU > 0.5) and negative boxes (0.1 < IoU < 0.49). In other implementations, such as MS-CNN, it is always equal to 'batch_size', because negative boxes are generated randomly when there are not enough. So during training, the number of boxes reaching the R-CNN head is at most N * batch_size, where N is the image batch size and batch_size is the parameter of rpnproposaltargetlayer.

When you test, however, there is no rpnproposaltargetlayer, and the number of boxes reaching the R-CNN head equals the number of boxes output by the proposal layer. That is why you must reduce the number of boxes for testing. I suggest printing the shape of the bottom blob of the ROI Align layer (or ROI Pooling layer) to see whether the number of boxes is larger or smaller in training versus testing.
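
A quick way to check this from pycaffe is sketched below; the blob and file names ('rois', 'roi_align', test.prototxt, etc.) are assumptions, so substitute the ones from your own prototxt:

```python
# Minimal sketch for inspecting how many boxes reach the ROI Align layer at test time.
# Blob/file names ('rois', 'roi_align', test.prototxt, model.caffemodel) are assumptions.
import caffe

caffe.set_mode_gpu()
net = caffe.Net('test.prototxt', 'model.caffemodel', caffe.TEST)

net.forward()

# The ROI blob is typically (num_boxes, 5): [batch_index, x1, y1, x2, y2].
print('rois shape:', net.blobs['rois'].data.shape)

# The ROI Align output is typically (num_boxes, channels, pooled_h, pooled_w).
print('roi align output shape:', net.blobs['roi_align'].data.shape)
```

Running the same check on the training net (with caffe.TRAIN) should show a much smaller first dimension, since the proposal target layer has already subsampled the boxes.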

SingL3 commented 6 years ago

You mean that in the training phase it proposes 2,000 proposals, but most of them are filtered out and only batch_size of them are forwarded to the classification head, whereas in the testing phase all the proposals are forwarded to the classification head. Right? When I use ROI Pooling for testing, it only takes 7 GB+ of GPU memory with 1,000 proposals. With ROI Align it takes 11 GB with 800 proposals. So does ROI Align need much more memory than ROI Pooling?

zuokai commented 6 years ago

Yeah, I save the bilinear indices and weights for the backward pass, but that is not necessary when you test, so you probably do not need to store bili_idx and bili_w at test time. It's my fault.
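
To get a feel for the cost, here is a rough back-of-envelope estimate; the numbers (800 ROIs, a 1024-channel feature map, a 7x7 pooled output, 4 bilinear neighbours per output element, one int32 index plus one float32 weight per neighbour) are assumptions, and the actual layer layout may differ:

```python
# Rough estimate of the extra buffers ROI Align keeps for backward,
# assuming 4 bilinear neighbours per pooled output element and
# one int32 index plus one float32 weight per neighbour (assumed layout).
num_rois = 800
channels = 1024          # assumed feature-map depth
pooled_h = pooled_w = 7
neighbours = 4           # bilinear interpolation reads 4 surrounding pixels
bytes_per_idx = 4        # int32
bytes_per_weight = 4     # float32

elements = num_rois * channels * pooled_h * pooled_w * neighbours
extra_bytes = elements * (bytes_per_idx + bytes_per_weight)
print('extra buffer size: %.2f GB' % (extra_bytes / 1024.0 ** 3))
# -> roughly 1.2 GB for these numbers; more if the layer samples several
#    points per bin, which multiplies the count further.
```

Skipping these buffers when the layer runs in the TEST phase, as suggested above, would remove this overhead at inference time.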

SingL3 commented 6 years ago

Thank you for your answers and patience!