Open hewumars opened 6 years ago
Hi, where did you take the PSRoIPool layer from? A lot of PyTorch implementations of that layer are buggy. Also, they are probably implemented for a single image per batch and might not work with multiple images per batch.
PSRoI_Align from https://github.com/zengarden/light_head_rcnn
PSRoIPooling from https://github.com/PureDiors/pytorch_RFCN
I set batchsize=1 when training light_head_rcnn. The code seems to be able to work with multiple images per batch, but at least a single image per batch works.
I'm pretty sure PSRoIPooling in that repo is buggy, see: https://github.com/PureDiors/pytorch_RFCN/issues/4.
The light head rcnn model also does not converge when using PSRoI_Align from https://github.com/zengarden/light_head_rcnn. I opened a pull request: https://github.com/roytseng-tw/Detectron.pytorch/pull/48
I will carefully check the code
@Rizhiy could you share your PSRoIPooling? I compared the code with https://github.com/msracver/Deformable-ConvNets/blob/master/rfcn/operator_cxx/psroi_pooling.cu, and the differences are as shown:
@hewumars I haven't yet got PSRoIPooling to work in PyTorch either.
@Rizhiy How is the PSRoI pooling going? I have seen you in many different repos. I think we are both focusing on light-head rcnn, right? I haven't got PSRoIPooling working in PyTorch either. I think it could be easier to use the code from the official tf implementation.
@YanShuo1992 I'm currently using roytseng-tw/Detectron.pytorch, so far I have focused on getting the best mAP, so didn't put much work in light-head. I will try to let you know if I get something working.
@hewumars @Rizhiy I checked @hewumars's light head rcnn code and I may have found something wrong. PSRoI pooling is used after res5 (stage 5) in resnet50, right? But the RPN is still after stage 4. What do you think?
That's not entirely correct. You need to pass the output of res5 through a layer which has k*k*n filters, where k is the pooling size and n is an arbitrary number of output channels (10 in the paper). Then you apply psroipool on that.
I suggest you check https://github.com/msracver/Deformable-ConvNets/blob/f4e163719c8e63cfad7af1caaaab93d373750393/rfcn/symbols/resnet_v1_101_rfcn.py#L785-L798 for reference.
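To make the k*k*n layout concrete, here is a minimal pure-Python sketch of average PSRoI pooling. It is not the code from any of the repos linked in this thread. The function name `psroi_pool`, the nested-list layout `[batch][channel][y][x]` and the RoI format `(batch_index, x1, y1, x2, y2)` are assumptions made for illustration. The channel indexing `(c * k + ph) * k + pw` and the bin boundaries follow the Caffe R-FCN layer as I understand it, so please check them against the reference above.

```python
import math


def psroi_pool(features, rois, k, n, spatial_scale=1.0):
    """Average PSRoI pooling on nested lists (illustrative sketch only).

    features: [batch][channel][y][x], with k*k*n channels per image.
    rois:     list of (batch_index, x1, y1, x2, y2) in image coordinates.
    Returns   [roi][n][k][k].
    """
    out = []
    for batch_idx, x1, y1, x2, y2 in rois:
        fmap = features[batch_idx]  # use the image this RoI belongs to
        assert len(fmap) == k * k * n, "expected k*k*n input channels"
        height, width = len(fmap[0]), len(fmap[0][0])

        # RoI corners are rounded, then scaled to feature-map coordinates.
        start_w = math.floor(x1 + 0.5) * spatial_scale
        start_h = math.floor(y1 + 0.5) * spatial_scale
        end_w = (math.floor(x2 + 0.5) + 1) * spatial_scale
        end_h = (math.floor(y2 + 0.5) + 1) * spatial_scale
        bin_w = max(end_w - start_w, 0.1) / k
        bin_h = max(end_h - start_h, 0.1) / k

        pooled = [[[0.0] * k for _ in range(k)] for _ in range(n)]
        for c in range(n):
            for ph in range(k):
                for pw in range(k):
                    hs = min(max(math.floor(ph * bin_h + start_h), 0), height)
                    he = min(max(math.ceil((ph + 1) * bin_h + start_h), 0), height)
                    ws = min(max(math.floor(pw * bin_w + start_w), 0), width)
                    we = min(max(math.ceil((pw + 1) * bin_w + start_w), 0), width)
                    # Each output bin (ph, pw) reads from its own input channel.
                    channel = (c * k + ph) * k + pw
                    vals = [fmap[channel][y][x]
                            for y in range(hs, he) for x in range(ws, we)]
                    if vals:  # empty bins stay at 0
                        pooled[c][ph][pw] = sum(vals) / len(vals)
        out.append(pooled)
    return out
```

Note that each RoI carries its own batch index. An implementation that always reads image 0 would appear to work with one image per batch and silently break with more.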
@Rizhiy I will check the official rfcn to see how the rpn and large conv are organized. @roytseng-tw I am trying to implement light-head rcnn based on your code. I tried the code from @hewumars and I get RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/generic/THCStorage.cu:58
So I checked the .cu code of psroipooling. I found your commit that stops using rounding in roialign_kernel.cu. Can you tell me the reason for that, or what problem it would lead to?
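For context on the rounding question, the sketch below only illustrates the general difference between quantized (RoIPool-style) and unquantized (RoIAlign-style) RoI coordinates. It does not explain the commit itself, and the helper name `roi_extent` is hypothetical.

```python
import math


def roi_extent(x1, x2, spatial_scale, quantize):
    """Map an RoI edge pair from image to feature-map coordinates."""
    if quantize:
        # RoIPool-style: snap both edges to the nearest integer cell.
        return (float(math.floor(x1 * spatial_scale + 0.5)),
                float(math.floor(x2 * spatial_scale + 0.5)))
    # RoIAlign-style: keep fractional coordinates and interpolate later.
    return (x1 * spatial_scale, x2 * spatial_scale)


# An RoI spanning x = 10..53 on a stride-16 feature map.
quantized = roi_extent(10, 53, 1 / 16, quantize=True)
continuous = roi_extent(10, 53, 1 / 16, quantize=False)
```

With rounding, the RoI edges shift by up to half a feature-map cell, which is up to 8 input pixels at stride 16. Avoiding that misalignment is the usual motivation for dropping the rounding in RoIAlign.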
@YanShuo1992 do you run out of memory after some iterations? I have the same problem. I compared the psroi code with caffe2 and couldn't find anything, but I have barely done any CUDA coding, so... Did you solve the problem?
@GYxiaOH Yes. I run out of memory when using psroi. I also checked the caffe2 code and the tensorflow code and found nothing. For now, I have given up on psroi and use alignroi.
loss_bbox does not converge. The other losses (loss_cls, loss_rpn_cls, loss_bbox) converge. Can I push the code to you for debugging?