yudongliu97 opened this issue 5 years ago
@bahuangliuhe did you use the same LR you used for the ResNet-101 model? Typically we observe that the LR needed for detection with WSL models is significantly lower, so I would suggest doing an LR sweep. Also, are you removing the batchnorm layers and replacing them with an affine transformation? If not, what batch size are you using for batchnorm?
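As a rough illustration of such a sweep (a sketch only, not code from this repo; `train_and_evaluate` is a hypothetical stand-in for a short fine-tuning run, and the candidate values are illustrative):

```python
def train_and_evaluate(lr):
    # Hypothetical placeholder: fine-tune the detector for a short schedule
    # at this LR and return a validation metric such as box AP.
    raise NotImplementedError

base_lr = 0.02  # whatever LR the ResNet-101 baseline used
candidates = [base_lr / k for k in (1, 2, 4, 10, 20)]  # sweep downward from the baseline
results = {lr: train_and_evaluate(lr) for lr in candidates}
best_lr = max(results, key=results.get)
```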
Thank you for your reply! I reduced the LR by half; otherwise the loss becomes infinite. I haven't removed the batchnorm layers, and the batch size per GPU is set to two.
You should remove the batch norm layers. A batch size of 2 is not a good idea at all; the models are trained with a batch size of 24.
Thanks, I will give it a try.
Hello, can you show me how to replace the BN layers with an affine transformation?
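A common way to do this is to swap each `nn.BatchNorm2d` for a module that applies the same per-channel affine transform with the statistics fixed at their pretrained values. The sketch below shows the general frozen-BN technique; it is not code from this repo, and `FrozenBatchNorm2d` and `freeze_bn` are illustrative names:

```python
import torch
import torch.nn as nn

class FrozenBatchNorm2d(nn.Module):
    """BatchNorm2d with statistics and affine parameters frozen as buffers.

    Computes y = (x - running_mean) / sqrt(running_var + eps) * weight + bias,
    a fixed per-channel affine transform that is independent of batch size.
    """
    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.register_buffer("weight", torch.ones(num_features))
        self.register_buffer("bias", torch.zeros(num_features))
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_var", torch.ones(num_features))

    def forward(self, x):
        # Fold mean/var into a single scale and shift per channel.
        scale = self.weight * (self.running_var + self.eps).rsqrt()
        shift = self.bias - self.running_mean * scale
        return x * scale.view(1, -1, 1, 1) + shift.view(1, -1, 1, 1)

@torch.no_grad()
def freeze_bn(module):
    """Recursively replace every nn.BatchNorm2d with a FrozenBatchNorm2d
    initialized from the pretrained layer's parameters and statistics."""
    for name, child in module.named_children():
        if isinstance(child, nn.BatchNorm2d):
            frozen = FrozenBatchNorm2d(child.num_features, child.eps)
            frozen.weight.copy_(child.weight)
            frozen.bias.copy_(child.bias)
            frozen.running_mean.copy_(child.running_mean)
            frozen.running_var.copy_(child.running_var)
            setattr(module, name, frozen)
        else:
            freeze_bn(child)

# For example, after loading one of the WSL backbones via torch.hub:
model = torch.hub.load("facebookresearch/WSL-Images", "resnext101_32x8d_wsl")
freeze_bn(model)
```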
It seems the ResNeXt-101 32x8d backbone performs even worse than ResNet-101 in my experiments with Cascade R-CNN.