yudongliu97 opened this issue 5 years ago
@bahuangliuhe did you use the same LR you used for the ResNet-101 model? Typically we observe that the LR needed for detection with WSL models is significantly lower, so I would suggest doing an LR sweep. Also, are you removing the batchnorm layers and replacing them with an affine transformation? If not, what batch size are you using for batchnorm?
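As a rough illustration of such a sweep (a sketch only, not code from this repo; `train_and_evaluate` is a hypothetical stand-in for a short fine-tuning run, and the candidate values are illustrative):

```python
def train_and_evaluate(lr):
    # Hypothetical placeholder: fine-tune the detector for a short schedule
    # at this LR and return a validation metric such as box AP.
    raise NotImplementedError

base_lr = 0.02  # whatever LR the ResNet-101 baseline used
candidates = [base_lr / k for k in (1, 2, 4, 10, 20)]  # sweep downward from the baseline
results = {lr: train_and_evaluate(lr) for lr in candidates}
best_lr = max(results, key=results.get)
```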
Thank you for your reply! I reduced the LR by half; otherwise the loss becomes infinite. I haven't removed the batchnorm layers, and the batch size per GPU is set to two.
You should remove the batch norm layers. A batch size of 2 is not a good idea at all; the models are trained with a batch size of 24.
Thanks, I will give it a try.
Hello, can you show me how to replace the BN layers with an affine transformation?
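A common way to do this is to swap each `nn.BatchNorm2d` for a module that applies the same per-channel affine transform with the statistics fixed at their pretrained values. The sketch below shows the general frozen-BN technique; it is not code from this repo, and `FrozenBatchNorm2d` and `freeze_bn` are illustrative names:

```python
import torch
import torch.nn as nn

class FrozenBatchNorm2d(nn.Module):
    """BatchNorm2d with statistics and affine parameters frozen as buffers.

    Computes y = (x - running_mean) / sqrt(running_var + eps) * weight + bias,
    a fixed per-channel affine transform that is independent of batch size.
    """
    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.register_buffer("weight", torch.ones(num_features))
        self.register_buffer("bias", torch.zeros(num_features))
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_var", torch.ones(num_features))

    def forward(self, x):
        # Fold mean/var into a single scale and shift per channel.
        scale = self.weight * (self.running_var + self.eps).rsqrt()
        shift = self.bias - self.running_mean * scale
        return x * scale.view(1, -1, 1, 1) + shift.view(1, -1, 1, 1)

@torch.no_grad()
def freeze_bn(module):
    """Recursively replace every nn.BatchNorm2d with a FrozenBatchNorm2d
    initialized from the pretrained layer's parameters and statistics."""
    for name, child in module.named_children():
        if isinstance(child, nn.BatchNorm2d):
            frozen = FrozenBatchNorm2d(child.num_features, child.eps)
            frozen.weight.copy_(child.weight)
            frozen.bias.copy_(child.bias)
            frozen.running_mean.copy_(child.running_mean)
            frozen.running_var.copy_(child.running_var)
            setattr(module, name, frozen)
        else:
            freeze_bn(child)

# For example, after loading one of the WSL backbones via torch.hub:
model = torch.hub.load("facebookresearch/WSL-Images", "resnext101_32x8d_wsl")
freeze_bn(model)
```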
It seems the ResNeXt-101 32x8d backbone performs even worse than ResNet-101 in my experiments with Cascade R-CNN.