xuebinqin / U-2-Net

The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."
Apache License 2.0
8.52k stars 1.47k forks

How to increase model capacity for training on a larger dataset? #53

Closed daniyalDE closed 4 years ago

daniyalDE commented 4 years ago

First of all, thanks for the amazing work on U-2-Net. I am now trying to train the model from scratch on my own dataset of 60k images, which is larger than your dataset. I would like to know how I can increase the model capacity to be able to train on such a dataset.

I have considered replacing the standard rebnconv blocks with residual blocks, as suggested in another issue. What other options could I try? I understand that I need to make the architecture deeper. Does this mean that I should build RSU-8 or RSU-9 blocks by adding more convolution layers?
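
For context on what an RSU-8 or RSU-9 block would involve, here is a minimal sketch of how the feature-map size shrinks inside an RSU-L block. It assumes, based on a reading of the released RSU-7 to RSU-4 blocks, that an RSU-L encoder applies L-2 stride-2 poolings with `ceil_mode=True`. The function name `rsu_encoder_sizes` is hypothetical and not part of the repository.

```python
import math


def rsu_encoder_sizes(input_size, depth):
    """Return the spatial size seen at each encoder level of an RSU-`depth` block.

    Assumption: an RSU-L block uses L-2 stride-2 max-poolings with ceil
    rounding, as the released RSU-7 ... RSU-4 blocks appear to do.
    """
    num_pools = depth - 2
    sizes = [input_size]
    for _ in range(num_pools):
        # Ceil rounding mirrors MaxPool2d(2, stride=2, ceil_mode=True).
        sizes.append(math.ceil(sizes[-1] / 2))
    return sizes


# Compare RSU-7 with hypothetical RSU-8 and RSU-9 blocks on a 288x288 input.
for depth in (7, 8, 9):
    print(depth, rsu_encoder_sizes(288, depth))
```

Under these assumptions, each extra level halves the smallest feature map again, so on a 288x288 input an RSU-9 block would already work on a 3x3 map at its deepest level. Deeper RSU blocks may therefore only pay off together with a larger input resolution.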

xuebinqin commented 4 years ago

Thanks for your interest. You can try the following ideas: (1) increase the number of filters in each layer or add more layers to the basic bn_relu_conv module, (2) remove some of the dense supervision, (3) try to build RSU-8 or RSU-9 blocks, (4) input resolution also matters, etc.

-- Xuebin Qin PhD Department of Computing Science University of Alberta, Edmonton, AB, Canada Homepage:https://webdocs.cs.ualberta.ca/~xuebin/
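
To get a feel for suggestion (1), the following sketch estimates how the parameter count of one conv + batch-norm + ReLU unit grows when the filter count is doubled. It assumes a 3x3 convolution with bias followed by batch normalization with a learnable scale and shift. The helper name `rebnconv_params` and the channel widths 64 and 128 are illustrative choices, not values taken from this thread.

```python
def rebnconv_params(in_ch, out_ch, kernel=3):
    """Count learnable parameters of one conv + batch-norm unit.

    Assumption: a kernel x kernel convolution with bias, followed by batch
    normalization with a learnable scale and shift per output channel.
    """
    conv = kernel * kernel * in_ch * out_ch + out_ch  # weights + bias
    bn = 2 * out_ch                                   # scale + shift
    return conv + bn


base = rebnconv_params(64, 64)    # illustrative baseline width
wide = rebnconv_params(128, 128)  # same unit with doubled width
print(base, wide, round(wide / base, 2))
```

Doubling both the input and output channels roughly quadruples the parameters of the unit, so widening every layer is a comparatively expensive way to add capacity, and memory use during training grows along with it.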

shgidi commented 4 years ago

@daniyalDE Hi Daniel, I'm interested in similar tasks as well. Why do you assume that the original model doesn't have the capacity for such a task? How did you determine that the model was "maxed out" on the 10K dataset it was trained on?

daniyalDE commented 4 years ago

@NathanUA Thanks for the feedback. One last thing regarding (4), the input resolution: from what I understand, the training dataloader always rescales the input images to 320x320. So if I want to train with higher-resolution images, should I change the rescale size to a higher value?

daniyalDE commented 4 years ago

@shgidi I have tried training on my dataset, and the loss/accuracy stalls after a while. This might be because the current model is not complex enough to learn the features of my data, which is very different from the datasets the model was originally trained on.

xuebinqin commented 4 years ago

Yes. At the same time, you may also need to modify the random_crop size correspondingly.
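
One way to keep the random-crop size in step with a larger rescale size is sketched below. The 320 rescale size comes from the question above. The 288 crop size is assumed here to be the default paired with it, so check your copy of the training script. The helper `matching_crop_size` and the rounding to a multiple of 32 are hypothetical conveniences, not part of the repository.

```python
def matching_crop_size(rescale_size, base_rescale=320, base_crop=288, multiple=32):
    """Scale the random-crop size in proportion to the rescale size.

    base_crop=288 is an assumed default; `multiple` is a hypothetical
    convenience so that repeated 2x poolings divide the crop evenly.
    """
    # Integer arithmetic keeps the base_crop / base_rescale ratio exactly.
    crop = rescale_size * base_crop // base_rescale
    # Round down to the nearest multiple, but never below one multiple.
    crop = (crop // multiple) * multiple
    return max(crop, multiple)


for size in (320, 512, 640):
    print(size, matching_crop_size(size))
```

With these assumptions, a rescale size of 512 pairs with a 448 crop and 640 pairs with a 576 crop. Both values would then be passed to the rescale and random-crop transforms in the training dataloader.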

EricLe-dev commented 4 years ago

Can you please tell me how to disable the side outputs? I tried disabling them by commenting them out, but it did not work. Thank you so much.

xuebinqin commented 4 years ago

The simplest way to disable them is to comment out lines 32 to 37 in u2net_train.py and change line 39 to: loss = loss0.
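
The change described above can be sketched in plain Python as a fused loss with a switch for the side outputs. This is only an illustration under stated assumptions: the real training script works on tensors with a framework BCE loss, while `bce`, `fused_loss`, and the `use_side_outputs` flag below are hypothetical names operating on flat lists of probabilities.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flat lists of probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(pred)


def fused_loss(outputs, target, use_side_outputs=True):
    """Return (loss0, total loss).

    outputs[0] is assumed to be the fused prediction; outputs[1:] are the
    side outputs. With use_side_outputs=False the total is just loss0.
    """
    loss0 = bce(outputs[0], target)
    if not use_side_outputs:
        return loss0, loss0
    side = sum(bce(o, target) for o in outputs[1:])
    return loss0, loss0 + side


target = [1, 0, 1, 0]
outputs = [[0.9, 0.1, 0.8, 0.2], [0.7, 0.3, 0.6, 0.4], [0.6, 0.4, 0.6, 0.4]]
print(fused_loss(outputs, target, use_side_outputs=True))
print(fused_loss(outputs, target, use_side_outputs=False))
```

Keeping the switch inside the loss function leaves the network untouched, so the side outputs are still computed in the forward pass and only their supervision terms are dropped.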

EricLe-dev commented 4 years ago

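Thank you so much for your reply. I have a very quick question, since I am a big fan of your previous work, BASNet. Does this share any similarity with the corresponding code in BASNet (lines 47 to 53 in basnet_train.py, https://github.com/NathanUA/BASNet/blob/6355e62eeb20fa7a033092e33b2d4d87e879b0cc/basnet_train.py#L47)?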

I also see this kind of behavior with BASNet. Your quick response is appreciated.

xuebinqin commented 4 years ago

Thanks for your interest. It is a bit different. You may have to keep both loss0 and loss1, because the prediction that loss0 is computed on is a refined version of the prediction that loss1 is computed on.
