xuebinqin / U-2-Net

The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."
Apache License 2.0

High resolution human segmentation #209

Open zichengf1997 opened 3 years ago

zichengf1997 commented 3 years ago

Thanks for your great work! I'm trying to train U2Net for human segmentation, but the images I need to run inference on are 1920*2560, and the provided pretrained model does not perform well. Could you give me some suggestions on the training strategy (for example, training image selection or changes to the network architecture)?

xuebinqin commented 3 years ago

You can try cascading two U2Nets (heavy + light). For the heavy U2Net, set the input resolution to 512x512. For the light U2Net (you can reduce the number of filters or layers further to build even smaller U2Nets), set the input resolution higher. Such a cascade will probably work better than a single-stage model.
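A minimal PyTorch sketch of the cascade idea follows; it is an illustration under stated assumptions, not the author's implementation. `TinySegNet` and `Cascade` are hypothetical names, and `TinySegNet` is only a stand-in for the heavy and light U2Net variants. The sketch assumes each network returns a single logit map, and that the light network receives the full-resolution image concatenated with the upsampled coarse mask (one plausible way to wire the two stages; the thread does not specify it).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinySegNet(nn.Module):
    """Hypothetical stand-in for a U2Net variant: (N, C, H, W) -> 1-channel logits."""

    def __init__(self, in_ch: int, width: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(width, 1, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return self.body(x)


class Cascade(nn.Module):
    """Heavy net on a downscaled image, light net on the full-resolution image."""

    def __init__(self, heavy: nn.Module, light: nn.Module, coarse_size: int = 512):
        super().__init__()
        self.heavy = heavy
        self.light = light
        self.coarse_size = coarse_size

    def forward(self, image):
        # Stage 1: predict a coarse mask at the heavy network's input resolution.
        small = F.interpolate(
            image,
            size=(self.coarse_size, self.coarse_size),
            mode="bilinear",
            align_corners=False,
        )
        coarse = torch.sigmoid(self.heavy(small))

        # Bring the coarse mask back to the original resolution.
        coarse_up = F.interpolate(
            coarse, size=image.shape[-2:], mode="bilinear", align_corners=False
        )

        # Stage 2 (assumed wiring): the light network sees RGB + coarse mask.
        refined = torch.sigmoid(self.light(torch.cat([image, coarse_up], dim=1)))
        return coarse_up, refined


if __name__ == "__main__":
    # Small sizes so the example runs quickly on CPU.
    model = Cascade(TinySegNet(3, 16), TinySegNet(4, 8), coarse_size=64)
    with torch.no_grad():
        coarse, refined = model(torch.rand(1, 3, 96, 128))
    print(coarse.shape, refined.shape)
```

With real models, the light network's first convolution would need four input channels under this wiring, and a small wrapper would be needed if a network returns more than one output.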

-- Xuebin Qin PhD Department of Computing Science University of Alberta, Edmonton, AB, Canada Homepage:https://webdocs.cs.ualberta.ca/~xuebin/

zichengf1997 commented 3 years ago

Thanks for your suggestion! Does "cascade" mean using the heavy U2Net's output as the light U2Net's input?

xuebinqin commented 3 years ago

Yes, you can fix (freeze) the heavy U2Net and train only the light one. Whether that helps depends on your results: if edge accuracy is what you are unhappy with, it will probably help. There are different strategies you can try, but I can't give specific details without seeing the failure cases.
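As a rough sketch of keeping the heavy network fixed while training the light one in PyTorch: the snippet below uses single convolutions as hypothetical stand-ins for the two U2Nets, and the helper `freeze`, the function `train_step`, the loss, and the optimizer settings are all assumptions made for illustration. Resizing between the two stages is omitted to keep the example short.

```python
import torch
import torch.nn as nn


def freeze(module: nn.Module) -> nn.Module:
    """Disable gradients and switch to eval mode so normalization statistics stay fixed."""
    for p in module.parameters():
        p.requires_grad_(False)
    return module.eval()


# Hypothetical stand-ins for the heavy (pretrained) and light (trainable) networks.
heavy = freeze(nn.Conv2d(3, 1, kernel_size=3, padding=1))
light = nn.Conv2d(4, 1, kernel_size=3, padding=1)

# Only the light network's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(light.parameters(), lr=1e-3)
criterion = nn.BCEWithLogitsLoss()


def train_step(image: torch.Tensor, target: torch.Tensor) -> float:
    # The heavy network only produces a coarse mask; no graph is built for it.
    with torch.no_grad():
        coarse = torch.sigmoid(heavy(image))

    # The light network refines the coarse mask given the image (assumed wiring).
    logits = light(torch.cat([image, coarse], dim=1))
    loss = criterion(logits, target)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    image = torch.rand(2, 3, 32, 32)
    target = (torch.rand(2, 1, 32, 32) > 0.5).float()
    print(train_step(image, target))
```

Putting the frozen network in eval mode matters when it contains batch normalization or dropout layers, since disabling gradients alone does not stop running statistics from updating.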

zichengf1997 commented 3 years ago

Thanks! I'll try several cascade models and report the corresponding results.

CeciliaPYY commented 3 years ago

It sounds like the light one is used as a refinement module, much like the residual refine module, isn't it?

CeciliaPYY commented 3 years ago

I'm wondering about "fix the heavy u2net": in the model/training script you provided, the input size is 288. If the input is set to 512, can one get good results from a fixed model that was trained at 288 but is run at 512?