dbpprt / u-2-net-portrait

The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."
Apache License 2.0

Question about your choice of loss #5

Open Sparknzz opened 3 years ago

Sparknzz commented 3 years ago

Hi, thanks for the really good idea. I just want to know why you chose L1 as your training loss instead of BCE. Did you run any experiments comparing cross-entropy vs. MSE?

dbpprt commented 3 years ago

I did some preliminary testing and explored the effect of different loss functions. It is important to highlight that I'm using alpha mattes, not ordinary segmentation masks. L1 performed quite well, hence I kept it. However, the observed differences were somewhat negligible if I remember correctly (I don't have the TensorBoard logs available right now).
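
For readers weighing the options, here is a minimal, hedged sketch of how the three losses would look for an alpha-matte target in PyTorch. This is not the repository's actual training code; tensor shapes and names are illustrative only.

```python
import torch
import torch.nn.functional as F

# Illustrative shapes only: a batch of single-channel predictions (logits)
# and ground-truth alpha mattes with continuous values in [0, 1].
logits = torch.randn(4, 1, 320, 320)
alpha = torch.rand(4, 1, 320, 320)

# L1 regression on the activated prediction (the loss reportedly kept here).
l1 = F.l1_loss(torch.sigmoid(logits), alpha)

# BCE treats each pixel of the matte as a soft label.
bce = F.binary_cross_entropy_with_logits(logits, alpha)

# MSE penalizes large errors more heavily than L1.
mse = F.mse_loss(torch.sigmoid(logits), alpha)
```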

Sparknzz commented 3 years ago

Thanks for the reply. I am currently working on an image editing plugin, and I trained a model based on your idea (using MSE loss and the supervise.ly dataset only, plus a Gaussian blur on the segmentation edges). But the performance is far behind the remove.bg website. I think the biggest factor is the lack of public datasets. Did you find any other good resources to improve the matting or segmentation performance? Thanks in advance.
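
As an aside, the "Gaussian blur on the segmentation edges" mentioned above can be implemented by blurring the hard mask into a soft regression target. A rough OpenCV sketch follows; the file name and kernel size are assumptions, not values from this thread.

```python
import cv2

# Hard binary mask (0/255), e.g. exported from the supervise.ly annotations;
# the file name is only a placeholder.
mask = cv2.imread("person_mask.png", cv2.IMREAD_GRAYSCALE)

# Blurring turns the sharp 0/255 boundary into a soft ramp, so a regression
# loss (e.g. MSE) gets a gradient along the object edge instead of a step.
soft_target = cv2.GaussianBlur(mask, (15, 15), 0) / 255.0
```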

dbpprt commented 3 years ago

It is important to emphasize that I used a synthesized dataset which consists of higher quality alpha mattes. Datasets like the original supervise.ly person dataset do not contain fine detail segmentations.

dbpprt commented 3 years ago

Also, I'm estimating trimaps using morphological operations (erosion & dilation) and computing an L1 loss masked by the estimated unknown area, so that this region of the image can be assigned a different weight.
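
A minimal sketch of that idea, assuming the trimap is derived from the matte with erosion/dilation and the unknown band is simply up-weighted in the L1 term. The kernel size and weight are illustrative, not the values used in this repository.

```python
import cv2
import numpy as np
import torch

def estimate_trimap(alpha: np.ndarray, kernel_size: int = 15) -> np.ndarray:
    """Rough trimap from an alpha matte: 0 = background, 128 = unknown, 255 = foreground."""
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    fg = (alpha > 0.5).astype(np.uint8)
    eroded = cv2.erode(fg, kernel)    # shrink: pixels that are certainly foreground
    dilated = cv2.dilate(fg, kernel)  # grow: anything outside is certainly background
    trimap = np.full_like(fg, 128)    # the band in between stays unknown
    trimap[eroded == 1] = 255
    trimap[dilated == 0] = 0
    return trimap

def weighted_l1(pred: torch.Tensor, alpha: torch.Tensor,
                unknown: torch.Tensor, w_unknown: float = 3.0) -> torch.Tensor:
    """L1 loss that weights the unknown (edge) band of the trimap more heavily."""
    weights = torch.where(unknown.bool(),
                          torch.full_like(pred, w_unknown),
                          torch.ones_like(pred))
    return (weights * (pred - alpha).abs()).mean()
```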

Sparknzz commented 3 years ago

Totally got what you mean, and very clear explanation, thanks. But if you play around with remove.bg, the results are quite a bit better. I am looking forward to doing something like that. The usual methods (e.g. u2net, deeplab, ...) don't seem to be a complete solution.

xuebinqin commented 3 years ago

To achieve results like remove.bg, larger and wider datasets have to be used and more engineering work has to be done. We can't expect a model trained on a public dataset to compete with commercialized products. We are trying to improve the performance to get better results, and our new dataset and model are on the way. Please keep following the updates. Thanks for your interest.

Sparknzz commented 3 years ago

@xuebinqin Hi Qin, it's great to see you here. By the way, the supervise.ly dataset only contains the human body; all accessories and belongings are excluded, which is not good enough for the remove.bg task. I am also creating a new dataset and will keep following your thread.

xuebinqin commented 3 years ago

Yes, true. The supervise.ly dataset is more accurate than COCO but still misses some structures and finer details.

Glad to hear that you are creating new datasets. Yes, the dataset contributes 80% of a model's performance, as Andrew Ng recently mentioned. We would like people to use the SOD models in wider binary-class segmentation tasks rather than limiting them to SOD. We are also open to different ideas and suggestions; please feel free to let us know if you have any. We highly appreciate your interest and comments.

Sparknzz commented 3 years ago

Hi, as you said, "To achieve results like remove.bg, larger and wider datasets have to be used and more engineering work has to be done." May I ask what kind of engineering work you mean here?