xuebinqin / U-2-Net

The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."
Apache License 2.0
8.59k stars 1.48k forks

How to capture more fine details #47

Open yxt132 opened 4 years ago

yxt132 commented 4 years ago

First of all, great work! The model is able to capture some fine details such as human hair. However, it cannot capture most of the finer details, such as little holes and gaps between hairs. I tried adding the IoU loss function, and the model becomes very confident when predicting object edges; however, the little holes/gaps in human hair are still not captured, and a lumped area of solid foreground is predicted instead.

Do you have any suggestions for modifying the model to capture more of the objects' fine details?

bluesky314 commented 4 years ago

IoU loss does not make edges sharper; it encourages masks to be smooth and helps with class imbalance. I am also working on the same problem, where my targets have very fine structures like hairs and holes. I haven't tried it myself, but you could try hard negative mining. I found that increasing the number of such examples (or oversampling them) and training for a very long time helped. Let me know if you want to discuss further.
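The IoU loss discussed above is usually implemented as a soft (differentiable) IoU over the predicted probability map. A minimal NumPy sketch of the idea follows; the actual training code would operate on PyTorch tensors, and combining it with the BCE loss that U-2-Net already uses is an assumption, not something stated in this thread:

```python
import numpy as np

def soft_iou_loss(pred, target, eps=1e-7):
    """Soft IoU loss over probability maps.

    pred:   predicted foreground probabilities in [0, 1]
    target: binary ground-truth mask
    """
    inter = (pred * target).sum()
    union = (pred + target - pred * target).sum()
    return 1.0 - inter / (union + eps)

gt = np.array([[0, 1], [1, 1]], dtype=float)
print(soft_iou_loss(gt, gt))      # ~0.0 (perfect prediction)
print(soft_iou_loss(1 - gt, gt))  # ~1.0 (disjoint prediction)
```

Because the loss is dominated by the overall region overlap, small holes contribute almost nothing to it, which is consistent with the observation above that IoU loss makes predictions confident at edges without recovering fine gaps.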

yxt132 commented 4 years ago

Thanks for your response @bluesky314. I guess we could improve the results for fine details if we used a larger input size, but that would require more GPU memory. Maybe we could use more efficient conv kernels.

bluesky314 commented 4 years ago

What kind of efficient kernels do you have in mind?

CeciliaPYY commented 3 years ago

> Thanks for your response @bluesky314. I guess we could improve the results for fine details if we used a larger input size, but that would require more GPU memory. Maybe we could use more efficient conv kernels.

Have you guys tried a larger input size yet? I'm dealing with high-resolution images, and with 320 as the input size the final result has jagged edges from resizing the mask back up. So I'm wondering whether a larger input size could help, and whether it would introduce other problems.

xuebinqin commented 3 years ago

Directly using a large input size (e.g. 640x640) won't improve (and may even degrade) the results. You can try U2Net + CascadePSP or U2Net + PyMatting, as used in the python lib https://pypi.org/project/rembg/.
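For the U2Net + PyMatting route, the usual glue step is converting U2Net's soft saliency map into a trimap that the matting solver can refine; only the "unknown" band is re-estimated against the original image, which is where fine hair detail is recovered. A minimal sketch of that step, where the 0.1/0.9 thresholds are assumptions to tune, not values from this thread:

```python
import numpy as np

def mask_to_trimap(prob, fg_thresh=0.9, bg_thresh=0.1):
    """Turn a soft saliency map (values in [0, 1]) into a trimap:
    1.0 = definite foreground, 0.0 = definite background,
    0.5 = unknown band left for the matting solver to resolve.
    """
    trimap = np.full(prob.shape, 0.5, dtype=np.float64)
    trimap[prob >= fg_thresh] = 1.0
    trimap[prob <= bg_thresh] = 0.0
    return trimap

prob = np.array([[0.95, 0.5],
                 [0.02, 0.8]])
trimap = mask_to_trimap(prob)
# 0.95 -> foreground (1.0), 0.02 -> background (0.0),
# 0.5 and 0.8 fall in the unknown band (0.5)
```

The trimap and the original high-resolution image would then be passed to a matting method (e.g. PyMatting's closed-form matting) to produce the final alpha.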


--
Xuebin Qin, PhD
Department of Computing Science
University of Alberta, Edmonton, AB, Canada
Homepage: https://webdocs.cs.ualberta.ca/~xuebin/