DonaldTsang opened this issue 4 years ago
Good point : ) That is actually how the dataset was created: first use edges to separate out the different regions, then manually correct the edges, because edge detection can be very inaccurate when the background color is similar to the character's color.
This might sound weird, but those are exactly the questions I raised in another project: https://github.com/KichangKim/DeepDanbooru/issues/5. How do you "manually correct" the data? And how does Google reCAPTCHA do it (if we need to resort to crowdsourcing)?
First I generated edges from the image. I then wrote an Angular-based HTML UI to select regions bounded by those edges (using the flood-fill algorithm); if I see the fill overflow out of a region, I fix the edge layer until there is no more overflow. Then I just save the masked regions as a separate image -- that serves as the segmentation ground truth. Does that make sense?
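A minimal sketch of that edge-bounded fill step, assuming OpenCV and numpy; the function name, seed handling, and tolerance values are illustrative, not the actual UI code:

```python
import cv2
import numpy as np

def region_mask_from_edges(image_bgr, edges, seed_xy):
    """Flood-fill from a clicked seed point, with the edge layer as a barrier.

    image_bgr : HxWx3 uint8 image
    edges     : HxW uint8 edge map (non-zero = edge pixel)
    seed_xy   : (x, y) click position inside the desired region
    """
    h, w = edges.shape
    # OpenCV's floodFill mask must be 2 px larger than the image;
    # non-zero mask pixels act as barriers the fill cannot cross.
    mask = np.zeros((h + 2, w + 2), np.uint8)
    mask[1:-1, 1:-1] = (edges > 0).astype(np.uint8)

    # MASK_ONLY leaves the image untouched; the fill writes 255 into the mask.
    flags = 4 | cv2.FLOODFILL_MASK_ONLY | (255 << 8)
    cv2.floodFill(image_bgr, mask, seed_xy, (0, 0, 0),
                  loDiff=(10, 10, 10), upDiff=(10, 10, 10), flags=flags)

    # Pixels the fill set to 255 are the selected region.
    return np.where(mask[1:-1, 1:-1] == 255, 255, 0).astype(np.uint8)

# Example usage (the file names and seed are made up):
img = cv2.imread("sample.png")
edges = cv2.Canny(img, 100, 200)   # or a hand-corrected edge layer
region = region_mask_from_edges(img, edges, (120, 200))
cv2.imwrite("region_mask.png", region)
```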
I would not say that I can follow completely... Is flood fill similar to MS Paint's bucket tool, except that instead of overwriting pixels it selects a region? Or in other words, is it like Photoshop's Magic Selection tool, heavily simplified? If that is exactly what you are doing, how can I replicate such a system at scale for a larger dataset?
Exactly like the Magic Selection tool; efficient implementations are easy to find online. For a larger dataset: in my experience you will not get high-quality data from an untrained crowd, since segmentation is significantly harder than a captcha. I'd suggest that you 1. get funding for it or work with a company, and 2. double- and triple-check whether you REALLY need a dataset on the order of 10k or 100k labels. Are you doing this just for fun? Can you get away with data augmentation, which is much simpler?
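If augmentation is enough, a rough sketch of jointly augmenting an image and its mask, assuming numpy and OpenCV; the specific transforms and ranges are illustrative:

```python
import cv2
import numpy as np

def augment(image, mask, rng=None):
    rng = rng or np.random.default_rng()

    # Horizontal flip with probability 0.5, applied to both.
    if rng.random() < 0.5:
        image, mask = cv2.flip(image, 1), cv2.flip(mask, 1)

    # Small random rotation, applied to both; nearest-neighbor for the
    # mask so label values are never blended.
    h, w = mask.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2, h / 2), rng.uniform(-10, 10), 1.0)
    image = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_LINEAR)
    mask = cv2.warpAffine(mask, M, (w, h), flags=cv2.INTER_NEAREST)

    # Brightness jitter on the image only; the mask must stay exact.
    image = np.clip(image.astype(np.int16) + int(rng.integers(-25, 26)), 0, 255)
    return image.astype(np.uint8), mask
```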
@jerryli27 Unfortunately, pure image tagging without regions is something DeepDanbooru already does, with "questionable"-to-"great" results, but we want to go beyond that, and image segmentation might give better insight into how we can improve image tagging. The dataset we are currently using is based on https://www.gwern.net/Danbooru2019, which has no segmentation; perhaps we can leverage the DeepDanbooru results to ease into generating segmented data.
It is kind of for fun, but I also hope it can serve as a mental exercise for me going forward.
Makes sense : ) If I may ask, what application are you targeting with segmentation that you cannot do with tagging alone?
It's more that segmentation is a means to improve automated or machine-aided tagging through deep learning, and to discover patterns within the tagging knowledge graph itself through the structure and overlap of segmented regions.
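For instance (a purely hypothetical sketch, not an existing tool), one could accumulate per-tag mask overlaps across images into a tag-relation graph:

```python
import numpy as np
from itertools import combinations
from collections import defaultdict

def accumulate_overlaps(tag_masks, graph=None):
    """tag_masks: dict of tag -> HxW boolean mask for one image.
    graph: dict of (tag_a, tag_b) -> list of IoU values across images."""
    graph = defaultdict(list) if graph is None else graph
    for a, b in combinations(sorted(tag_masks), 2):
        inter = np.logical_and(tag_masks[a], tag_masks[b]).sum()
        union = np.logical_or(tag_masks[a], tag_masks[b]).sum()
        if union > 0:
            # High average IoU suggests the two tags describe the
            # same region (e.g. a garment and its color).
            graph[(a, b)].append(inter / union)
    return graph
```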
Technology is moving fast: https://github.com/KichangKim/DeepDanbooru/issues/5#issuecomment-820300209
This may sound weird, but is it possible to use the edges of regions to refine figure segmentation and make it more accurate?
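One speculative way to do this (a sketch under assumptions, not a method from this repo): re-grow a coarse mask from its confident interior, with detected edges acting as barriers, so the refined boundary snaps to image edges rather than to the coarse prediction:

```python
import cv2
import numpy as np

def refine_mask_with_edges(image_bgr, coarse_mask, steps=10):
    """coarse_mask: HxW uint8 mask (0 or 255), e.g. from a segmentation net."""
    edges = cv2.Canny(image_bgr, 100, 200)

    big = np.ones((7, 7), np.uint8)
    core = cv2.erode(coarse_mask, big)    # confident interior of the mask
    limit = cv2.dilate(coarse_mask, big)  # don't grow far past the coarse mask

    # 4-connected growth so thin edge curves are harder to cross.
    cross = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))
    grow = core.copy()
    for _ in range(steps):
        grow = cv2.dilate(grow, cross)
        grow[edges > 0] = 0    # edges block the growth
        grow[limit == 0] = 0   # stay near the original mask
    return np.maximum(grow, core)
```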