Addressing issues regarding this system

DonaldTsang commented 4 years ago

It would be easy to break such a system and cause mis-tagging by using these libraries, as a demonstration of the weakness of using neural networks (NN) for automated image tagging.

Here are some of my proposals to make it more resilient:

- image augmentation to prevent overfitting
- usage of multiple models for the same task https://arxiv.org/abs/1809.00065 and maybe add Inception or others to the system
- de-noising the image using something like

AND THEN THERE IS THIS (claiming that most mitigation strategies fail): https://github.com/anishathalye/obfuscated-gradients

I have talked about something similar in https://github.com/halcy/DeepDanbooruActivationMaps/issues/3
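
To make the "multiple models for the same task" proposal concrete, here is a minimal sketch of averaging per-tag probabilities across several models, so that one fooled model cannot push a wrong tag through on its own. The function name `ensemble_tags`, the 0.5 threshold, and the example scores are all assumptions for illustration, not part of DeepDanbooru.

```python
from statistics import mean


def ensemble_tags(per_model_scores, threshold=0.5):
    """Average per-tag probabilities across models and keep confident tags.

    per_model_scores: list of dicts mapping tag -> probability, one per model.
    A tag missing from a model's output counts as probability 0.0.
    """
    all_tags = set().union(*per_model_scores)
    averaged = {
        tag: mean(scores.get(tag, 0.0) for scores in per_model_scores)
        for tag in all_tags
    }
    return {tag: p for tag, p in averaged.items() if p >= threshold}


# Hypothetical outputs from three models for the same image.
# model_a stands in for a model that was tricked into a confident "hat".
model_a = {"1girl": 0.98, "hat": 0.90, "cat": 0.10}
model_b = {"1girl": 0.95, "hat": 0.20, "cat": 0.05}
model_c = {"1girl": 0.97, "hat": 0.15}

result = ensemble_tags([model_a, model_b, model_c])
print(sorted(result))
```

Plain averaging is only one option; as the obfuscated-gradients link above suggests, an ensemble alone should not be assumed to stop a determined attacker.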

DonaldTsang commented 4 years ago

Some useful information regarding the semantic segmentation of images: https://github.com/mrgloom/awesome-semantic-segmentation

Weird problems that will arise from using the repos within verbatim:

- How do we deal with tag synonyms and tag subsets? Do we create a system in which segmented regions can have multiple tags?
- What about character tags vs facial/clothing component tags? How do we relate them to each other in a logical manner? Hierarchies?
- What about segmented regions that are too small? Would they get picked up by DD 1.0 but not by a DD+SS system?
- How many layers do we need at most? 32 (since that is the maximum number of tags per image in general)? 64/128/256?
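
One possible way to think about the first three questions is sketched below: each segmented region carries a set of tags, synonyms collapse to a canonical tag, narrower tags imply their broader parents, and regions under a size cutoff are dropped. The tables `SYNONYMS` and `PARENTS`, the `Region` class, and the 64-pixel cutoff are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass, field

# Hypothetical lookup tables; real ones would come from the tag database.
SYNONYMS = {"kitty": "cat", "neko": "cat"}
# Child tag -> broader parent tag. Assumed to contain no cycles.
PARENTS = {"cat_ears": "animal_ears", "animal_ears": "ears"}


def canonical(tag):
    """Map a synonym to its canonical tag; unknown tags pass through."""
    return SYNONYMS.get(tag, tag)


def with_ancestors(tag):
    """Return the canonical tag followed by every broader tag it implies."""
    tag = canonical(tag)
    chain = [tag]
    while tag in PARENTS:
        tag = PARENTS[tag]
        chain.append(tag)
    return chain


@dataclass
class Region:
    """A segmented region that may carry several tags at once."""
    pixel_count: int
    tags: set = field(default_factory=set)

    def add_tag(self, tag):
        self.tags.update(with_ancestors(tag))


def keep_large_regions(regions, min_pixels):
    """Drop regions that are too small to segment reliably."""
    return [r for r in regions if r.pixel_count >= min_pixels]


ears = Region(pixel_count=1200)
ears.add_tag("cat_ears")
speck = Region(pixel_count=12)
speck.add_tag("neko")
kept = keep_large_regions([ears, speck], min_pixels=64)
print(sorted(ears.tags), len(kept))
```

The size filter also shows the concern raised above: the small region's tag disappears from the segmentation output even though a whole-image tagger might still report it.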

DonaldTsang commented 4 years ago

Some ideas on how to implement a Semantic Segmentation dataset/model, "ShoujoSegment":

The initial dataset phase
- Gather a list of images with strong heatmap confidence
- Use reCAPTCHA's 3x3 5F-3T-1U test to refine the borders (remember to augment and noise them)
- Collect results from volunteers and address weighting and credibility issues
The Semantic Segmentation training phase
- Create the system model (or better yet multiple models)
- Use the collected data to train the system
- Optimize the system for speed and accuracy, particularly with regard to ensembles
The data refinement phase
- Increase the scope of images used
- Use reCAPTCHA's 3x3 5F-3T-1U test to refine the borders (remember to augment and noise them)
- Use volunteers' results to refine the Semantic Segmentation
Other things that can be done outside of this loop
- Create micro-models (that is, simplified versions of the main model) for mobile systems
- Integrate this system into a new social media network for community contributions
- Use the "ShoujoSegment" system to refine DeepDanbooru and vice versa
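
As a rough sketch of the "weighting and credibility" step, the snippet below combines volunteer answers on a 3x3 grid with a credibility-weighted majority vote. It assumes each answer is simply nine booleans (cell selected or not); the 5F-3T-1U scheme itself is not modeled, and the volunteer names and weights are made up.

```python
def aggregate_votes(votes, credibility, default_weight=1.0):
    """Credibility-weighted majority vote for each cell of a 3x3 grid.

    votes: dict mapping volunteer -> list of 9 booleans (True = cell selected).
    credibility: dict mapping volunteer -> non-negative weight.
    Returns a list of 9 booleans; a cell is accepted when more than half
    of the total weight selected it.
    """
    n_cells = 9
    totals = [0.0] * n_cells
    weight_sum = 0.0
    for volunteer, cells in votes.items():
        if len(cells) != n_cells:
            raise ValueError(f"{volunteer} did not answer all {n_cells} cells")
        weight = credibility.get(volunteer, default_weight)
        weight_sum += weight
        for i, selected in enumerate(cells):
            if selected:
                totals[i] += weight
    if weight_sum == 0:
        return [False] * n_cells
    return [total / weight_sum > 0.5 for total in totals]


def grid(*selected):
    """Helper: build a 9-cell answer with the given cell indices selected."""
    return [i in selected for i in range(9)]


# Hypothetical volunteers; alice has a longer track record, so she weighs more.
votes = {"alice": grid(0, 1), "bob": grid(1, 2), "carol": grid(2)}
credibility = {"alice": 3.0, "bob": 1.0, "carol": 1.0}

consensus = aggregate_votes(votes, credibility)
print(consensus[:3])
```

How the credibility weights get assigned and updated is the hard part and is left open here, as it is in the list above.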

This concept would be applied as a "Human in the Loop" or "Active Learning" system. A good example would be:

If the Semantic Segmentation is crowdsourced, these can help: http://ilpubs.stanford.edu:8090/1161/1/main.pdf and http://ceur-ws.org/Vol-2173/paper10.pdf
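
For the "Active Learning" side, a common heuristic is uncertainty sampling: send volunteers the images the model is least sure about. The sketch below is one simple variant under assumed inputs (per-image tag probabilities); the scoring function, image names, and review budget are illustrative and not taken from the linked papers.

```python
def uncertainty(scores):
    """Mean closeness of tag probabilities to 0.5.

    Returns 1.0 when every tag sits exactly at 0.5 (maximally uncertain)
    and 0.0 when every tag is at 0.0 or 1.0 (fully confident).
    """
    if not scores:
        return 0.0
    return sum(1.0 - 2.0 * abs(p - 0.5) for p in scores.values()) / len(scores)


def pick_for_review(predictions, budget):
    """Return the names of the `budget` most uncertain images."""
    ranked = sorted(
        predictions,
        key=lambda name: uncertainty(predictions[name]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for three images.
predictions = {
    "img_001": {"hat": 0.99, "cat": 0.02},
    "img_002": {"hat": 0.55, "cat": 0.48},
    "img_003": {"hat": 0.80, "cat": 0.30},
}

queue = pick_for_review(predictions, budget=2)
print(queue)
```

Images the model is already confident about (like `img_001` here) stay out of the queue, which keeps volunteer effort focused where a human answer changes the most.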

DonaldTsang commented 3 years ago

I am just going to put this here, for those who want to go from label to table.

