danbooru / danbooru

A taggable image board written in Rails.

Automated tagging #3021

Closed r888888888 closed 6 years ago

r888888888 commented 7 years ago

A new feature to analyze an image and provide a list of suggested tags based on the visual qualities of the image, similar to the classification efforts in ImageNet.

Researchers and hobbyists have approached me in the past about using the Danbooru database for such projects. You can find a fairly good implementation at illustration2vec. It works surprisingly well, although it is built on Caffe.

The biggest obstacle is taking such a system and making it online (that is, constantly updating its database with new content).

I think taking Google's existing Inception model and retraining the final layer may produce adequate results. Tweaking the models is kind of a black art, and it's useful to compare any changes with a baseline.
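
As a rough sketch of what "retraining the final layer" means: the pretrained network is frozen and used only as a feature extractor, and a new layer with one sigmoid output per tag is fit on top of its features. The example below is plain Python with made-up 3-dimensional feature vectors standing in for real Inception activations. The tag names, feature values, learning rate, and epoch count are all assumptions for illustration, not anything taken from an actual Danbooru model.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_final_layer(features, labels, num_tags, lr=0.5, epochs=1000, seed=0):
    """Fit one independent logistic unit per tag on top of frozen features."""
    rng = random.Random(seed)
    dim = len(features[0])
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(dim)] for _ in range(num_tags)]
    biases = [0.0] * num_tags
    for _ in range(epochs):
        for x, y in zip(features, labels):
            for t in range(num_tags):
                logit = sum(w * xi for w, xi in zip(weights[t], x)) + biases[t]
                grad = sigmoid(logit) - y[t]  # d(binary cross-entropy)/d(logit)
                for i in range(dim):
                    weights[t][i] -= lr * grad * x[i]
                biases[t] -= lr * grad
    return weights, biases

def predict(weights, biases, x):
    """Return one probability per tag for a single feature vector."""
    return [sigmoid(sum(w * xi for w, xi in zip(wt, x)) + b)
            for wt, b in zip(weights, biases)]

# Hypothetical data: each row of FEATURES is what the frozen network would
# output for one image; each row of LABELS marks which TAGS apply to it.
TAGS = ["1girl", "outdoors"]
FEATURES = [
    [1.0, 0.0, 0.2],
    [0.9, 0.1, 0.9],
    [0.1, 1.0, 0.8],
    [0.0, 0.9, 0.1],
]
LABELS = [
    [1, 0],
    [1, 1],
    [0, 1],
    [0, 0],
]

weights, biases = train_final_layer(FEATURES, LABELS, len(TAGS))
for x in FEATURES:
    probs = predict(weights, biases, x)
    print([tag for tag, p in zip(TAGS, probs) if p >= 0.5])
```

Because only the small final layer is trained, this step is cheap compared with training the whole network, which makes it a convenient baseline to measure later changes against.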

I will have a dedicated GPU server with a GTX 1080 for training purposes. I could conceivably add another if this turns out to be a bottleneck.

I am creating this ticket to help aggregate any information.

Type-kun commented 7 years ago

> Researchers and hobbyists have approached me in the past about using the Danbooru database for such projects. You can find a fairly good implementation at illustration2vec. It works surprisingly well, although it is built on Caffe.

Well, holy crap. I uploaded an image that IQDB was unable to find, and about 15 of 20 general tags were correct (though it didn't guess the character or copyright, or I would be forced to believe that the future has actually come). Maybe their implementation can be moved in-house as a service, similar to how we incorporated IQDB?

The general problem is that such a system needs an actual image file to work, and at upload time we don't have the file itself, whether a source link was specified or the file was uploaded directly. That makes the feature unlikely to work at upload time, where it's needed the most.

kittey commented 7 years ago

> The general problem is that such a system needs an actual image file to work, and at upload time we don't have the file itself, whether a source link was specified or the file was uploaded directly. That makes the feature unlikely to work at upload time, where it's needed the most.

Doesn’t the “find similar” feature already need the file? If that works, just pass the already present file to the tag-suggester. Basically just make the pre-upload step (“find similar” and tag suggestion) mandatory. Users aren’t supposed to upload anything without using “find similar” first anyway, right?

I’m amazed that researchers and hobbyists are interested in Danbooru despite its infamously high amount of explicit images, to put it lightly. Doesn’t that attest that we’re one of the best-tagged image galleries? =D

r888888888 commented 7 years ago

It'd be tricky to just reuse illust2vec because their database is static (there's no process for adding new tags and pruning old ones). It's also based on the Caffe framework which, while extremely good, I don't think compares to TensorFlow in terms of popularity and active research. There's a lot of potential beyond just classification. It's conceivable to come up with image captioning, reporting on common elements that don't exist as tags yet, as well as more esoteric applications like style transfer and image generation.
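
One way to picture the "adding new tags and pruning old ones" step is as a periodic vocabulary refresh driven by recent tag usage. The sketch below is only an assumption about how that could look, not how illustration2vec or Danbooru actually works; the `min_posts` threshold, the function name, and the sample counts are invented for illustration.

```python
def refresh_vocabulary(current_vocab, recent_counts, min_posts=50):
    """Return (new_vocab, added, pruned) based on recent usage of each tag.

    current_vocab: tags the classifier currently knows about.
    recent_counts: mapping of tag -> number of recent posts using it.
    min_posts:     hypothetical cutoff below which a tag is not trained on.
    """
    eligible = {tag for tag, n in recent_counts.items() if n >= min_posts}
    known = set(current_vocab)
    added = sorted(eligible - known)    # popular tags the model has not seen
    pruned = sorted(known - eligible)   # tags that fell out of use
    return sorted(eligible), added, pruned

vocab = ["1girl", "long_hair", "old_meme"]
counts = {"1girl": 9000, "long_hair": 4000, "old_meme": 3, "new_series": 120}

new_vocab, added, pruned = refresh_vocabulary(vocab, counts)
print(new_vocab)  # ['1girl', 'long_hair', 'new_series']
print(added)      # ['new_series']
print(pruned)     # ['old_meme']
```

Any change to the vocabulary changes the shape of the classifier's output layer, so a refresh like this would have to be followed by at least a partial retrain.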

Downloading the file isn't a huge problem. We already do it to an extent for IQDB. Classifying an image isn't intensive; it's the training that's computationally expensive.
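
A minimal sketch of that flow, assuming the file is fetched once and the same bytes are handed to both the similarity search and the tag suggester. Every name here (`pre_upload_check`, `fetch`, `find_similar`, `suggest_tags`) is hypothetical, and the stand-in functions return canned values rather than calling IQDB or a real classifier.

```python
from typing import Callable, Dict, List

def pre_upload_check(
    source: str,
    fetch: Callable[[str], bytes],
    find_similar: Callable[[bytes], List[int]],
    suggest_tags: Callable[[bytes], List[str]],
) -> Dict[str, list]:
    """Download the image once and pass the same bytes to both services."""
    image = fetch(source)
    return {
        "similar_posts": find_similar(image),
        "suggested_tags": suggest_tags(image),
    }

# Stand-ins so the sketch runs without any network access.
def fake_fetch(url: str) -> bytes:
    return b"fake-image-bytes"

def fake_find_similar(image: bytes) -> List[int]:
    return [101, 202]

def fake_suggest_tags(image: bytes) -> List[str]:
    return ["1girl", "solo"]

result = pre_upload_check(
    "https://example.com/image.png",
    fake_fetch,
    fake_find_similar,
    fake_suggest_tags,
)
print(result)
```

Passing the services in as callables keeps the download step independent of whichever similarity or classification backend ends up being used.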

r888888888 commented 7 years ago

I finally have a working proof of concept at https://benten.donmai.us/query

The list of character tags it's been trained on. It's weighted towards the most recent six months so some tags may be underrepresented.

r888888888 commented 7 years ago

CCS integration is in #3228 and deployed on testbooru. It works with the bookmarklet. Seems to work fairly well.

evazion commented 7 years ago

> It's conceivable to come up with image captioning, reporting on common elements that don't exist as tags yet, as well as more esoteric applications like style transfer and image generation.

Here's a tool that does automatic character generation: http://make.girls.moe/. They cite Danbooru in their technical report: http://make.girls.moe/technical_report.pdf. Not really useful for anything, but interesting nonetheless.

r888888888 commented 7 years ago

I noticed that. A neat side project would be some sort of app to facilitate isolating features by clicking and dragging. A big problem with training is that the current tag system isn't really optimized for training on single features. A post might be tagged with two characters even though one of them isn't a significant component of the image.
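
To make the multi-character problem concrete, here is one crude workaround, sketched under the assumption that posts are available as dictionaries with an `id` and a list of `tags`: keep only posts that carry exactly one known character tag when building a character training set. This is not something the proof of concept is known to do, and the character names and post data are placeholders.

```python
def single_character_posts(posts, character_tags):
    """Return (post_id, character) pairs for posts with exactly one character tag."""
    character_tags = set(character_tags)
    kept = []
    for post in posts:
        chars = [tag for tag in post["tags"] if tag in character_tags]
        if len(chars) == 1:
            kept.append((post["id"], chars[0]))
    return kept

CHARACTERS = ["character_a", "character_b"]
POSTS = [
    {"id": 1, "tags": ["1girl", "character_a", "solo"]},
    {"id": 2, "tags": ["2girls", "character_a", "character_b"]},  # ambiguous
    {"id": 3, "tags": ["1girl", "character_b"]},
    {"id": 4, "tags": ["scenery", "no_humans"]},                  # no character
]

print(single_character_posts(POSTS, CHARACTERS))
# [(1, 'character_a'), (3, 'character_b')]
```

The obvious cost is that every multi-character post is thrown away, which is why a tool for marking where each feature appears in the image would be the better long-term fix.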