How are the tags encoded for the training?

Super sorry for the late reply. The data preparation is done in the training_notebooks/data_preparation.ipynb notebook, which walks the raw danbooru2018 dataset's file structure and metadata to build a final CSV file that is used for training. Concretely, it takes the metadata tags for a particular image, keeps only those that are among the 6000 most frequent tags across the full dataset, and then adds one tag each for the age_rating and score attributes in that image's metadata.
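As a rough illustration of the filtering step described above (not the notebook's actual code), here is a minimal sketch. The record layout (`path`, `tags`, `rating`, `score`) and the function name `build_tag_strings` are hypothetical, and the `age_rating_<rating>` / `meta_score_<score>` naming is inferred from the example line in this thread.

```python
from collections import Counter

def build_tag_strings(records, top_k=6000):
    """Return (path, space-separated tags) pairs.

    `records` is a hypothetical list of dicts with keys
    'path', 'tags', 'rating' and 'score'.
    """
    # Count how often each tag appears over the whole dataset.
    counts = Counter(tag for rec in records for tag in rec["tags"])
    # Keep only the top_k most frequent tags.
    keep = {tag for tag, _ in counts.most_common(top_k)}

    rows = []
    for rec in records:
        tags = [t for t in rec["tags"] if t in keep]
        # Assumed naming scheme for the two extra tags.
        tags.append(f"age_rating_{rec['rating']}")
        tags.append(f"meta_score_{rec['score']}")
        rows.append((rec["path"], " ".join(tags)))
    return rows

# Tiny made-up dataset; top_k=2 so that "rare_tag" is filtered out.
records = [
    {"path": "a.jpg", "tags": ["1girl", "solo", "rare_tag"], "rating": "s", "score": 0},
    {"path": "b.jpg", "tags": ["1girl", "solo"], "rating": "s", "score": 0},
]
rows = build_tag_strings(records, top_k=2)
```

The order in which the extra tags are appended here is arbitrary; since the tags end up as a set of labels, the order within a line should not matter for training.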

This list of tags is then saved alongside the file path of the image, in the format [image path],[list of space-separated tags]. For example:

danbooru2018/original/0167/263167.jpg,age_rating_s 1girl solo long_hair brown_hair ribbon bangs meta_score_0 yellow_eyes japanese_clothes barefoot artist_request blunt_bangs hair_bun hime_cut eyes ankle_ribbon
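If you want to read a line like this back into a path and a list of tags, something along these lines should work. This is only a sketch: `parse_label_line` is a hypothetical helper, the sample line is a shortened version of the example above, and it assumes the image path itself contains no comma.

```python
def parse_label_line(line):
    """Split one CSV line into (image path, list of tags)."""
    # Split on the first comma only: the path comes before it,
    # the space-separated tags after it.
    path, tag_string = line.rstrip("\n").split(",", 1)
    return path, tag_string.split()

# Shortened version of the example line, for illustration.
line = "danbooru2018/original/0167/263167.jpg,age_rating_s 1girl solo long_hair"
path, tags = parse_label_line(line)
```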

This process is repeated for each image in the Danbooru dataset, and each line generated as above is saved into a final tag_labels_6000.csv file. This file is then fed directly into fastai. Internally, I believe fastai does one-hot encode them (a length-6000 zero vector with ones at the positions of the tags present for that image); however, the library's implementation changes quite often, so it is best to consult the docs.
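To make the encoding concrete, here is a plain-Python sketch of what such a vector looks like. This is not fastai's code, just an illustration of the idea; `multi_hot` and the four-tag vocabulary are made up for the example (the real vocabulary would have thousands of entries).

```python
def multi_hot(tags, vocab):
    """Return a 0/1 vector with a 1 at the position of each tag present."""
    # Map each vocabulary tag to its position in the vector.
    index = {tag: i for i, tag in enumerate(vocab)}
    vec = [0.0] * len(vocab)
    for tag in tags:
        if tag in index:  # tags outside the vocabulary are ignored
            vec[index[tag]] = 1.0
    return vec

# Toy vocabulary for illustration only.
vocab = ["1girl", "solo", "long_hair", "ribbon"]
vec = multi_hot(["solo", "ribbon"], vocab)
```

Because an image can carry several tags at once, more than one position can be set to 1, which is why this is sometimes called multi-hot rather than one-hot encoding.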

So in short: no, they are not one-hot encoded in the CSV (although I believe fastai eventually one-hot encodes them internally for use in the loss function). Feel free to compare your file with the tag_labels_6000.csv file generated from my data preparation to double-check that they are the same.

