Open soskek opened 5 years ago
@soskek Thanks for your interest. The general process of the mapping is: 1) Remove rare tags, i.e., tags with very few corresponding images, and manually remove visually vague tags such as "event" and "summer". 2) Look up each remaining tag from ImageNet or Open Images in WordNet and obtain its ID (like hair-n01900150). This step is very time-consuming, because one tag may have multiple meanings/IDs; you have to check the corresponding images and pick one ID for the tag. 3) Construct the semantic hierarchy from the obtained IDs, and merge synonymous IDs into one unique ID.
Overall, this process is very time-consuming. You can find the mapping between the ID and the tag in the file "data/dictionary_and_semantic_hierarchy.txt". Hope it helps.
I see. Then, the raw original tags in Open Images or ImageNet are written in the column "category name" of the corresponding row in the file. Thank you for the quick response!
I'm still confused about how to read the actual alignment from OpenImages to this dataset (or to WordNet synsets) from the mapping file, "data/dictionary_and_semantic_hierarchy.txt".
We can see OpenImages labels in https://storage.googleapis.com/openimages/v5/class-descriptions.csv
For example, OpenImages has the category "/m/052sf,Mushroom". Then, the category name in OpenImages should be "mushroom" (we have to lowercase many categories). After that, we can see lines with the string "mushroom" in the mapping file as follows:
118 n07734744 34 mushroom
822 n07734879 792 stuffed mushroom
8265 n01917882 8262 mushroom coral
9208 n13001930 5178 shiitake, shiitake mushroom, Chinese black mushroom, golden oak mushroom, Oriental black mushroom, Lentinus edodes
9245 n13049953 9232 polypore, pore fungus, pore mushroom
9247 n12997919 9232 mushroom
9251 n13001041 9246 mushroom
9252 n13005984 9246 inky cap, inky-cap mushroom, Coprinus atramentarius
9253 n13000891 9246 mushroom
Even with an exact match, we get more than one line: 118, 9247, 9251, and 9253. In such cases, is this an ambiguous multi-label example? (Is it the case that no complete mapping exists and, if we want one, we should directly refer to a human-validated file like "train_urls_from_openimages.txt"?)
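To make the ambiguity concrete, here is a minimal sketch of the lookup described above. It assumes each line of the mapping file is whitespace-separated into an index, a WordNet ID, a second integer, and a comma-separated list of names; the thread does not say what the second integer means, so the code only carries it along. The names `Entry`, `parse_line`, and `exact_matches` are made up for this illustration and are not from the repository.

```python
from typing import List, NamedTuple, Tuple


class Entry(NamedTuple):
    index: int              # first column of the mapping file
    wnid: str               # WordNet ID, e.g. "n07734744"
    third: int              # third column; its meaning is not stated in this thread
    names: Tuple[str, ...]  # comma-separated category names


# The lines quoted above, copied verbatim from the mapping file.
SAMPLE_LINES = [
    "118 n07734744 34 mushroom",
    "822 n07734879 792 stuffed mushroom",
    "8265 n01917882 8262 mushroom coral",
    "9208 n13001930 5178 shiitake, shiitake mushroom, Chinese black mushroom, "
    "golden oak mushroom, Oriental black mushroom, Lentinus edodes",
    "9245 n13049953 9232 polypore, pore fungus, pore mushroom",
    "9247 n12997919 9232 mushroom",
    "9251 n13001041 9246 mushroom",
    "9252 n13005984 9246 inky cap, inky-cap mushroom, Coprinus atramentarius",
    "9253 n13000891 9246 mushroom",
]


def parse_line(line: str) -> Entry:
    """Split one mapping line into its three leading fields and its names."""
    index, wnid, third, names = line.split(maxsplit=3)
    return Entry(int(index), wnid, int(third),
                 tuple(name.strip() for name in names.split(",")))


def exact_matches(entries: List[Entry], tag: str) -> List[Entry]:
    """Return entries where one of the names equals the tag, ignoring case."""
    wanted = tag.strip().lower()
    return [e for e in entries if wanted in (n.lower() for n in e.names)]


entries = [parse_line(line) for line in SAMPLE_LINES]
matches = exact_matches(entries, "Mushroom")
print([(e.index, e.wnid) for e in matches])
# [(118, 'n07734744'), (9247, 'n12997919'), (9251, 'n13001041'), (9253, 'n13000891')]
```

Four different WordNet IDs share the exact name, so a string match alone cannot choose between them.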
As a second question, we can see "/m/01h44,Bat (Animal)" in the OpenImages reference https://storage.googleapis.com/openimages/v5/class-descriptions.csv
However, because of its "(...)" suffix, it cannot be matched with any line in "data/dictionary_and_semantic_hierarchy.txt" (although the file has "2353 n02806379 2344 bat"). Could you tell us what kind of normalization was used?
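For reference, this is the kind of normalization I would guess at: lowercase the display name and drop a trailing parenthetical qualifier. It is only my assumption, not the procedure the authors used, and the helper names `normalize` and `load_class_descriptions` are made up here. The sample rows follow the "ID,name" layout of class-descriptions.csv.

```python
import csv
import io
import re
from typing import Dict

# Matches a trailing qualifier such as " (Animal)" at the end of a name.
_PAREN_SUFFIX = re.compile(r"\s*\([^)]*\)\s*$")


def normalize(display_name: str) -> str:
    """Lowercase a display name and strip a trailing "(...)" qualifier."""
    return _PAREN_SUFFIX.sub("", display_name).strip().lower()


def load_class_descriptions(text: str) -> Dict[str, str]:
    """Map each OpenImages ID to its normalized name."""
    return {mid: normalize(name) for mid, name in csv.reader(io.StringIO(text))}


# Two rows quoted in this thread.
SAMPLE_CSV = "/m/052sf,Mushroom\n/m/01h44,Bat (Animal)\n"

print(load_class_descriptions(SAMPLE_CSV))
# {'/m/052sf': 'mushroom', '/m/01h44': 'bat'}
```

Stripping the qualifier throws away the one hint that says which sense of "bat" is meant, so a line matched this way would still need a manual check.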
Thank you for the great work! The paper says
How did you make the mapping? And, is the mapping list available in this repository?