lilyjiayi closed this issue 1 year ago
Hi there! I tried to map the "in_5k_label_ids" values in geo-yfcc to the first 5K WordNet lemmas in imagenet-21k (https://storage.googleapis.com/bit_models/imagenet21k_wordnet_lemmas.txt), but something seems to be wrong. For example, the image with yfcc_row_id: 76746513 has in_5k_label_ids: 2037, and the corresponding label is "sea_swallow, Sterna_hirundo" (line 2037 in the lemma file, indexed from 0), which does not seem to describe the image.
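A minimal sketch of the lookup described above, assuming the lemma file has one class per line and that a label id is a 0-indexed line number. The helper names `load_lemmas` and `lemma_for_label` are hypothetical, and the demo list is a stand-in so the snippet runs without downloading anything.

```python
from pathlib import Path


def load_lemmas(path):
    """Read a lemma file with one class per line; line 0 is class 0."""
    return Path(path).read_text(encoding="utf-8").splitlines()


def lemma_for_label(lemmas, label_id):
    """Return the lemma at a 0-indexed position, or None if out of range."""
    if 0 <= label_id < len(lemmas):
        return lemmas[label_id]
    return None


# Stand-in data (not real lemmas), used only to show the indexing convention.
demo_lemmas = ["class_a", "class_b", "class_c"]
print(lemma_for_label(demo_lemmas, 1))
```

With a local copy of imagenet21k_wordnet_lemmas.txt, `lemma_for_label(load_lemmas(path), 2037)` would return whatever is on line 2037 under this convention.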
In addition, I looked at the "in_5k_label_ids" column in the metadata file, and the label ids it contains start from 1000. I assume that the first 1000 ids correspond to the classes in ImageNet LSVRC12. However, the first 1000 ids in imagenet-21k (https://storage.googleapis.com/bit_models/imagenet21k_wordnet_ids.txt) are not those of imagenet1k (https://gist.github.com/fnielsen/4a5c94eaa6dcdf29b7a62d886f540372).
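One way to run this comparison is to count how many of the first 1000 entries in the 21k ordering also appear in the 1k list. This is only a sketch: `overlap_with_1k` is a hypothetical helper, both inputs are assumed to be plain lists of WordNet id strings, and the ids in the demo are made-up placeholders.

```python
def overlap_with_1k(wnids_21k, wnids_1k, prefix=1000):
    """Count how many of the first `prefix` ids in wnids_21k occur in wnids_1k.

    Returns (matches, number_of_ids_checked).
    """
    head = wnids_21k[:prefix]
    known = set(wnids_1k)
    matches = sum(1 for wnid in head if wnid in known)
    return matches, len(head)


# Placeholder ids (not real synsets), just to exercise the function.
demo_21k = ["n0001", "n0002", "n0003", "n0004"]
demo_1k = ["n0002", "n0004", "n0009"]
print(overlap_with_1k(demo_21k, demo_1k, prefix=3))
```

If the first 1000 ids really were the ImageNet-1k classes, the number of matches would equal the number of ids checked.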
Is it possible that you used a different ordering of the lemmas? If so, do you still have access to it? Thanks!
Hi, I ran into the same problem: I cannot map the label ids to the label names in ImageNet5k. Is there a solution? Thank you!
Apologies for the delay, folks. I have updated the repo with a JSON file containing the correct synset ID mappings. Please let me know in a new issue if the problem persists.
Hi! Thanks for creating the GeoYFCC dataset! I am trying to analyze how well the labels describe the actual content of the images, and I realized that the metadata contains no text labels (only label ids). Do you have the corresponding text labels or WordNet ids for all the classes? That would help a lot. Thank you!