Hey there, that's an important question!
For human observers, the paradigm was "forced choice": observers had to select one of 16 entry-level categories (dog, airplane, chair, etc.). To make the human-CNN comparison as fair as possible, we designed a forced-choice experiment for CNNs as well: we discarded all predictions for categories that were not among those 16. (Otherwise the comparison would not be possible, since CNNs would be allowed to select arbitrary categories while, for practical reasons, we can't ask humans to classify among 1,000 classes.)
You can find the mapping of ImageNet classes to 16-class-ImageNet classes here. Among the predictions for these classes, we took the argmax and counted that as the classification decision.
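In pseudocode terms, the procedure could be sketched as below. This is a minimal illustration, not the repository's actual code: the mapping excerpt and the ImageNet indices in it are made up for the example; the real 1000-to-16 mapping is the one linked above.

```python
# Illustrative sketch of the forced-choice procedure: discard all ImageNet
# predictions outside the 16 entry-level categories, then take the argmax
# among the remaining ones. The mapping below is hypothetical.
CATEGORY_TO_IMAGENET_IDS = {
    "dog": [151, 152, 153],   # hypothetical subset of dog-breed indices
    "airplane": [404],        # hypothetical index
    "chair": [423],           # hypothetical index
}

def forced_choice_decision(logits, mapping=CATEGORY_TO_IMAGENET_IDS):
    """Return the entry-level category whose member class has the highest logit."""
    best_category, best_score = None, float("-inf")
    for category, imagenet_ids in mapping.items():
        score = max(logits[i] for i in imagenet_ids)  # best-scoring member class
        if score > best_score:
            best_category, best_score = category, score
    return best_category

# Usage: fake logits peaking on the (hypothetical) airplane class index.
logits = [0.0] * 1000
logits[404] = 5.0
print(forced_choice_decision(logits))  # prints "airplane"
```

Logits for classes outside the 16 categories are simply never consulted, which is equivalent to discarding those predictions.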
Does that answer your question?
Hello Robert! Your comments are very helpful; thank you! I will let you know if I run into any issues reproducing the paper's results.
I'm closing this issue for now; please feel free to re-open if any issues appear!
Hello there! This may be an elementary question, but what is the mapping between the model predictions and the ImageNet class names? Asking on behalf of a cognitive science group that's not as familiar with deep learning as you are :)
(Yes, I understand that one would take the argmax of the logits and call that the predicted class. However, running the models on a CPU-only machine, I could not get predictions that line up with the image's content, texture-wise or shape-wise. Hence I'm gathering info before another try.)
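For concreteness, here is a sketch of the index-to-name lookup I'm attempting. It assumes a class-index JSON in the format Keras distributes (`{"0": ["n01440764", "tench"], ...}`); the file name and format are my assumption, not something this repository necessarily ships.

```python
# Sketch: map a model's argmax index to an ImageNet class name, assuming
# a class-index JSON like Keras's imagenet_class_index.json:
#   {"0": ["n01440764", "tench"], "1": [...], ...}
# (file name and format are assumptions on my part).
import json

def predicted_class_name(logits, index_path="imagenet_class_index.json"):
    with open(index_path) as f:
        class_index = json.load(f)
    idx = max(range(len(logits)), key=lambda i: logits[i])  # argmax
    wnid, name = class_index[str(idx)]  # WordNet ID, human-readable name
    return name
```

If the ordering of the model's output units differs from the ordering in the index file, the names would come out scrambled, which may explain what I'm seeing.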