rgeirhos / texture-vs-shape

Pre-trained models, data, code & materials from the paper "ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness" (ICLR 2019 Oral)
https://openreview.net/forum?id=Bygh9j09KX

Same shape & texture category #6

Status: Closed (by eminorhan, 5 years ago)

eminorhan commented 5 years ago

Thanks for releasing the stimuli! I noticed that some of the cue-conflict images have the same shape and texture category. This means the fractions of texture and shape decisions need not add up to 1, yet the figures in the paper suggest they do. How do you handle those cases? Wouldn't it be better to exclude them from the experiments entirely?

Also, for some high-performing ImageNet models, I get fairly low overall accuracies (counting either shape or texture matches) on the cue-conflict images: roughly 25-30%, even though these models all have over 80% top-1 accuracy on ImageNet. I know you touch on this in the paper, but I don't think you report those numbers. Judging from the bar plots alone, 25-30% seemed too low to me; could you confirm whether that is indeed the case?

rgeirhos commented 5 years ago

Concerning question 1, this is copied from the paper: "For our analysis of texture vs. shape biases (Figure 4), we excluded trials for which no cue conflict was present (i.e., those trials where a bicycle content image was fused with a bicycle texture image, hence no texture-shape cue conflict present)."
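The exclusion described above, plus the resulting shape/texture fractions, can be sketched as follows. This is a minimal illustration, not the repo's actual analysis code; the trial-tuple format and function name are hypothetical, and only the exclusion rule (drop trials where shape and texture categories coincide) comes from the paper.

```python
# Hypothetical sketch of the texture-vs-shape bias computation.
# A trial is (shape_category, texture_category, model_or_human_response).
def texture_shape_bias(trials):
    # Exclude trials with no cue conflict, i.e. where a content image was
    # fused with a texture image of the same category (as in the paper).
    conflict = [(s, t, r) for s, t, r in trials if s != t]
    shape_hits = sum(r == s for s, t, r in conflict)
    texture_hits = sum(r == t for s, t, r in conflict)
    correct = shape_hits + texture_hits  # disjoint, since s != t here
    if correct == 0:
        return None  # no decision matched either cue
    return {
        # These two fractions add up to 1 by construction.
        "shape_fraction": shape_hits / correct,
        "texture_fraction": texture_hits / correct,
        # Overall accuracy: fraction of conflict trials matching either cue.
        "accuracy": correct / len(conflict),
    }

trials = [
    ("cat", "elephant", "cat"),       # shape decision
    ("cat", "elephant", "elephant"),  # texture decision
    ("car", "car", "car"),            # no cue conflict: excluded
    ("dog", "clock", "bottle"),       # matches neither cue
]
print(texture_shape_bias(trials))
```

Note that the shape and texture fractions are normalized by correct decisions only, which is why they sum to 1 in the figures even though overall accuracy can be much lower.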

Concerning question 2: you would need to analyse the data from https://github.com/rgeirhos/texture-vs-shape/tree/master/raw-data/style-transfer-512-nomask-experiment to get the exact numbers. I share your intuition that <30% seems a bit low judging from the bar plots of Figure 4.

eminorhan commented 5 years ago

Ah, thanks! I should have read it more carefully.

rgeirhos commented 5 years ago

No worries! There's a lot of text in the appendix ...