rgeirhos / texture-vs-shape

Pre-trained models, data, code & materials from the paper "ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness" (ICLR 2019 Oral)
https://openreview.net/forum?id=Bygh9j09KX

Explanation methods fail to show convincing results on stylised Imagenet models #12

Closed. nareshshah139 closed this issue 4 years ago.

nareshshah139 commented 4 years ago

https://nbviewer.jupyter.org/github/UntangleAI/example/blob/master/stylized_imagenet_vis_check_alexnet.ipynb

If you still want to check it out.

Would appreciate insights into why stylised ImageNet models fail to produce explanations that are 'sparse' or 'edge focused', and instead give very poor explanations overall.
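(To make 'sparse' measurable, one rough, hypothetical proxy is the Gini coefficient of the heatmap: near 0 for a uniform map, approaching 1 when nearly all of the mass sits on a few pixels. A minimal sketch:)

```python
# Rough, hypothetical proxy for "sparse": the Gini coefficient of the
# absolute saliency values. ~0 for a uniform map, ->1 when nearly all
# of the mass is concentrated on a few pixels.
import numpy as np

def saliency_gini(saliency):
    v = np.sort(np.abs(saliency).ravel())   # ascending
    n = v.size
    cum = np.cumsum(v)
    if cum[-1] == 0:                         # all-zero map: treat as maximally uniform
        return 0.0
    return float((n + 1 - 2 * (cum / cum[-1]).sum()) / n)
```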

nareshshah139 commented 4 years ago

Not sure there is clear evidence, based on any explanation technique, that stylised ImageNet models actually acquire a shape bias.

rgeirhos commented 4 years ago

I wouldn't expect SIN-trained networks to show a perfect behaviour here; but have you checked whether there are any differences between IN and SIN-trained models? Also, please note that some of the saliency methods you're using are essentially unrelated to network decision making (DeconvNet, GBP), cf. Nie et al. (2018), https://arxiv.org/pdf/1805.07039.pdf and Adebayo et al. (2018), http://papers.nips.cc/paper/8160-sanity-checks-for-saliency-maps.pdf
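For a quick side-by-side check, something along these lines should work. This is only a minimal sketch: plain input gradients stand in for whatever attribution method you prefer, and "alexnet_SIN.pth.tar" is a placeholder for wherever you saved the SIN-trained checkpoint from this repo's model zoo.

```python
# Minimal sketch: same image, same saliency method, once through a standard
# ImageNet-trained AlexNet and once through a SIN-trained AlexNet.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

def load_sin_alexnet(checkpoint_path):
    # Checkpoint format assumed to be a DataParallel state_dict, as used in this repo.
    model = models.alexnet(pretrained=False)
    ckpt = torch.load(checkpoint_path, map_location="cpu")
    state_dict = ckpt.get("state_dict", ckpt)
    state_dict = {k.replace("module.", ""): v for k, v in state_dict.items()}
    model.load_state_dict(state_dict)
    return model

def input_gradient_saliency(model, x):
    # Gradient of the top-class logit w.r.t. the input, max over colour channels.
    model.eval()
    x = x.clone().requires_grad_(True)
    logits = model(x)
    logits[0, logits[0].argmax()].backward()
    return x.grad.abs().max(dim=1)[0].squeeze(0)

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
img = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)

in_saliency = input_gradient_saliency(models.alexnet(pretrained=True), img)
sin_saliency = input_gradient_saliency(load_sin_alexnet("alexnet_SIN.pth.tar"), img)
```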

nareshshah139 commented 4 years ago

https://github.com/UntangleAI/example/blob/master/imagenet_vis_check_alexnet_Imagenet_trained.ipynb

Here is the comparison with a standard AlexNet trained on ImageNet. We have similar examples for ResNet and the other architectures available in torchvision as well.

We've also managed to resolve the issues mentioned in Adebayo et al. (2018), http://papers.nips.cc/paper/8160-sanity-checks-for-saliency-maps.pdf, by using contrastive methods ('difference heatmaps' and 'inverse difference heatmaps'). The issues were caused by features shared amongst multiple classes.

Here's a paper which shares a similar methodology. https://arxiv.org/pdf/1905.12152.pdf

We'll be happy to share our toolkit, as well as results based on difference/inverse-difference heatmaps.
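To give a rough idea of the contrastive approach (this is only a sketch of the general idea, not our actual implementation): the attribution for the target class is contrasted against the attribution for the strongest competing class, so that evidence shared by both classes cancels out.

```python
# Sketch of the contrastive idea using plain input gradients.
import torch

def class_gradient(model, x, class_idx):
    x = x.clone().requires_grad_(True)
    model(x)[0, class_idx].backward()
    return x.grad.detach()

def difference_heatmaps(model, x, target_idx):
    model.eval()
    with torch.no_grad():
        top2 = model(x)[0].topk(2).indices.tolist()
    competitor_idx = top2[1] if top2[0] == target_idx else top2[0]
    g_target = class_gradient(model, x, target_idx)
    g_comp = class_gradient(model, x, competitor_idx)
    # positive part -> evidence unique to the target ("difference heatmap"),
    # reversed sign -> evidence favouring the competitor ("inverse difference heatmap")
    diff = (g_target - g_comp).clamp(min=0).sum(dim=1).squeeze(0)
    inv_diff = (g_comp - g_target).clamp(min=0).sum(dim=1).squeeze(0)
    return diff, inv_diff
```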

rgeirhos commented 4 years ago

Interesting, thanks for sharing! Good to hear that you're not using the vanilla methods.

nareshshah139 commented 4 years ago

We have also tested this using PatternNet and PatternAttribution, which are data- and model-dependent explanation techniques (they pass the sanity checks without modifications), and found that stylised ImageNet models do not have a shape bias. Those are also part of our toolkit.

nareshshah139 commented 4 years ago

Is this something you want to explore further?

rgeirhos commented 4 years ago

Explore further?

nareshshah139 commented 4 years ago

To build methods that enforce a shape bias in neural networks, while being able to validate those methods using explainable AI techniques.
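One concrete (if crude) way such a validation could look: measure how much of an attribution map's mass falls on object edges. The sketch below uses a dilated Canny edge mask as the 'shape' reference; it is only a hypothetical proxy, not the cue-conflict shape-bias measure from the paper.

```python
# Hypothetical validation metric: fraction of attribution mass on object edges.
import numpy as np
import cv2

def edge_alignment(heatmap, image_gray_uint8, low=100, high=200, dilate_px=3):
    # Canny edges of the greyscale image, slightly dilated to tolerate misalignment.
    edges = cv2.Canny(image_gray_uint8, low, high) > 0
    if dilate_px:
        kernel = np.ones((dilate_px, dilate_px), np.uint8)
        edges = cv2.dilate(edges.astype(np.uint8), kernel).astype(bool)
    hm = np.abs(heatmap)
    total = hm.sum()
    return float(hm[edges].sum() / total) if total > 0 else 0.0
```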

rgeirhos commented 4 years ago

I see what you mean, and I certainly think this would be an interesting idea! Unfortunately I don't have the capacity to contribute to this project, I'm afraid.