marcoancona / DeepExplain

A unified framework of perturbation and gradient-based attribution methods for Deep Neural Networks interpretability. DeepExplain also includes support for Shapley Values sampling. (ICLR 2018)
https://arxiv.org/abs/1711.06104
MIT License
720 stars, 133 forks

How to generate baseline for deeplift? #54

Closed: durrantmm closed this issue 4 years ago

durrantmm commented 4 years ago

Hello, I would like a more detailed description of the baseline parameter when using the DeepLIFT method. Is the baseline something that I need to generate myself, using shuffled sequences as a reference? Do you provide functions for doing this?

AvantiShri commented 4 years ago

Hi @durrantmm, I just saw this by chance. Since you mention shuffled sequences, it sounds like your inputs are genomic sequences, in which case Marco wouldn’t be the right person to ask (Marco’s DeepExplain provides a unified implementation of various interpretability algorithms, but he is not the point person for domain-specific usage of individual algorithms).

The short answer to your question is yes, you have to supply the reference yourself. If you are planning to use multiple references per example, the DeepExplain implementation isn’t well suited to this; instead, you should use either the original DeepLIFT implementation or, if that does not work for your architecture, the DeepSHAP implementation. DeepSHAP is an extension of DeepLIFT, and I discuss how the implementations differ in an FAQ on the DeepLIFT repo: https://github.com/kundajelab/deeplift#what-are-the-similarities-and-differences-between-the-deeplift-like-implementations-in-deepexplain-from-ancona-et-al-iclr-2018-and-deepshapdeepexplainer-from-the-shap-repository

For an example notebook that uses DeepSHAP and multiple shuffled sequences as the reference, you can refer to this: https://gist.github.com/AvantiShri/8a3a0a03f4c4a578ee7909e3989467cc

Hope this helps. Feel free to direct other domain-specific questions on DeepLIFT at me.
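
To make "multiple shuffled references per example" concrete, here is a minimal NumPy sketch. It is only an illustration: `one_hot` and `shuffled_references` are hypothetical helper names, not functions from DeepExplain, DeepLIFT, or SHAP, and the shuffle simply permutes positions. That preserves single-nucleotide composition but not dinucleotide frequencies, which may matter for genomic models.

```python
import numpy as np


def one_hot(seq, alphabet="ACGT"):
    """One-hot encode a DNA string into an array of shape (len(seq), 4)."""
    idx = np.array([alphabet.index(c) for c in seq])
    return np.eye(len(alphabet), dtype=np.float32)[idx]


def shuffled_references(one_hot_seq, num_refs=10, seed=0):
    """Return num_refs position-shuffled copies of a (length, 4) one-hot sequence.

    Assumption: a plain position shuffle is an acceptable reference here.
    """
    rng = np.random.default_rng(seed)
    # Generator.permutation shuffles along the first axis (sequence positions)
    # and returns a copy, so the input sequence is left unchanged.
    return np.stack([rng.permutation(one_hot_seq) for _ in range(num_refs)])


x = one_hot("ACGTACGGTA")
refs = shuffled_references(x, num_refs=5)  # shape: (5, 10, 4)
```

Each row of `refs` is one reference for the same input; an implementation that supports multiple references would typically average the attributions obtained against each of them.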

marcoancona commented 4 years ago

Thanks @AvantiShri for answering this in detail. Indeed, DeepExplain only supports a fixed baseline (usually the zero or mean baseline). Note that there is still an open discussion about which reference baseline should be used (e.g. https://arxiv.org/abs/1908.08474).
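
For the fixed-baseline case, a rough sketch of how the zero and mean baselines could be built with NumPy follows. The helper names are hypothetical, and the commented-out call at the end assumes that DeepExplain's `explain` accepts a `baseline` argument shaped like a single input without the batch dimension; check the README before relying on that.

```python
import numpy as np


def zero_baseline(xs):
    """All-zeros baseline with the shape of one sample (no batch dimension)."""
    return np.zeros(xs.shape[1:], dtype=xs.dtype)


def mean_baseline(xs):
    """Feature-wise mean over the batch, with the shape of one sample."""
    return xs.mean(axis=0)


# Toy batch of 2 samples with 2 features each.
xs = np.array([[1.0, 2.0], [3.0, 4.0]])
baseline = mean_baseline(xs)

# Assumed usage (not executed here; sess, T and X are placeholders for your
# TensorFlow session, target tensor and input tensor):
# with DeepExplain(session=sess) as de:
#     attributions = de.explain('deeplift', T, X, xs, baseline=baseline)
```

Either way, the same baseline is shared by every example in `xs`, which is the limitation discussed above.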

durrantmm commented 4 years ago

Great, thank you very much Avanti!