Closed wangyongjie-ntu closed 4 years ago
Hi,
That is indeed true. As mentioned in the Readme, FullGrad saliency is specific to CNNs, while the full-gradient decomposition applies to any ReLU network. However, this decomposition cannot be used as-is to obtain importance scores.
If your ReLU MLP has no bias units or batchnorm, though, then the full-gradient decomposition reduces to input-gradient × input, which is a valid feature importance score.
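As a minimal sketch of that special case (illustrative only, not from this repo's API; the MLP weights here are hypothetical), input × gradient for a bias-free ReLU MLP can be computed with a manual forward/backward pass. A bias-free ReLU net is positively homogeneous, so the per-feature scores should sum back to the output:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical bias-free ReLU MLP: x -> ReLU(x @ W1) @ W2
W1 = rng.standard_normal((10, 32))
W2 = rng.standard_normal((32, 1))

x = rng.standard_normal((4, 10))   # batch of tabular inputs

# Forward pass, keeping pre-activations for the ReLU mask
z1 = x @ W1
h1 = np.maximum(z1, 0.0)
y = h1 @ W2                        # outputs, shape (4, 1)

# Manual backprop of d(sum(y))/dx
grad_h1 = np.ones((4, 1)) @ W2.T   # gradient w.r.t. h1
grad_z1 = grad_h1 * (z1 > 0)       # ReLU gate
grad_x = grad_z1 @ W1.T            # gradient w.r.t. the input

# With no biases, the full-gradient decomposition reduces to input * gradient
importance = x * grad_x            # one score per feature, per sample

# Sanity check: for a bias-free ReLU net, the scores sum to the output
print(np.allclose(importance.sum(axis=1, keepdims=True), y))  # True
```

The sanity check is exactly the completeness property of the full-gradient decomposition: with all bias terms zero, the bias-gradient contributions vanish and the input-gradient term alone accounts for the output.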
Hope that helps.
@suraj-srinivas Thanks very much for your reply.
I have a tabular dataset and have implemented an MLP, and I want to analyze the importance score of each feature. I am afraid that FullGrad cannot work on an MLP, because upsampling the bias-gradients to the input shape seems weird. Is that true? Thanks very much.