laura-rieger / deep-explanation-penalization

Code for using CDEP from the paper "Interpretations are useful: penalizing explanations to align neural networks with prior knowledge" https://arxiv.org/abs/1909.13584
MIT License

How come Gradient Sum and EG do two gradient steps? #10

Closed · henrikmarklund closed this issue 3 years ago

henrikmarklund commented 3 years ago

Hi! Thanks for sharing this repo!

In the DecoyMNIST code, for methods 1 and 2 (gradient_sum and eg), there are two gradient steps per batch: the first step uses gradients from just the explanation_penalty, and the second step uses gradients from both the explanation_penalty and the log loss. What is the reason for this?

Reference in code: https://github.com/laura-rieger/deep-explanation-penalization/blob/8249af7fecf92c2b93dc2e39baf4cfd1423b53b4/mnist/DecoyMNIST/train_mnist_decoy.py#L168
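For concreteness, here is a minimal sketch of the pattern I mean, not the actual repository code: the names (`two_step_update`, `decoy_mask`, `regularizer_rate`), the use of `cross_entropy`, and the plain input-gradient explanation are all illustrative placeholders.

```python
import torch
import torch.nn.functional as F

def two_step_update(model, optimizer, x, y, decoy_mask, regularizer_rate=1.0):
    """One batch, two optimizer steps (illustrative sketch of the pattern above)."""
    optimizer.zero_grad()

    # Gradient-based explanation: gradient of the output w.r.t. the input,
    # penalized on the decoy pixels indicated by decoy_mask.
    x_req = x.clone().requires_grad_(True)
    saliency, = torch.autograd.grad(model(x_req).sum(), x_req, create_graph=True)
    explanation_penalty = (saliency * decoy_mask).abs().sum()

    # Step 1: backprop only the explanation penalty, then update.
    (regularizer_rate * explanation_penalty).backward()
    optimizer.step()

    # Step 2: fresh forward pass for the log loss. The gradient buffers are not
    # zeroed in between, so this backward adds the log-loss gradients on top of
    # the penalty gradients already in .grad; the second update therefore uses
    # gradients from both terms.
    log_loss = F.cross_entropy(model(x), y)
    log_loss.backward()
    optimizer.step()

    return log_loss.item(), explanation_penalty.item()
```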

Thanks!

laura-rieger commented 3 years ago

Hi Henrik, thank you for your interest. Both of these explanation methods use gradients as the explanation, so the explanation loss is itself built from a derivative of the model output. Calculating the loss "normally", as a single sum of the explanation loss and the output loss, would therefore mean differentiating through an expression that is already a derivative, together with the output loss, in one backward pass. To circumvent this, we split the update: we calculate the explanation loss, take its gradient and optimize, and then do the same for the output loss.
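To make the contrast concrete, here is a minimal sketch of the combined single-backward update described above, in which one backward pass flows through both the explanation term (itself a gradient of the model output) and the log loss. This is illustrative only, not code from the repository; the names and the use of `cross_entropy` and a plain input-gradient explanation are assumptions.

```python
import torch
import torch.nn.functional as F

def combined_update(model, optimizer, x, y, decoy_mask, regularizer_rate=1.0):
    """Single combined update: one backward pass for both loss terms.

    The explanation is itself a gradient of the model output (built with
    create_graph=True), so backpropagating the combined loss differentiates
    through that gradient jointly with the log loss.
    """
    optimizer.zero_grad()
    x = x.clone().requires_grad_(True)
    logits = model(x)
    saliency, = torch.autograd.grad(logits.sum(), x, create_graph=True)
    explanation_penalty = (saliency * decoy_mask).abs().sum()
    total = F.cross_entropy(logits, y) + regularizer_rate * explanation_penalty
    total.backward()   # gradients of both terms in a single backward pass
    optimizer.step()
```

The reply above describes splitting this into two separate updates instead: one backward pass and optimizer step for the explanation loss, followed by another for the output loss.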