szamani20 closed this issue 3 years ago
@szamani20 Hi, the attributions are calculated by first applying Integrated Gradients (https://arxiv.org/abs/1703.01365) with respect to the model's embeddings. IG gives us a matrix of shape [batch_size, seq_length, hidden_dim], so we then sum the attributions along one axis, the hidden dimension, to get a single scalar for each token, and normalize them.
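To make the steps above concrete, here is a minimal pure-Python sketch. It is not the library's `_calculate_attributions()` code. The toy model, its analytic gradient, the zero baseline, the midpoint Riemann approximation, and the L2 normalization are all assumptions chosen for illustration, and the batch axis is left out so the attributions have shape [seq_length, hidden_dim].

```python
import math

def toy_model(embeddings, w):
    # Hypothetical scalar "model": sum over tokens of tanh(w . e_t).
    return sum(math.tanh(sum(wi * ei for wi, ei in zip(w, e))) for e in embeddings)

def toy_gradient(embeddings, w):
    # Analytic gradient of toy_model with respect to every embedding entry.
    grads = []
    for e in embeddings:
        z = sum(wi * ei for wi, ei in zip(w, e))
        scale = 1.0 - math.tanh(z) ** 2
        grads.append([scale * wi for wi in w])
    return grads

def integrated_gradients(embeddings, baseline, w, steps=200):
    # IG = (x - baseline) * average gradient along the straight path
    # from baseline to x (midpoint Riemann sum with `steps` points).
    seq_len, hidden = len(embeddings), len(embeddings[0])
    avg = [[0.0] * hidden for _ in range(seq_len)]
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [[b + alpha * (x - b) for x, b in zip(e, be)]
                 for e, be in zip(embeddings, baseline)]
        grads = toy_gradient(point, w)
        for t in range(seq_len):
            for d in range(hidden):
                avg[t][d] += grads[t][d] / steps
    return [[(embeddings[t][d] - baseline[t][d]) * avg[t][d]
             for d in range(hidden)] for t in range(seq_len)]

def summarize(attributions):
    # Sum over the hidden dimension (one scalar per token), then
    # divide by the L2 norm of the resulting vector (assumed normalization).
    per_token = [sum(row) for row in attributions]
    norm = math.sqrt(sum(a * a for a in per_token))
    return [a / norm for a in per_token] if norm > 0 else per_token

# Three "tokens" with hidden_dim = 3; the second token equals the baseline.
embeddings = [[0.5, -1.0, 2.0], [0.0, 0.0, 0.0], [1.0, 1.0, -0.5]]
baseline = [[0.0] * 3 for _ in embeddings]
w = [0.3, -0.2, 0.1]

attr = integrated_gradients(embeddings, baseline, w)
scores = summarize(attr)
print(scores)
```

A token whose embedding equals the baseline receives an attribution of zero, and the raw attributions sum to approximately the difference between the model output at the input and at the baseline, which is the completeness property described in the paper.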
Thanks for helping out @koren-v. The original paper is a very good resource for understanding how the algorithm works; the rest, as @koren-v mentioned, is just summing and normalizing to produce a single scalar per token.
Thank you for your amazing work. The documentation for this project appears to be limited to code usage, and I couldn't find much explanation of the actual method used to explain the model. Specifically, some comments on the _calculate_attributions() method would help give an idea of how attributions are calculated. Thanks!