Open elixir-code opened 3 years ago
However, if integrated gradients does not mandate that a zero vector (or the embedding of the padding token) be used as the reference token embedding, and allows the embedding of any token to be used as the reference, the above issue can be ignored.
Hi @elixir-code ,
Captum's LayerIntegratedGradients implementation allows you to define a custom baseline if a zero vector does not fit your problem (see the `baselines` argument to the `.attribute()` method).
Check out this tutorial for an example of how baselines can be specified.
Hope this helps
Hi @elixir-code, integrated gradients does not mandate a zero vector as the reference / baseline. It can be anything of your choice. Good point regarding the pad token. To be consistent, I'll make updates based on your suggestions.
📚 Documentation
Tutorial
https://captum.ai/tutorials/IMDB_TorchText_Interpret
Libraries used
Issue
The token for the word `'pad'` is used instead of the special token `'<pad>'`, both for padding sequences shorter than the minimum length and as the reference token in the `TokenReferenceBase` object.

Lines of the code with the issue
In cell number 11 (In [11]):
In cell number 14 (In [14]):
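Since the snippets themselves did not survive here, the lines in question can be sketched with a hypothetical `SimpleNamespace` stand-in for the tutorial's torchtext `Field` so the example runs on its own (the indices 6978 and 1 are the ones reported in this issue):

```python
from types import SimpleNamespace

# Hypothetical stand-in for the tutorial's torchtext Field `TEXT`; only the
# stoi (string-to-index) mapping matters here.
TEXT = SimpleNamespace(vocab=SimpleNamespace(
    stoi={'<unk>': 0, '<pad>': 1, 'pad': 6978}))

# Cell 11 (In [11]): the reference index is taken from the ordinary
# vocabulary word 'pad', not the special padding token.
PAD_IND = TEXT.vocab.stoi['pad']   # 6978

# Cell 14 (In [14]): short inputs are likewise padded with the word 'pad'.
min_len = 7
text = ['this', 'film', 'is', 'great']
if len(text) < min_len:
    text += ['pad'] * (min_len - len(text))
```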
Evidence to suggest that the special token `'<pad>'` must be used instead of the token `'pad'`
In cell 11 (In [11]), we want to find the index of the token used for padding. Currently, the index computed as `PAD_IND` is `6978`:

However, the index of the token to be used for padding is actually the index of the token `'<pad>'`, which is `1`, as can be inferred by running the following snippets of code:

In code snippet 11 (In [11]) from the tutorial used for training the CNN model:

In code snippet 5 (In [5]) from the tutorial used for training the CNN model:
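The contrast those snippets demonstrate can be reproduced with a minimal dictionary standing in for `TEXT.vocab.stoi` (the indices 6978 and 1 are the ones quoted above):

```python
# Minimal stand-in for torchtext's TEXT.vocab.stoi token -> index mapping.
# '<pad>' is a special token with a reserved low index (1 in the tutorial),
# while 'pad' is just a word with an ordinary frequency-based index (6978).
stoi = {'<unk>': 0, '<pad>': 1, 'the': 2, 'pad': 6978}

wrong_ind = stoi['pad']      # what the tutorial currently looks up -> 6978
right_ind = stoi['<pad>']    # what it should look up -> 1
```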
Also, from the following code snippets of the tutorial https://github.com/bentrevett/pytorch-sentiment-analysis/blob/master/4%20-%20Convolutional%20Sentiment%20Analysis.ipynb (used for training the CNN models used in the tutorial with the issue), we can infer that the `'<pad>'` token, not the `'pad'` token, must be used.

In the tutorial used for training the CNN model, cell 7 (In [7]):
Also, in the tutorial used for training the CNN model, in cell 18 (In [18]), the `'<pad>'` token, not the `'pad'` token, is used for padding short sentences:

Suggested changes in the tutorial
In cell number 11 (In [11]), the changes to be made are:
In cell number 14 (In [14]), the changes to be made are:
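A sketch of the suggested fixes, again with a hypothetical stand-in for the tutorial's `TEXT` object (in the real tutorial, `pad_token` defaults to `'<pad>'` and `stoi` comes from the built vocabulary; looking up `TEXT.pad_token` rather than the literal string also guards against a non-default pad token):

```python
from types import SimpleNamespace

# Hypothetical stand-in for the tutorial's torchtext Field `TEXT`.
TEXT = SimpleNamespace(
    pad_token='<pad>',
    vocab=SimpleNamespace(stoi={'<unk>': 0, '<pad>': 1, 'pad': 6978}))

# Cell 11 fix: look up the special padding token, not the word 'pad'.
PAD_IND = TEXT.vocab.stoi[TEXT.pad_token]   # 1, not 6978

# Cell 14 fix: pad short token lists with the special token as well.
min_len = 7
text = ['this', 'film', 'is', 'great']
if len(text) < min_len:
    text += [TEXT.pad_token] * (min_len - len(text))
```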