Open carodupdup opened 4 years ago
Hi carodupdup,
The layer named InterpretableEmbeddingBase
is a proxy for the real embedding layer that you would like to plug your explanation into.
Its behaviour is simple: it mirrors its input to its output, hence you need to precompute those word/position/token-type embeddings. Bear in mind that no forward pass is made in the layer that is wrapped.
Only then would you supply those embeddings instead of indices. I would guess that's the part that gets you confused; it confused me at first encounter ;)
Check out BertEmbeddings
in the file modeling_bert.py
(from the transformers project), in its forward
function:
if inputs_embeds is None:
    inputs_embeds = self.word_embeddings(input_ids)
position_embeddings = self.position_embeddings(position_ids)
token_type_embeddings = self.token_type_embeddings(token_type_ids)
embeddings = inputs_embeds + position_embeddings + token_type_embeddings
Using InterpretableEmbeddingBase
on word_embeddings: normally you would pass indices to get embeddings, but now you are supplying those embeddings yourself ;)
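To make the mirroring behaviour concrete, here is a minimal pure-Python sketch of the idea (this is not Captum's actual class; `MirrorEmbedding` and the toy lookup table are made up for illustration):

```python
# Sketch of the "mirror input on output" proxy idea behind
# InterpretableEmbeddingBase (illustrative only, not the Captum code).

class MirrorEmbedding:
    """Stands in for a real embedding layer.

    Instead of looking indices up, the forward call returns whatever
    it is given, so precomputed embeddings pass through unchanged."""

    def __init__(self, real_embedding):
        # The real table is kept only for the one-time precomputation step.
        self.real_embedding = real_embedding

    def indices_to_embeddings(self, indices):
        # Precompute: the *real* lookup happens here, once, up front.
        return [self.real_embedding[i] for i in indices]

    def __call__(self, precomputed):
        # Forward pass: identity -- no lookup, input is mirrored to output.
        return precomputed

table = {0: [0.0, 0.0], 1: [1.0, 1.0], 2: [2.0, 2.0]}
proxy = MirrorEmbedding(table)
embs = proxy.indices_to_embeddings([1, 2])  # real lookup, done once
assert proxy(embs) == embs                  # wrapped layer does no work
```

This is why you must precompute the embeddings yourself before calling attribute: the wrapped layer will not do it for you.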
The code written by @vfdev-5 chronologically predates this input-output mirroring trick, I guess.
I actually used a modified forward
function from Flair to explain word embeddings, without InterpretableEmbeddingBase
, since there was no easy way to access the tokenizer (and, secondly, Flair can stack multiple embeddings and tokenizes the text twice).
Sorry it took so long for me to answer; I have been trying to figure it out since you replied, but without success. So I will try to approach the problem differently, by asking you questions directly:
- I recently learned about IG and LayerIG, so I am not yet accustomed to those methods. For me, the output of LayerIG is just the interpretation of a certain layer, apparently the BertEmbedding layer, but the output of IG is the interpretation of each feature, right? And so, in the final visualisation (with the text and some green/red highlighted words), how is it possible to highlight words (e.g. find word importance) when we only found the importance of the neurons of one layer?
- Secondly, I have used Flair and managed to get the embeddings from this package, but it does not help me with what to put into the IntegratedGradients class. It seems I have to create a wrapper, but I can't figure out what should be in this wrapper.
I am aware those questions might be at a beginner's level, so I am sorry if I lack some expertise.
Finally, to come back to your answer, I do not understand what you mean by using the forward
function to "explain" word embeddings. And what ended up in your IntegratedGradients
class?
I'll try to answer your questions one by one.
For me, the output of the method LayerIG is just the interpretation of a certain layer, apparently the "BertEmbedding layer", but the output of IG is the interpretation of each feature, right?
With LayerIntegratedGradients
you can attribute to a layer's input or output (the default is the output, that is, the embeddings that come out of that computation node; you could also call them features). With indices_to_embeddings
you control what gets embedded, be it the reference or the text to explain.
From the tutorial:
lig = LayerIntegratedGradients(squad_pos_forward_func, model.bert.embeddings)
As you can see, you still need to pass a forward function; one reason is that, under the hood, the IntegratedGradients
attribute method is being called.
Looking at forward
in BertEmbeddings
, during prediction you get the sum inputs_embeds + position_embeddings + token_type_embeddings
after normalization (dropout is only relevant during training); it's hard to reconstruct anything from it given that you have access to the output only.
If all you care about is the whole layer, go for LayerIntegratedGradients
(less coding, same effect).
If you would like to attribute to sub-embeddings using IntegratedGradients
, e.g. position_embeddings
, pre-compute the output of this layer from the indices. That is needed because the input is scaled by alpha, as below:
# scale features and compute gradients. (batch size is abbreviated as bsz)
# scaled_features' dim -> (bsz * #steps x inputs[0].shape[1:], ...)
scaled_features_tpl = tuple(
    torch.cat(
        [baseline + alpha * (input - baseline) for alpha in alphas], dim=0
    ).requires_grad_()
    for input, baseline in zip(inputs, baselines)
)
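That scaling is exactly why raw token indices cannot be used: the interpolation needs continuous values between the baseline and the input. A plain-Python sketch of the same step, stripped of the tensor machinery (`scale_features` is a made-up name for illustration, not a Captum function):

```python
# Illustrative sketch of IG's input scaling: each input is replaced by a
# sequence of points on the straight line from baseline to input,
# one point per alpha step.

def scale_features(inputs, baselines, n_steps):
    # alphas run from 0.0 (pure baseline) to 1.0 (pure input)
    alphas = [i / (n_steps - 1) for i in range(n_steps)]
    return [
        [b + a * (x - b) for a in alphas]
        for x, b in zip(inputs, baselines)
    ]

scaled = scale_features(inputs=[4.0], baselines=[0.0], n_steps=5)
# scaled[0] walks from the baseline (0.0) to the input (4.0)
```

Interpolating like this between two token *indices* (say, index 101 and index 0) would land on meaningless fractional indices, whereas interpolating between two embedding vectors stays in a meaningful continuous space.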
The tutorial shows you the calculations for all sub-embeddings, but you could be more specific and interpret only one of those sub-embeddings.
It is also possible to use another variant: LayerIntegratedGradients
attribution to the input on bert.embeddings.position_embeddings
. It all depends on what your main goal is here.
And so, in the final visualisation (with the text and some green/red highlighted words), how is it possible to highlight words (e.g. find word importance) when we only found the importance of the neurons of one layer?
The input is forwarded through the whole network, not only one layer, meaning Captum computes the gradient of the output w.r.t. the inputs of the network's forward function.
SQuAD tutorial note:
for LIG, the attributions for start/end positions have shape 1x26x768, so you get an embedding per token; that is what is used for the visualizations later on. The same is true for the IG attributions, the shape is 1x26x768, but this time for each sub-embedding (multiple inputs passed to attribute
).
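As a concrete sketch of how per-token word importance is then derived from such per-dimension attributions, in the spirit of the tutorial's summarization step (the function name `token_importance` and the toy numbers are illustrative, not Captum code):

```python
# Sketch: collapse (tokens x embedding_dims) attributions into one
# signed score per token, normalized for the green/red visualization.

def token_importance(attributions):
    # Sum each token's attributions over the embedding dimensions
    # (768 for BERT; 2 in this toy example).
    totals = [sum(token_dims) for token_dims in attributions]
    # Normalize so scores fall in [-1, 1]; guard against all-zero input.
    norm = max(abs(t) for t in totals) or 1.0
    return [t / norm for t in totals]

scores = token_importance([
    [0.25, 0.25],    # token pushing the prediction up   -> green
    [-0.25, -0.25],  # token pushing the prediction down -> red
    [0.05, 0.05],    # weakly positive token
])
```

So the word highlighting does not require per-word attributions directly; a 1x26x768 layer attribution already carries one 768-dimensional vector per token, which collapses to one score per word.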
Finally, to come back to your answer, I do not understand what do you mean by using the forward function to "explain" word embeddings? And so what ended up in your IntegratedGradient class?
I had to refactor Flair's forward
method into two parts: first compute the embeddings, then feed forward (using those embeddings).
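A hypothetical sketch of that two-stage split (the class and method names are illustrative, not Flair's actual API):

```python
# Sketch: a forward pass split into "embed" and "classify" stages, so
# that an attribution method can target the embeddings directly.

class TwoStageModel:
    def __init__(self, embedding_table, weight):
        self.embedding_table = embedding_table
        self.weight = weight

    def embed(self, token_ids):
        # Stage 1: indices -> embeddings. Run once, outside attribution,
        # for both the text to explain and the reference/baseline.
        return [self.embedding_table[t] for t in token_ids]

    def classify(self, embeddings):
        # Stage 2: embeddings -> score. This is the forward function you
        # would hand to the attribution method, so gradients flow to the
        # (continuous) embeddings rather than to discrete indices.
        return sum(e * self.weight for e in embeddings)

    def forward(self, token_ids):
        # The original monolithic forward is just the two stages chained.
        return self.classify(self.embed(token_ids))

model = TwoStageModel({0: 0.5, 1: 1.5}, weight=2.0)
assert model.forward([0, 1]) == model.classify(model.embed([0, 1]))
```

With this split, `classify` plays the role of the forward function given to IntegratedGradients, while `embed` supplies the precomputed inputs and baselines.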
Hi @lipka-clazzpl,
How did you manage to integrate Flair with IntegratedGradients? Do you have a code snippet to help me out?
I have fine-tuned a transformers-based classification model via Flair, and I am looking to explain what drives the predictions.
Hi there,
I am facing some issues trying to implement the IntegratedGradients algorithm with the pretrained Flaubert model I have made. First, I used the tutorial for BERT on SQuAD and, scrolling through the issues, I came upon this gist: https://gist.github.com/davidefiocco/3e1a0ed030792230a33c726c61f6b3a5 which allowed me to apply the Flaubert model by making only small changes.
However, this notebook allows me to use the LayerIntegratedGradients (LIG) algorithm but not IntegratedGradients (IG). Therefore, going through the tutorial, I saw this paragraph explaining how to change the algorithm used from LIG to IG: "we can also use IntegratedGradients class instead (of LayerIntegratedGradients), however in that case we need to precompute the embeddings and wrap Embedding layer with InterpretableEmbeddingBase module. This is necessary because we cannot perform input scaling and subtraction on the level of word/token indices and need access to the embedding layer."
I'm afraid that it confused me very much. Finally, looking at issue #150 (the code written by vfdev-5) is where I am blocked. First, I am wondering where that person uses the InterpretableEmbeddingBase module. Furthermore, I am trying to use the code, but the encoder, which is an attribute of the BERT models, is not one in XLM (at least to my knowledge). So I am wondering if it is possible to use the IntegratedGradients algorithm with a model such as XLM.
If anyone has been working on this specific problem or is willing to help me, it would be much appreciated. Thank you in advance!