Open carodupdup opened 4 years ago
Hi carodupdup,
The layer named InterpretableEmbeddingBase
is a proxy for the real embedding layer that you would like to plug your explanation into.
Its behaviour is simple: it mirrors its input to its output, hence you need to precompute those word/position/token-type embeddings. Bear in mind that no forward pass is made in the layer that is wrapped.
Only then would you supply those embeddings instead of indices. I would guess that's the part that gets you confused; it confused me at first encounter ;)
Check out BertEmbeddings
in the file modeling_bert.py
(from the transformers project), in its forward
function:
if inputs_embeds is None:
    inputs_embeds = self.word_embeddings(input_ids)
position_embeddings = self.position_embeddings(position_ids)
token_type_embeddings = self.token_type_embeddings(token_type_ids)
embeddings = inputs_embeds + position_embeddings + token_type_embeddings
Using InterpretableEmbeddingBase
on word_embeddings: normally you would pass indices to get embeddings, but now you are supplying those embeddings yourself ;)
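To make the mirroring behaviour concrete, here is a minimal pure-Python sketch of the idea (this is not Captum's actual class; `MirrorEmbedding` and the toy lookup table are made up for illustration):

```python
# Sketch of the "mirror input on output" proxy idea behind
# InterpretableEmbeddingBase (illustrative only, not the Captum code).

class MirrorEmbedding:
    """Stands in for a real embedding layer.

    Instead of looking indices up, the forward call returns whatever
    it is given, so precomputed embeddings pass through unchanged."""

    def __init__(self, real_embedding):
        # The real table is kept only for the one-time precomputation step.
        self.real_embedding = real_embedding

    def indices_to_embeddings(self, indices):
        # Precompute: the *real* lookup happens here, once, up front.
        return [self.real_embedding[i] for i in indices]

    def __call__(self, precomputed):
        # Forward pass: identity -- no lookup, input is mirrored to output.
        return precomputed

table = {0: [0.0, 0.0], 1: [1.0, 1.0], 2: [2.0, 2.0]}
proxy = MirrorEmbedding(table)
embs = proxy.indices_to_embeddings([1, 2])  # real lookup, done once
assert proxy(embs) == embs                  # wrapped layer does no work
```

This is why you must precompute the embeddings yourself before calling attribute: the wrapped layer will not do it for you.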
The code written by @vfdev-5 chronologically predates this input-output mirroring trick, I guess.
I actually used a modified forward
function from Flair to explain word embeddings, without InterpretableEmbeddingBase
, since there was no easy way to access the tokenizer (and, secondly, Flair can stack multiple embeddings and tokenizes the text twice).
Sorry it took so long for me to answer; I have been trying to figure it out since you replied, but without success. So I will try to approach the problem differently, by asking you questions directly:
- I recently learned about IG and LayerIG, so I am not yet accustomed to those methods. For me, the output of LayerIG is just the interpretation of a certain layer, apparently the BertEmbedding layer, but the output of IG is the interpretation of each feature, right? And so, in the final visualisation (with the text and some green/red highlighted words), how is it possible to highlight words (e.g. find word importance) when we only found the importance of the neurons of one layer?
- Secondly, I have used Flair and managed to get the embeddings from this package, but it does not help me with what to put into the IntegratedGradients class. It seems I have to create a wrapper, but I can't figure out what should be in this wrapper.
I am aware those questions might be at a beginner's level, so I am sorry if I lack some expertise.
Finally, to come back to your answer, I do not understand what you mean by using the forward
function to "explain" word embeddings. And what ended up in your IntegratedGradients
class?
I'll try to answer your questions one by one.
For me, the output of the method LayerIG is just the interpretation of a certain layer, apparently the "BertEmbedding layer", but the output of IG is the interpretation of each feature, right?
With LayerIntegratedGradients
you can attribute to a layer's input or output (the default is the output, that is, the embeddings that come out of that computation node; you could also call them features). With indices_to_embeddings
you control what gets embedded, be it the reference or the text to explain.
From the tutorial:
lig = LayerIntegratedGradients(squad_pos_forward_func, model.bert.embeddings)
As you can see, you still need to pass a forward function; one reason is that, under the hood, the IntegratedGradients
attribute method is being called.
Looking at forward
in BertEmbeddings
, during prediction you get the sum inputs_embeds + position_embeddings + token_type_embeddings
after normalization (dropout is only relevant during training); it's hard to reconstruct anything from it given that you have access to the output only.
If all you care about is the whole layer, go for LayerIntegratedGradients
(less coding, same effect).
If you would like to attribute to sub-embeddings using IntegratedGradients
, e.g. position_embeddings
, pre-compute the output of this layer from the indices. That is needed because the input is scaled by alpha, as below:
# scale features and compute gradients. (batch size is abbreviated as bsz)
# scaled_features' dim -> (bsz * #steps x inputs[0].shape[1:], ...)
scaled_features_tpl = tuple(
    torch.cat(
        [baseline + alpha * (input - baseline) for alpha in alphas], dim=0
    ).requires_grad_()
    for input, baseline in zip(inputs, baselines)
)
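That scaling is exactly why raw token indices cannot be used: the interpolation needs continuous values between the baseline and the input. A plain-Python sketch of the same step, stripped of the tensor machinery (`scale_features` is a made-up name for illustration, not a Captum function):

```python
# Illustrative sketch of IG's input scaling: each input is replaced by a
# sequence of points on the straight line from baseline to input,
# one point per alpha step.

def scale_features(inputs, baselines, n_steps):
    # alphas run from 0.0 (pure baseline) to 1.0 (pure input)
    alphas = [i / (n_steps - 1) for i in range(n_steps)]
    return [
        [b + a * (x - b) for a in alphas]
        for x, b in zip(inputs, baselines)
    ]

scaled = scale_features(inputs=[4.0], baselines=[0.0], n_steps=5)
# scaled[0] walks from the baseline (0.0) to the input (4.0)
```

Interpolating like this between two token *indices* (say, index 101 and index 0) would land on meaningless fractional indices, whereas interpolating between two embedding vectors stays in a meaningful continuous space.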
The tutorial shows you the calculations for all sub-embeddings, but you could be more specific and interpret only one of those sub-embeddings.
It is also possible to use another variant: LayerIntegratedGradients
attribution to the input on bert.embeddings.position_embeddings
. It all depends on what your main goal is here.
And so, in the final visualisation (with the text and some green/red highlighted words), how is it possible to highlight words (e.g. find word importance) when we only found the importance of the neurons of one layer?
The input is forwarded through the whole network, not only one layer, meaning Captum computes the gradient of the output w.r.t. the inputs of the network's forward function.
SQuAD tutorial note:
for LIG, the attributions for start/end positions have shape 1x26x768, so you get an embedding per token; that is what is used for the visualizations later on. The same is true for the IG attributions, the shape is 1x26x768, but this time for each sub-embedding (multiple inputs passed to attribute
).
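As a concrete sketch of how per-token word importance is then derived from such per-dimension attributions, in the spirit of the tutorial's summarization step (the function name `token_importance` and the toy numbers are illustrative, not Captum code):

```python
# Sketch: collapse (tokens x embedding_dims) attributions into one
# signed score per token, normalized for the green/red visualization.

def token_importance(attributions):
    # Sum each token's attributions over the embedding dimensions
    # (768 for BERT; 2 in this toy example).
    totals = [sum(token_dims) for token_dims in attributions]
    # Normalize so scores fall in [-1, 1]; guard against all-zero input.
    norm = max(abs(t) for t in totals) or 1.0
    return [t / norm for t in totals]

scores = token_importance([
    [0.25, 0.25],    # token pushing the prediction up   -> green
    [-0.25, -0.25],  # token pushing the prediction down -> red
    [0.05, 0.05],    # weakly positive token
])
```

So the word highlighting does not require per-word attributions directly; a 1x26x768 layer attribution already carries one 768-dimensional vector per token, which collapses to one score per word.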
Finally, to come back to your answer, I do not understand what do you mean by using the forward function to "explain" word embeddings? And so what ended up in your IntegratedGradient class?
I had to refactor Flair's forward
method into two parts: first compute the embeddings, then feed forward (using those embeddings).
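A hypothetical sketch of that two-stage split (the class and method names are illustrative, not Flair's actual API):

```python
# Sketch: a forward pass split into "embed" and "classify" stages, so
# that an attribution method can target the embeddings directly.

class TwoStageModel:
    def __init__(self, embedding_table, weight):
        self.embedding_table = embedding_table
        self.weight = weight

    def embed(self, token_ids):
        # Stage 1: indices -> embeddings. Run once, outside attribution,
        # for both the text to explain and the reference/baseline.
        return [self.embedding_table[t] for t in token_ids]

    def classify(self, embeddings):
        # Stage 2: embeddings -> score. This is the forward function you
        # would hand to the attribution method, so gradients flow to the
        # (continuous) embeddings rather than to discrete indices.
        return sum(e * self.weight for e in embeddings)

    def forward(self, token_ids):
        # The original monolithic forward is just the two stages chained.
        return self.classify(self.embed(token_ids))

model = TwoStageModel({0: 0.5, 1: 1.5}, weight=2.0)
assert model.forward([0, 1]) == model.classify(model.embed([0, 1]))
```

With this split, `classify` plays the role of the forward function given to IntegratedGradients, while `embed` supplies the precomputed inputs and baselines.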
Hi @lipka-clazzpl,
How did you manage to integrate Flair with IntegratedGradients? Do you have a code snippet to help me out?
I have fine-tuned a transformers-based classification model via Flair, and I am looking to explain what drives the predictions.
Hi there,
I am facing some issues trying to implement the IntegratedGradients algorithm with the pretrained Flaubert model I have made. First, I used the tutorial for BERT on SQuAD and, scrolling through the issues, I came upon this gist: https://gist.github.com/davidefiocco/3e1a0ed030792230a33c726c61f6b3a5 which allowed me to apply the Flaubert model by making only small changes.
However, this notebook allows me to use the LayerIntegratedGradients (LIG) algorithm but not IntegratedGradients (IG). Therefore, going through the tutorial, I saw this paragraph explaining how to change the algorithm used from LIG to IG: "we can also use IntegratedGradients class instead (of LayerIntegratedGradients), however in that case we need to precompute the embeddings and wrap Embedding layer with InterpretableEmbeddingBase module. This is necessary because we cannot perform input scaling and subtraction on the level of word/token indices and need access to the embedding layer."
I'm afraid that it confused me very much. Finally, looking at issue #150 (the code written by vfdev-5) is where I am blocked. First, I am wondering where that person uses the InterpretableEmbeddingBase module. Furthermore, I am trying to use the code, but the encoder, which is an attribute of the BERT models, is not one in XLM (at least to my knowledge). So I am wondering if it is possible to use the IntegratedGradients algorithm with a model such as XLM.
If anyone has been working on this specific problem or is willing to help me, it would be much appreciated. Thank you in advance!