Open mrektor opened 1 year ago
I'm also interested; I have the same issue.
Here is a functioning version @mrektor @michelecafagna26
from captum.attr import InputXGradient
from transformers import pipeline
pipe = pipeline('text2text-generation',
                model='google/flan-t5-base',
                tokenizer='google/flan-t5-base',
                device='cuda')
input_ids = pipe.tokenizer(["A simple example"], return_tensors="pt", padding=True, truncation=True).input_ids.to('cuda')
embedding_layer = pipe.model.base_model.encoder.embed_tokens
inputs_emb = embedding_layer(input_ids)
decoder_input_ids = pipe.tokenizer(["<pad>"] * input_ids.shape[0], return_tensors="pt", add_special_tokens=False, truncation=True).input_ids.to('cuda')
decoder_embedding_layer = pipe.model.base_model.decoder.embed_tokens
decoder_inputs_emb = decoder_embedding_layer(decoder_input_ids)
def forward_from_embeddings(inputs_embeds, decoder_inputs_embeds):
    logits = pipe.model.forward(inputs_embeds=inputs_embeds, decoder_inputs_embeds=decoder_inputs_embeds)['logits'][:, -1, :]
    return logits
lig = InputXGradient(forward_from_embeddings)
attributes_token_embedding = lig.attribute(
    inputs=(inputs_emb, decoder_inputs_emb),
    target=10
)
print(f'inputs_emb: {inputs_emb.shape}')
print(f'attributes_token_embedding: {[e.shape for e in attributes_token_embedding]}')
The code returns:
inputs_emb: torch.Size([1, 4, 768])
attributes_token_embedding: (torch.Size([1, 4, 768]), torch.Size([1, 1, 768]))
You might notice that forward_from_embeddings now takes two parameters, while we pass a single tuple of tensors as inputs. This works because, behind the scenes, Captum unpacks the inputs tuple and matches each tensor, in order, to a parameter of the attributed function. It then returns one attribution tensor per input, which is why the output above is a tuple of two tensors.
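To make the tuple-to-parameter matching concrete, here is a minimal sketch of the Input × Gradient computation for a function with two inputs, written with plain PyTorch. It is only an illustration of the mechanism, not Captum's actual implementation: `toy_forward` and `input_x_gradient` are hypothetical names, and the toy function stands in for forward_from_embeddings so the example runs without a GPU or a downloaded model.

```python
import torch


def toy_forward(enc_emb, dec_emb):
    # Hypothetical stand-in for forward_from_embeddings:
    # maps two embedding tensors to [batch, 3] fake "logits".
    weights = torch.tensor([1.0, 2.0, 3.0])
    pooled = enc_emb.sum(dim=(1, 2)) + 2.0 * dec_emb.sum(dim=(1, 2))  # [batch]
    return pooled.unsqueeze(-1) * weights  # [batch, 3]


def input_x_gradient(forward_func, inputs, target):
    # Track gradients on fresh copies of every input tensor.
    inputs = tuple(t.detach().clone().requires_grad_(True) for t in inputs)
    # The tuple is unpacked positionally into the function's parameters.
    logits = forward_func(*inputs)
    # Select the target column and sum over the batch to get a scalar.
    selected = logits[:, target].sum()
    grads = torch.autograd.grad(selected, inputs)
    # One attribution tensor per input, each with its input's shape.
    return tuple(x.detach() * g for x, g in zip(inputs, grads))


enc = torch.ones(1, 4, 8)  # plays the role of inputs_emb
dec = torch.ones(1, 1, 8)  # plays the role of decoder_inputs_emb
enc_attr, dec_attr = input_x_gradient(toy_forward, (enc, dec), target=1)
print(enc_attr.shape, dec_attr.shape)
```

Each returned tensor has the same shape as the input it belongs to, which mirrors the two shapes printed by the Captum snippet above.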
If you'd rather avoid the sharp edges of Captum when attributing generative models from 🤗 Transformers, I suggest trying our Inseq library. Here's an example that obtains the same kind of attributions:
import inseq
model = inseq.load_model("google/flan-t5-base", "input_x_gradient")
# Attribute source and target prefix at every generation step
out = model.attribute("A simple example", attribute_target=True)
out
>>> FeatureAttributionOutput({
>>> sequence_attributions: list with 1 elements of type GranularFeatureAttributionSequenceOutput:[
>>> GranularFeatureAttributionSequenceOutput({
>>> source: list with 4 elements of type TokenWithId:[
>>> '▁A', '▁simple', '▁example', '</s>'
>>> ],
>>> target: list with 18 elements of type TokenWithId:[
>>> '▁A', '▁', 's', 'and', '▁castle', '▁is', '▁', 'a', '▁place', '▁where', '▁you', '▁can', '▁build', '▁', 'a', '▁castle', '.', '</s>'
>>> ],
>>> source_attributions: torch.float32 tensor of shape [4, 18, 768] on cpu,
>>> target_attributions: torch.float32 tensor of shape [18, 18, 768] on cpu,
>>> step_scores: {},
>>> sequence_scores: {},
>>> attr_pos_start: 0,
>>> attr_pos_end: 18,
>>> })
>>> ],
>>> ...
Hope it helps!
Thanks!! btw I didn't know about inseq, that library looks amazing!!! thanks for sharing
Issue
I am trying to use the Captum library to get attributions for my sequence-to-sequence T5 model. However, the attributions are returned for the decoder's input token rather than for the encoder's input tokens. This is not the desired outcome: I would expect attributions either for both the encoder and the decoder inputs, or for the encoder inputs alone.
Minimal Reproduction Code
Here is the minimal code to reproduce the issue:
The code returns:
inputs_emb: torch.Size([1, **4**, 768])
attributes_token_embedding: torch.Size([1, **1**, 768])
As you can see, the second dimension of the inputs_emb and attributes_token_embedding tensors are different. The desired outcome would be:
attributes_token_embedding: torch.Size([1, **4**, 768])
Request for Help
I would appreciate any guidance on how to correctly obtain attributions for T5 models in Captum. Thank you for your assistance in resolving this issue.