Closed hadiasghari closed 7 months ago
Hi @hadiasghari, thank you for your interest in Inseq!
As you can see from the example you posted, Inseq does work with the models you mention. In your example you are computing integrated gradients for the two cases and using the pair aggregator to visualize the difference between the two, which may not be very informative. Contrastive attribution is more likely to give you a meaningful result in this case. Quoting from our tutorial:
> While `PairAggregator` can be used to visualize the difference between two attribution outputs, using the difference in probability between an option A (e.g. *he* in the previous example) and option B (e.g. *she*) as a target for gradient-based attribution methods is a more principled way to obtain contrastive explanations answering the question "How is this feature X contributing to the prediction of A rather than B?"
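For intuition on why the raw difference can be uninformative: a PairAggregator-style comparison amounts to an elementwise subtraction of the two attribution maps, so scores that are similar in both maps cancel out. A toy sketch with made-up scores (not Inseq's actual implementation):

```python
# Toy sketch, not Inseq's PairAggregator: comparing two attribution maps
# by difference is just an elementwise subtraction.

def attribution_diff(map_a, map_b):
    """Elementwise difference between two equally-shaped attribution maps."""
    return [
        [a - b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(map_a, map_b)
    ]

# Hypothetical saliency scores (rows = input tokens, cols = generated tokens)
he_map = [[0.9, 0.1], [0.2, 0.8]]
she_map = [[0.8, 0.1], [0.3, 0.7]]

# Where the two maps agree, the difference is near zero, so shared
# (non-contrastive) structure disappears from the visualization.
diff = attribution_diff(he_map, she_map)
```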
However, integrated gradients is not currently supported for contrastive attribution, since the operation to expand the attribution steps for the contrastive target is not yet implemented. Using the saliency method instead, an example for your case would be:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import inseq

model_name = "meta-llama/Llama-2-7b-chat-hf"
access_token = "XXX"  # needed because the Llama 2 models are gated

hf_model = AutoModelForCausalLM.from_pretrained(model_name, token=access_token).cuda()
hf_tkz = AutoTokenizer.from_pretrained(model_name, token=access_token)

# Load the model with the saliency attribution method
attrib_model = inseq.load_model(hf_model, "saliency", tokenizer=hf_tkz)

out = attrib_model.attribute(
    input_texts="The manager went home because",
    generated_texts="The manager went home because he was sick.",
    contrast_targets="The manager went home because she was sick.",
    attributed_fn="contrast_prob_diff",
    step_scores=["probability", "contrast_prob_diff"],
)
out.show()
```
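For intuition, `contrast_prob_diff` uses the probability gap between the original target token and the contrastive one as the quantity being attributed. A minimal toy sketch of that gap, using made-up logits over a three-word vocabulary (not Inseq's implementation):

```python
# Toy sketch of the contrastive target p(A) - p(B), with hypothetical logits.
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token logits over a tiny vocabulary ["he", "she", "it"]
logits = [2.0, 1.0, 0.5]
p = softmax(logits)
p_he, p_she = p[0], p[1]

# The contrastive target: how much more likely is "he" than "she"?
# Gradient-based methods then attribute this scalar to the input tokens.
contrast_prob_diff = p_he - p_she
```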
Let me know if this works for you!
Hi @hadiasghari, any update on this? Can I close the issue?
Dear @gsarti, thank you for your answer. I am not sure I fully understand the explanation, as the same code works for the GPT-2 model. But it's probably best that I follow this up on Discord and close the issue here.
Question

Hello, I was wondering if `inseq` works with either the Mistral-7B or Llama-2-7B models? I can load the models without an issue and run the tutorial's minimal pair example ("The manager went home because..."), but the resulting saliency heatmap looks quite off (screenshot below).

Additional context
I run the following code:
This results in the following saliency heatmap. The probabilities in the last row seem correct, but one would expect the cell for manager/he→she to be high (very red), which isn't the case.
Thanks!