pytorch / captum

Model interpretability and understanding for PyTorch
https://captum.ai
BSD 3-Clause "New" or "Revised" License

"The attention mask and the pad token id were not set" during attribution #1449

Open RiverGao opened 4 days ago

RiverGao commented 4 days ago

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

  1. Use `inp = TextTemplateInput(template=attr_template, values=input_sentences)` to create an input
  2. Use `attr_res = llm_attr.attribute(inp, skip_tokens=[tokenizer.bos_token_id])` to run the attribution (a minimal end-to-end sketch follows these steps)
  3. Observe the following warnings printed repeatedly:
    Setting `pad_token_id` to `eos_token_id`:None for open-end generation.
    The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
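
A minimal sketch of the setup that triggers this; the model name, template, and values below are illustrative placeholders, not the ones from my actual run, and any Hugging Face causal LM should behave the same way:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from captum.attr import FeatureAblation, LLMAttribution, TextTemplateInput

model_name = "gpt2"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

llm_attr = LLMAttribution(FeatureAblation(model), tokenizer)

attr_template = "{} went to {} yesterday."  # placeholder template
input_sentences = ["Alice", "the library"]  # placeholder values
inp = TextTemplateInput(template=attr_template, values=input_sentences)

# With no explicit target, LLMAttribution calls model.generate() internally;
# because neither pad_token_id nor an attention_mask is set, transformers
# emits the warnings above on every generate() call.
attr_res = llm_attr.attribute(inp, skip_tokens=[tokenizer.bos_token_id])
```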

Expected behavior

No warnings about `attention_mask` or `pad_token_id` during attribution.
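
For reference, a possible workaround sketch, assuming the `gen_args` dict of `LLMAttribution.attribute` is forwarded to `model.generate()` (as in the Captum 0.7 signature). This should silence at least the `pad_token_id` message; supplying the missing `attention_mask` would still be up to Captum internally:

```python
# Assumption: gen_args is passed through to model.generate().
attr_res = llm_attr.attribute(
    inp,
    skip_tokens=[tokenizer.bos_token_id],
    gen_args={"pad_token_id": tokenizer.eos_token_id},
)

# Alternatively, set it once on the model's generation config:
model.generation_config.pad_token_id = tokenizer.eos_token_id
```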

Environment

 - Captum / PyTorch version: 0.7.0 / 2.4.0
 - OS: Ubuntu 22.04
 - How you installed Captum / PyTorch: source, conda
 - Build command you used (if compiling from source): `pip install -e .`
 - Python version: 3.10.15
 - CUDA/cuDNN version: 12.1
 - GPU models and configuration: H800
