pytorch / captum

Model interpretability and understanding for PyTorch
https://captum.ai
BSD 3-Clause "New" or "Revised" License

Integrated Gradients RuntimeError: The expanded size of the tensor must match the existing size #787

Open RylanSchaeffer opened 3 years ago

RylanSchaeffer commented 3 years ago

Following the "tutorial" in another issue (https://github.com/pytorch/captum/issues/150#issuecomment-549022512), I run the same sequence of function calls:

    sentence = 'explorer: morseth have whiskers\nstudent: what color?\nexplorer: they can be any color, as long as they have whiskers\nstudent: do they have fur?\nexplorer: no\nstudent: do they live in the water?\nexplorer: yes'
    input_ids = torch.tensor(
        [tokenizer.encode(sentence, add_special_tokens=True)])
    input_ids = input_ids.to(curr_device)
    # Embed the token ids and pass the embeddings (not the ids) to IG
    input_embedding = forward_func.model.base_model.embeddings(input_ids)
    attributions, delta = ig.attribute(
        inputs=input_embedding,
        return_convergence_delta=True)

But I get the following runtime error:

RuntimeError: The expanded size of the tensor (768) must match the existing size (53) at non-singleton dimension 2.  Target sizes: [50, 53, 768].  Tensor sizes: [1, 53]
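For context on where the leading 50 in the target size comes from: IntegratedGradients evaluates the forward function at n_steps interpolation points between the baseline and the input, stacked along the batch dimension (a minimal sketch, assuming Captum's default n_steps=50 and the shapes from the traceback):

```python
import torch

# Sketch of IG's internal input expansion: n_steps scaled copies of the
# input between baseline and input, stacked along the batch dimension.
n_steps = 50
input_embedding = torch.zeros(1, 53, 768)   # (batch, seq_len, hidden)
baseline = torch.zeros_like(input_embedding)
alphas = torch.linspace(0, 1, n_steps).view(-1, 1, 1)
scaled_inputs = baseline + alphas * (input_embedding - baseline)
print(scaled_inputs.shape)  # torch.Size([50, 53, 768])
```

This expanded tensor is what the model's forward then receives in place of the original input, which matches the target size [50, 53, 768] in the error.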
RylanSchaeffer commented 3 years ago

The problematic line appears to be in transformers/models/distilbert/modeling_distilbert.py:

position_ids = position_ids.unsqueeze(0).expand_as(input_ids)  # (bs, max_seq_length)

position_ids has length equal to the sequence length, but because embeddings were passed where token ids are expected, the tensor bound to input_ids here has shape (batch size, sequence length, embedding dimension), so expand_as fails.
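The mismatch can be reproduced without the model (a minimal sketch, assuming the shapes from the traceback: 50 IG steps, sequence length 53, hidden size 768):

```python
import torch

position_ids = torch.arange(53)    # shape: (53,), as built by DistilBERT
embeds = torch.zeros(50, 53, 768)  # embeddings standing in for input_ids

err = None
try:
    # The line from modeling_distilbert.py: expanding (1, 53) to
    # (50, 53, 768) fails because dim 2 is 53 vs. 768.
    position_ids.unsqueeze(0).expand_as(embeds)
except RuntimeError as e:
    err = str(e)
print(err)
```

A workaround for this class of error (untested against this exact setup) is Captum's LayerIntegratedGradients applied to the embedding layer, so that token ids rather than embeddings are what reach the model's forward.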