mit-han-lab / streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
https://arxiv.org/abs/2309.17453
MIT License

TypeError: llama_pos_shift_attention_forward() got an unexpected keyword argument 'padding_mask' #14

Closed MartinKratochvilProgramy closed 1 year ago

MartinKratochvilProgramy commented 1 year ago

I'm trying to reproduce the work following the README, but I'm getting "TypeError: llama_pos_shift_attention_forward() got an unexpected keyword argument 'padding_mask'" at line 103 of run_streaming_llama.py. I'm not sure how to debug this, since there is no padding_mask anywhere in the code.

tomaarsen commented 1 year ago

You must downgrade transformers to 4.33.0:

pip install transformers==4.33.0
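
To confirm which version is active afterwards (just a quick sanity check, not part of the repo):

import transformers

print(transformers.__version__)  # expect 4.33.0 after the downgrade
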
alexcannan commented 1 year ago

I simply added a padding_mask parameter to streaming_llm/pos_shift/modify_llama.llama_pos_shift_attention_forward and it works as expected; that's probably all that's needed for compatibility with the latest transformers package.

I only guessed on the typing:

from typing import Optional, Tuple

import torch


def llama_pos_shift_attention_forward(
    self,
    hidden_states: torch.Tensor,
    attention_mask: Optional[torch.Tensor] = None,
    position_ids: Optional[torch.LongTensor] = None,
    past_key_value: Optional[Tuple[torch.Tensor]] = None,
    output_attentions: bool = False,
    use_cache: bool = False,
    padding_mask: Optional[torch.Tensor] = None,  # accepted for newer transformers, never used
) -> Tuple[torch.Tensor, Optional[torch.Tensor], Optional[Tuple[torch.Tensor]]]:
tomaarsen commented 1 year ago

That's correct. The parameter isn't actually used; it just has to exist. Downgrading transformers also works.
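
If you want the patch to keep working as transformers evolves, another option (just a sketch, not the repo's actual code) is to absorb unknown keyword arguments instead of naming each one:

def llama_pos_shift_attention_forward(
    self,
    hidden_states: torch.Tensor,
    attention_mask: Optional[torch.Tensor] = None,
    position_ids: Optional[torch.LongTensor] = None,
    past_key_value: Optional[Tuple[torch.Tensor]] = None,
    output_attentions: bool = False,
    use_cache: bool = False,
    **kwargs,  # swallows padding_mask and any future keyword arguments; they are simply ignored
) -> Tuple[torch.Tensor, Optional[torch.Tensor], Optional[Tuple[torch.Tensor]]]: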

Guangxuan-Xiao commented 1 year ago

Thank you, @tomaarsen and @alexcannan! I've added the transformers version specification to the README.