I'm trying to reproduce the work per the README, but I'm getting "TypeError: llama_pos_shift_attention_forward() got an unexpected keyword argument 'padding_mask'" at line 103 of run_streaming_llama.py. I have no idea how to debug this, since there is no padding_mask anywhere in the code.
You must downgrade transformers to 4.33.0:

pip install transformers==4.33.0
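A quick way to confirm the pin took effect in the environment you actually run from (a sanity check, not part of the repo):

import transformers

# streaming-llm's attention patch was written against the 4.33 interface;
# later releases pass an extra padding_mask kwarg to attention forwards,
# which is what triggers the TypeError above.
print(transformers.__version__)  # expect: 4.33.0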
I simply added a padding_mask parameter to the streaming_llm/pos_shift/modify_llama.llama_pos_shift_attention_forward function and it works as expected; that's probably all that's needed for compatibility with the latest transformers package.
Only guessed on the typing:

from typing import Optional, Tuple

import torch

def llama_pos_shift_attention_forward(
    self,
    hidden_states: torch.Tensor,
    attention_mask: Optional[torch.Tensor] = None,
    position_ids: Optional[torch.LongTensor] = None,
    past_key_value: Optional[Tuple[torch.Tensor]] = None,
    output_attentions: bool = False,
    use_cache: bool = False,
    padding_mask: Optional[torch.Tensor] = None,  # new: accepted but never read
) -> Tuple[torch.Tensor, Optional[torch.Tensor], Optional[Tuple[torch.Tensor]]]:
    ...  # function body unchanged from the repo
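For context on where this signature lives: the repo installs it as a monkey patch over LlamaAttention.forward, so the function must accept every keyword the installed transformers version passes. A minimal sketch of that pattern, assuming the helper name and module traversal rather than quoting the repo's exact code:

import types

from transformers.models.llama.modeling_llama import LlamaAttention

def enable_llama_pos_shift_attention(model):
    # Walk the model and swap the pos-shift forward onto every
    # LlamaAttention module. Binding with types.MethodType keeps
    # `self` pointing at the attention module, like a normal method.
    for module in model.modules():
        if isinstance(module, LlamaAttention):
            module.forward = types.MethodType(
                llama_pos_shift_attention_forward, module
            )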
That's correct. The parameter isn't actually used; it just has to exist. Downgrading transformers also works.
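To see why merely declaring the parameter is enough, here is plain Python keyword-argument behavior in a toy example (no transformers involved):

def patched(x, padding_mask=None):  # declares the kwarg, never reads it
    return x

def unpatched(x):
    return x

patched(1, padding_mask=None)    # fine: the argument is accepted and ignored
unpatched(1, padding_mask=None)  # TypeError: unexpected keyword argument 'padding_mask'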
Thank you, @tomaarsen and @alexcannan! I've added a transformers version specification to the README.