huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

the attention output from llama2 generate differs from other llama models #31984

Closed manyuanbin closed 2 weeks ago

manyuanbin commented 1 month ago

System Info

transformers: 4.41.2
os: ubuntu
python: 3.10.14

Who can help?

No response

Information

Tasks

Reproduction

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = "path/to/vicuna/model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Ensure the model returns attentions
model.config.output_attentions = True
model.config.return_dict = True

input_text = "Once upon a time"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

outputs = model.generate(
    input_ids,
    max_length=50,
    output_attentions=True,
    return_dict_in_generate=True,
)

generated_ids = outputs.sequences
attention_weights = outputs.attentions

generated_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print(generated_text)

# Print the shape of the attention weights for each layer
for layer_idx, layer_attention in enumerate(attention_weights):
    print(f"Layer {layer_idx} attention shape: {layer_attention.shape}")
```
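Note that `generate` nests the attentions one level deeper than a plain forward pass: `outputs.attentions` is a tuple with one entry per generation step, and each entry is itself a tuple of per-layer tensors. With the default KV cache, only the first step attends over the full prompt; later steps have a query length of 1, which is why the shapes differ from the documented `(batch_size, num_heads, sequence_length, sequence_length)`. A minimal sketch of how to inspect that structure, reusing the checkpoint from the comment below:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "lmsys/vicuna-7b-v1.1"  # same checkpoint as in the report
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_ids = tokenizer("Once upon a time", return_tensors="pt").input_ids

outputs = model.generate(
    input_ids,
    max_length=50,
    output_attentions=True,
    return_dict_in_generate=True,
)

# outputs.attentions: one entry per generated token,
# each entry a tuple with one tensor per decoder layer
for step_idx, step_attentions in enumerate(outputs.attentions):
    # step 0 (prefill): (batch, num_heads, prompt_len, prompt_len)
    # later steps (with cache): (batch, num_heads, 1, prompt_len + step_idx)
    print(f"step {step_idx}, layer 0 shape: {step_attentions[0].shape}")
```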

Expected behavior

(screenshot attached)
manyuanbin commented 1 month ago

model_name = 'lmsys/vicuna-7b-v1.1'

manyuanbin commented 1 month ago

expected:

attentions (tuple(jnp.ndarray), optional, returned when output_attentions=True is passed or when config.output_attentions=True) — Tuple of jnp.ndarray (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
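For comparison, a plain forward call (rather than `generate`) does return that documented per-layer shape directly. A minimal PyTorch sketch, assuming the same prompt and checkpoint as above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "lmsys/vicuna-7b-v1.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Once upon a time", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_attentions=True, return_dict=True)

# out.attentions: one tensor per layer, each of shape
# (batch_size, num_heads, sequence_length, sequence_length)
for layer_idx, layer_attention in enumerate(out.attentions):
    print(f"Layer {layer_idx} attention shape: {layer_attention.shape}")
```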

ArthurZucker commented 1 month ago

hey! Can you update to the latest version of transformers and provide a proper reproduction script? 🤗

github-actions[bot] commented 3 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.