huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers

[DOCS] Model outputs of RecurrentGemmaForCausalLM don't align with the documentation #30736

Closed godjw closed 4 months ago

godjw commented 5 months ago

System Info

Latest docs on https://huggingface.co/docs/transformers/main/model_doc/recurrent_gemma#transformers.RecurrentGemmaForCausalLM

Who can help?

@ArthurZucker, @younesbelkada, and @stevhliu

Reproduction

The documentation for RecurrentGemmaForCausalLM's forward method states:

Returns transformers.modeling_outputs.CausalLMOutput or tuple(torch.FloatTensor) with attentions (tuple(torch.FloatTensor), optional, returned when output_attentions=True is passed or when config.output_attentions=True) — Tuple of torch.FloatTensor (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).

However, examining the source code for RecurrentGemma and RecurrentGemmaForCausalLM reveals that the model does not accept output_attentions=True and, contrary to the documentation, does not return any attention values. The relevant sections of the source code can be found here: https://github.com/huggingface/transformers/blob/47735f5f0f2752500d115d2f6bd57816032599b6/src/transformers/models/recurrent_gemma/modeling_recurrent_gemma.py#L744-L747

https://github.com/huggingface/transformers/blob/47735f5f0f2752500d115d2f6bd57816032599b6/src/transformers/models/recurrent_gemma/modeling_recurrent_gemma.py#L895-L899
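A minimal sketch of the mismatch, for illustration. The google/recurrentgemma-2b checkpoint is used as an example (access to the gated weights is assumed), and the exact failure mode when passing output_attentions is an assumption based on the forward signature linked above:

```python
# Minimal sketch of the documentation/implementation mismatch described above.
# Assumes the "google/recurrentgemma-2b" checkpoint is accessible; any
# RecurrentGemma checkpoint should behave the same way.
import torch
from transformers import AutoTokenizer, RecurrentGemmaForCausalLM

model_id = "google/recurrentgemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = RecurrentGemmaForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# The `attentions` field documented for CausalLMOutput is never populated.
print(outputs.attentions)  # expected: None

# Per the linked source, `output_attentions` is not part of the forward
# signature, so passing it is expected to fail rather than return weights.
try:
    model(**inputs, output_attentions=True)
except TypeError as err:
    print(f"TypeError: {err}")
```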

Expected behavior

The documentation should match the implementation: either the attentions entry is removed from the documented return value of RecurrentGemmaForCausalLM.forward, or the model gains support for output_attentions.

ArthurZucker commented 5 months ago

Hey! Would you like to open a PR for this? 🤗 The documentation should be modified, IMO, unless people ask for output-attentions logic, which would be a feature request.
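For context, the misleading text likely comes from the generic output class rather than from hand-written docs: the Returns section appears to be rendered from transformers.modeling_outputs.CausalLMOutput, whose docstring always describes an attentions field, whether or not a given model can populate it. A quick way to check, assuming only an installed transformers:

```python
# The Returns section of the model page is rendered from the declared output
# class. CausalLMOutput always documents an `attentions` field, even for
# models that never compute attention weights.
from transformers.modeling_outputs import CausalLMOutput

print(CausalLMOutput.__doc__)  # expected to include the `attentions` description quoted above
print(list(CausalLMOutput.__dataclass_fields__))  # expected to include 'attentions'
```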

github-actions[bot] commented 4 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.