huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Mismatched keyword argument names of llama make GA fix invalid #34577

Open · hiyouga opened this issue 3 weeks ago

hiyouga commented 3 weeks ago

System Info

Who can help?

@ArthurZucker @muellerzr

Information

Tasks

Reproduction

https://github.com/huggingface/transformers/pull/33932 may break the trainer's model_accepts_loss_kwargs logic. The llama model no longer receives the num_items_in_batch argument, which makes the fix from https://github.com/huggingface/transformers/pull/34283 invalid again.

https://github.com/huggingface/transformers/blob/33868a057c02f0368ba63bd1edb746be38fe3d90/src/transformers/trainer.py#L605
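For context, the linked trainer line decides whether to forward num_items_in_batch with a name-based signature check, roughly along these lines (a paraphrase, not the exact trainer code; the helper name is mine):

```python
import inspect

def accepts_loss_kwargs(model) -> bool:
    # Rough paraphrase of the name-based check: it looks for a parameter
    # literally called "loss_kwargs" in the model's forward signature.
    # A model that collects extras as **kwargs instead (as llama does
    # after #33932) fails the check, so the trainer never passes
    # num_items_in_batch and the gradient-accumulation loss fix is skipped.
    return "loss_kwargs" in inspect.signature(model.forward).parameters
```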

Moreover, the keyword argument names differ between llama and other models; we would expect different models to accept the same keyword arguments (the two signatures are paraphrased below).

https://github.com/huggingface/transformers/blob/33868a057c02f0368ba63bd1edb746be38fe3d90/src/transformers/models/llama/modeling_llama.py#L1146-L1161

https://github.com/huggingface/transformers/blob/33868a057c02f0368ba63bd1edb746be38fe3d90/src/transformers/models/gemma/modeling_gemma.py#L1015-L1030
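Paraphrasing the two linked signatures (most parameters elided; see the permalinks for the exact definitions), llama now collects the extra keyword arguments under **kwargs while gemma still uses **loss_kwargs:

```python
# modeling_llama.py (paraphrased): extras arrive as **kwargs
def llama_forward(self, input_ids=None, labels=None, num_logits_to_keep=0, **kwargs):
    ...

# modeling_gemma.py (paraphrased): extras arrive as **loss_kwargs
def gemma_forward(self, input_ids=None, labels=None, num_logits_to_keep=0, **loss_kwargs):
    ...
```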

Expected behavior

The models' forward functions should have a consistent keyword argument list.

ArthurZucker commented 1 week ago

cc @muellerzr I think when reviewing I mentioned that checking the type rather than the names (i.e. the annotation of the star kwargs) would easily fix this!
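For illustration, a check keyed on the annotation of the var-keyword parameter rather than its name could look roughly like this (a sketch, not existing trainer code; it assumes the shared loss-kwargs TypedDict, e.g. KwargsForCausalLM, declares num_items_in_batch):

```python
import inspect
import typing

def accepts_loss_kwargs_by_type(model) -> bool:
    # Look for a **kwargs-style (VAR_KEYWORD) parameter whose annotated
    # TypedDict declares num_items_in_batch, regardless of what the
    # parameter itself is called.
    for param in inspect.signature(model.forward).parameters.values():
        if param.kind is not inspect.Parameter.VAR_KEYWORD:
            continue
        annotation = param.annotation
        # Unpack[SomeTypedDict] exposes the inner TypedDict via get_args();
        # a bare annotation is used as-is.
        candidates = typing.get_args(annotation) or (annotation,)
        return any(
            "num_items_in_batch" in getattr(t, "__annotations__", {})
            for t in candidates
        )
    return False
```

With a check like this, the trainer could pass num_items_in_batch whenever it returns True, independent of whether the model names the parameter loss_kwargs or kwargs.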