Open hiyouga opened 3 weeks ago
@ArthurZucker @muellerzr
examples
https://github.com/huggingface/transformers/pull/33932 may breaks the logic for the trainer's model_accepts_loss_kwargs. The llama model would not receive a num_items_in_batch argument, making the fix of https://github.com/huggingface/transformers/pull/34283 invalid again
model_accepts_loss_kwargs
num_items_in_batch
https://github.com/huggingface/transformers/blob/33868a057c02f0368ba63bd1edb746be38fe3d90/src/transformers/trainer.py#L605
Moreover, the names of keyword arguments are different for llama and other models, we might expect the same keyword arguments for different models.
https://github.com/huggingface/transformers/blob/33868a057c02f0368ba63bd1edb746be38fe3d90/src/transformers/models/llama/modeling_llama.py#L1146-L1161
https://github.com/huggingface/transformers/blob/33868a057c02f0368ba63bd1edb746be38fe3d90/src/transformers/models/gemma/modeling_gemma.py#L1015-L1030
The models' forward functions should have a consistent keyword argument list.
cc @muellerzr I think when reviewing I mentioned something about checking the type rather than names (like start kwarg type) would easily fix this!
System Info
Who can help?
@ArthurZucker @muellerzr
Information
Tasks
examples
folder (such as GLUE/SQuAD, ...)Reproduction
https://github.com/huggingface/transformers/pull/33932 may breaks the logic for the trainer's
model_accepts_loss_kwargs
. The llama model would not receive anum_items_in_batch
argument, making the fix of https://github.com/huggingface/transformers/pull/34283 invalid againhttps://github.com/huggingface/transformers/blob/33868a057c02f0368ba63bd1edb746be38fe3d90/src/transformers/trainer.py#L605
Moreover, the names of keyword arguments are different for llama and other models, we might expect the same keyword arguments for different models.
https://github.com/huggingface/transformers/blob/33868a057c02f0368ba63bd1edb746be38fe3d90/src/transformers/models/llama/modeling_llama.py#L1146-L1161
https://github.com/huggingface/transformers/blob/33868a057c02f0368ba63bd1edb746be38fe3d90/src/transformers/models/gemma/modeling_gemma.py#L1015-L1030
Expected behavior
The models' forward functions should have a consistent keyword argument list.