huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

GenerationConfig is not handled correctly when saving multi-task models #33398

Closed B-Step62 closed 6 hours ago

B-Step62 commented 1 week ago

System Info

Who can help?

@ArthurZucker @Rocketknight1

Information

Tasks

Reproduction

  1. Load T5 model.

    model = T5ForConditionalGeneration.from_pretrained("t5-small")
  2. Construct a pipeline from it for inference

    pipe = transformers.pipeline(
        model=model,
        tokenizer=transformers.T5TokenizerFast.from_pretrained("t5-small", model_max_length=100),
        task="translation_en_to_de",
    )
  3. Save the pretrained weights locally (typically done after fine-tuning, but done immediately here for quick reproduction)

    model.save_pretrained("/tmp/transformers/t5")
  4. The saved config.json file contains "early_stopping": null. This key is not present in the original config.json in the T5 model repository.

    {
        "_name_or_path": "t5-small",
        ...
        "early_stopping": null,
    }
  5. Model loading fails due to this null value.

    T5ForConditionalGeneration.from_pretrained("/tmp/transformers/t5")
    # >  ValueError: `early_stopping` must be a boolean or 'never', but is None.
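Until this is fixed upstream, one possible workaround (not part of the original report) is to drop null-valued generation keys from the saved config.json before reloading. A minimal sketch, assuming the file layout shown above; the helper name is made up for illustration:

```python
import json
from pathlib import Path

def strip_null_generation_keys(config_path):
    """Drop generation keys that were saved as null.

    `early_stopping` must be a boolean or 'never'; a null value makes
    config validation raise on load, so keys that are present but null
    are simply removed here.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    for key in ("early_stopping",):
        if key in config and config[key] is None:
            del config[key]
    path.write_text(json.dumps(config, indent=2))
    return config
```

After running this on `/tmp/transformers/t5/config.json`, `from_pretrained` should no longer trip over the null value (other null generation keys could be added to the tuple as needed).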

A few side notes

Expected behavior

The saved config.json file should not contain an "early_stopping" key at the top level. It should be defined only under task_specific_params, as in the original config.
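For illustration, the expected layout keeps generation options nested under the task entry rather than at the top level (a sketch only; the exact values in the t5-small config on the Hub may differ):

```json
{
    "_name_or_path": "t5-small",
    "task_specific_params": {
        "translation_en_to_de": {
            "early_stopping": true
        }
    }
}
```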

LysandreJik commented 1 week ago

cc @gante regarding the generation config

gante commented 5 days ago

Hi @B-Step62 👋 Thank you for opening this issue!

Indeed, there are multiple issues here:

The question of task_specific_params is a bit more tricky: the generation config is meant to replace them, but we do not have the authority to update all Hub models :) Inside the pipeline code, we already load all task parameters into model.generation_config
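As a rough illustration of that merging step (the function name and dict-based representation are illustrative stand-ins, not the actual transformers pipeline internals):

```python
def apply_task_params(generation_config: dict, config: dict, task: str) -> dict:
    """Overlay a task's task_specific_params onto a generation config.

    Loosely mimics how the pipeline copies task-specific parameters
    (e.g. those under "translation_en_to_de") into
    model.generation_config; a simplified sketch, not the real code.
    """
    task_params = (config.get("task_specific_params") or {}).get(task, {})
    merged = dict(generation_config)
    merged.update(task_params)  # task-level values win over defaults
    return merged
```

With the t5-small-style config above, the task's `early_stopping` value would end up in the merged generation config instead of living at the top level of config.json.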