huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Different results obtained using pipeline (worse) vs. model.generate under the same decoding strategy #33697

Open kirk86 opened 3 days ago

kirk86 commented 3 days ago

System Info

transformers 4.44, Python 3.12, Linux

Who can help?

@Rocketknight1 @gante

Reproduction

  1. Train a small model such as T5 on a small synthetic dataset for summarization.
  2. Evaluate the model on the test set.
  3. Results match the validation accuracy from the training logs when using model.generate with the default decoding strategy (greedy search, i.e. do_sample=False, num_beams=1).
  4. Trying to replicate the same evaluation with the pipeline gives results that are completely off (by roughly ±50%); see the sketch after this list.
  5. Double-checked that all parameters are set to the same default values as in model.generate; the issue persists.
  6. Iterated through all available decoding strategies and the pipeline results never line up.
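
A minimal sketch of the comparison being described (not the reporter's actual script): it contrasts model.generate with the summarization pipeline under the same greedy decoding settings. The checkpoint "google-t5/t5-small" and the input text are stand-ins, since the trained model and synthetic data are not shared in the report.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

# Stand-in checkpoint; the issue uses a privately trained T5 model.
checkpoint = "google-t5/t5-small"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

text = "summarize: The quick brown fox jumped over the lazy dog near the river bank."

# Path 1: model.generate with explicit greedy decoding.
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, do_sample=False, num_beams=1, max_new_tokens=64)
generate_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Path 2: the summarization pipeline with the same generation arguments.
summarizer = pipeline("summarization", model=model, tokenizer=tokenizer)
pipeline_text = summarizer(text, do_sample=False, num_beams=1, max_new_tokens=64)[0]["summary_text"]

print("generate:", generate_text)
print("pipeline:", pipeline_text)
# According to the report, the outputs (and downstream metrics) diverge between the two paths.
```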

Expected behavior

Expected the pipeline to behave the same as model.generate when using the same underlying decoding strategy.

Rocketknight1 commented 3 days ago

Hi @kirk86 - could you upload your model (or reproduce the issue with an existing T5 model on the Hub), then share a short but complete reproducer script that shows the discrepancy? It'll make it a lot easier for us to track down any bugs!