microsoft / DeepSpeedExamples

Example models using DeepSpeed
Apache License 2.0

The min_length setting forces the model to generate up to the max length, which produces repeated or nonsensical results #539

Open TheEighthDay opened 1 year ago

TheEighthDay commented 1 year ago

https://github.com/microsoft/DeepSpeedExamples/blob/8f8099a813f3b223d5df39e0c15c748de4eb1669/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/ppo_trainer.py#L76

When I tried to reproduce the results with BLOOM, I ran into the same problem: "The min_length setting forces the model to generate up to the max length, which produces repeated or nonsensical results." (Related: "fix ppo_trainer generate and scores calculation in stage 2".)

So I tried removing the min_length setting, but then the program cannot continue past https://github.com/microsoft/DeepSpeedExamples/blob/8f8099a813f3b223d5df39e0c15c748de4eb1669/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/ppo_trainer.py#L105
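
One possible workaround, sketched here under the assumption that the downstream code expects every generated sequence to have the same fixed length: let generation stop early at EOS, then right-pad each sequence back to the maximum length and keep a mask of the real tokens. The helper name `pad_to_fixed_length` and its parameters are hypothetical and not part of DeepSpeed-Chat. The sketch uses plain Python lists instead of tensors to stay self-contained.

```python
from typing import List, Tuple


def pad_to_fixed_length(
    sequences: List[List[int]],
    max_length: int,
    pad_token_id: int,
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad variable-length token ID lists to `max_length`.

    Hypothetical helper (not from DeepSpeed-Chat). Returns the padded
    sequences and a matching attention mask (1 = real token, 0 = padding).
    """
    padded, masks = [], []
    for seq in sequences:
        seq = seq[:max_length]  # truncate anything longer than max_length
        n_pad = max_length - len(seq)
        padded.append(seq + [pad_token_id] * n_pad)
        masks.append([1] * len(seq) + [0] * n_pad)
    return padded, masks


# Two sequences that stopped at different lengths, padded to length 4.
padded, masks = pad_to_fixed_length([[5, 6, 7], [8]], max_length=4, pad_token_id=0)
```

Whether this unblocks the trainer depends on what the linked line actually requires, so treat it as a starting point rather than a confirmed fix.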

Junyiliu0 commented 1 year ago

Did you solve the problem? I ran into the same issue.