huggingface / trl

Train transformer language models with reinforcement learning.
http://hf.co/docs/trl
Apache License 2.0

RuntimeError: size mismatch when train with llama 7b/30b (deepspeed zero3) #360

Closed (by kebijuelun, 1 year ago)

kebijuelun commented 1 year ago

Does anyone have any idea about this error? Or can anyone provide an example DeepSpeed config that runs successfully?

Thanks.
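For reference, a minimal ZeRO-3 JSON config in the style of the HuggingFace Transformers DeepSpeed integration docs might look like the sketch below. The `"auto"` values are filled in by the Trainer/Accelerate at launch time, and the filename `ds_zero3.json` is hypothetical; this is a starting point, not a config confirmed to fix this particular size mismatch.

```json
{
  "bf16": { "enabled": "auto" },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "reduce_bucket_size": "auto",
    "stage3_prefetch_bucket_size": "auto",
    "stage3_param_persistence_threshold": "auto",
    "stage3_gather_16bit_weights_on_model_save": true
  },
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto"
}
```

Note that `stage3_gather_16bit_weights_on_model_save: true` matters here: without it, ZeRO-3 saves sharded weights, which is a common source of size-mismatch errors when reloading checkpoints.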

kebijuelun commented 1 year ago

I also hit errors when training the RL model with accelerate and DeepSpeed, and the error changes depending on the transformers version.

Could anyone share an example DeepSpeed ZeRO-3 config, plus the exact environment (package versions), that trains the reward model and RL stages successfully? Thanks.
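To wire a ZeRO-3 JSON config into accelerate, an accelerate config file along these lines could be used. This is a hedged sketch: the referenced `ds_zero3.json` filename and the machine/process counts are assumptions to be adapted to the actual cluster, not values from this thread.

```yaml
# accelerate config (e.g. ~/.cache/huggingface/accelerate/default_config.yaml)
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  deepspeed_config_file: ds_zero3.json   # hypothetical path to the ZeRO-3 JSON
  zero3_init_flag: true                  # init large models directly in ZeRO-3 partitions
mixed_precision: bf16
num_machines: 1
num_processes: 8                          # one process per GPU
```

The training script would then be launched with `accelerate launch <script.py>`, which picks up this config automatically.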

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.