microsoft / DeepSpeedExamples

Example models using DeepSpeed
Apache License 2.0

Error encountered when running the RLHF script #311

Open liuzhiyong01 opened 1 year ago

liuzhiyong01 commented 1 year ago

(two images attached)

My environment settings: deepspeed==0.9.0, torch==2.0.0+cu117, CUDA version 11.0. The pretrained model is facebook/opt-350m.

Can anyone help me solve this problem? Thanks.

liuzhiyong01 commented 1 year ago

When I set enable_hybrid_engine=False, the problem goes away. What is the reason?
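For anyone trying the same workaround, here is a minimal sketch of where the switch could end up in a DeepSpeed-style config dict. It assumes the flag maps to the `hybrid_engine.enabled` entry of the config; `build_ds_config` is a hypothetical helper written for illustration, not a function from this repo, and the batch size and ZeRO stage are placeholder values.

```python
import json


def build_ds_config(enable_hybrid_engine: bool = False) -> dict:
    """Hypothetical helper: build a minimal DeepSpeed-style config dict.

    Assumption: the hybrid engine is toggled through the
    "hybrid_engine" -> "enabled" entry. Other values are placeholders.
    """
    return {
        "train_micro_batch_size_per_gpu": 4,  # placeholder value
        "zero_optimization": {"stage": 2},    # placeholder value
        "hybrid_engine": {"enabled": enable_hybrid_engine},
    }


# Workaround from this thread: keep the hybrid engine turned off.
config = build_ds_config(enable_hybrid_engine=False)
print(json.dumps(config, indent=2))
```

This only shows the setting itself; it does not explain why the error appears when the hybrid engine is enabled.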

nbl97 commented 1 year ago

+1, same issue

geldarr commented 1 year ago

+1