Open. publicstaticvo opened this issue 1 year ago.
Hi @publicstaticvo, thank you for reporting this issue. Currently, the Hybrid Engine is only supported for the OPT family of models, but additional model support (including GPT-J) is on our roadmap and in development. I will make sure to update this issue here when support for GPT-J has been added and validated. Thanks!
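Until that support lands, one way to avoid the unsupported path is to turn the Hybrid Engine on only for model families known to work with it. The snippet below is a minimal sketch, not the DeepSpeed-Chat implementation: `build_ds_config` and `HYBRID_ENGINE_MODEL_TYPES` are hypothetical names, the `model_type` strings are assumed to follow the Hugging Face `config.model_type` convention (`"opt"`, `"gptj"`), and the returned dict is only a fragment of a full DeepSpeed config (batch size, optimizer, and precision settings are omitted).

```python
import json

# Assumption: only the OPT family works with the Hybrid Engine,
# as stated in the reply above. Extend this set as support is added.
HYBRID_ENGINE_MODEL_TYPES = {"opt"}


def build_ds_config(model_type: str, zero_stage: int = 2) -> dict:
    """Hypothetical helper: return a partial DeepSpeed config fragment.

    The Hybrid Engine is enabled only when `model_type` is in the
    supported set; every other family falls back to plain ZeRO training.
    """
    use_hybrid_engine = model_type.lower() in HYBRID_ENGINE_MODEL_TYPES
    return {
        "zero_optimization": {"stage": zero_stage},
        "hybrid_engine": {"enabled": use_hybrid_engine},
    }


if __name__ == "__main__":
    # GPT-J actor from this issue: Hybrid Engine stays disabled.
    print(json.dumps(build_ds_config("gptj", zero_stage=2), indent=2))
```

With this gate, the GPT-J actor in this report would run step 3 without the Hybrid Engine, trading its faster generation for compatibility.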
Describe the bug
I am getting the following error while running DeepSpeed-Chat step 3 with the actor model CarperAI/openai_summarize_tldr_sft (GPT-J 6B), the critic model CarperAI/openai_summarize_tldr_rm_checkpoint (GPT-J 6B), and ZeRO stage 2.
ds_report output
System info (please complete the following information):
Additional context
I would like to know whether the pull request at https://github.com/microsoft/DeepSpeed/pull/3256, or a similar fix, would resolve this issue.