Hi, I am trying to use fp8 with TransformerEngine. I am using a version of the GPT-NeoX repo, which uses DeepSpeed.
I can get fp8 to run in my MLPs with model parallelism, but when I use pipeline parallelism it hangs with no errors: it simply freezes once training starts (inside the DeepSpeed library call).
I was wondering if anyone else has run into this issue. Thanks!