Open yinfangchen opened 7 months ago
I got an `AssertionError: Mask is silently ignored due to the use of a custom kernel` when training GPT-2 with `examples/pretrain_gpt.sh`.
This line leads to the assertion error: https://github.com/bigscience-workshop/Megatron-DeepSpeed/blob/8387ae17c4704f6579f88a84500b535d19d7fbbf/megatron/model/fused_softmax.py#L191
Is this assertion necessary? And is it even correct?
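To make the question concrete, here is a minimal pure-Python sketch of what I think the assertion is guarding against. It is not the Megatron-DeepSpeed code: `causal_softmax` is a hypothetical stand-in for the fused kernel, and I am assuming the assertion sits on a causal path where the kernel builds the upper-triangular mask itself, so any mask passed by the caller could not affect the result.

```python
import math


def causal_softmax(scores, mask=None):
    """Hypothetical stand-in for a fused causal-softmax kernel.

    Assumption: the kernel applies its own upper-triangular (causal) mask,
    so a caller-supplied mask would never be read. The assert turns that
    silent no-op into a loud failure.
    """
    assert mask is None, "Mask is silently ignored due to the use of a custom kernel"
    n = len(scores)
    out = []
    for i, row in enumerate(scores):
        # The causal mask only makes sense for a square (self-attention) matrix.
        assert len(row) == n, "causal mask is only for self attention"
        visible = row[: i + 1]          # position i attends to positions 0..i
        m = max(visible)                # subtract the max for numerical stability
        exps = [math.exp(x - m) for x in visible]
        total = sum(exps)
        # Future positions get probability 0.
        out.append([e / total for e in exps] + [0.0] * (n - i - 1))
    return out


# No mask passed: the built-in causal mask is applied.
probs = causal_softmax([[0.0, 1.0], [2.0, 2.0]])
print(probs)
```

If that reading is right, the assertion is correct for this path, and the error means the caller is passing a mask where none is expected. Calling `causal_softmax(scores, mask=some_mask)` in the sketch raises the same `AssertionError` as in the report.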
Same puzzlement here.