Closed MaciejBalaNV closed 1 month ago
@xrennvidia and @cyanguwa Could you take a look?
Hi @MaciejBalaNV
Thanks for reaching out. Could you please clarify what seqlen=1 means? Does it mean your total sequence length is 1, or that the sequence chunk length on each GPU is 1?
With CP, you need to ensure your total sequence length is divisible by CP*2, so that each GPU gets at least 2 tokens.
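The divisibility constraint above can be sketched as a small validation helper. This is a hypothetical illustration, not TE's actual API; the function name and message are assumptions:

```python
def check_cp_seqlen(seq_len: int, cp_size: int) -> None:
    # With load-balanced context parallelism (CP), each sequence is split
    # into 2*cp_size chunks, so the total length must divide evenly.
    if seq_len % (2 * cp_size) != 0:
        raise ValueError(
            f"seq_len={seq_len} must be divisible by 2*cp_size={2 * cp_size} "
            f"so that each GPU gets at least 2 tokens."
        )

check_cp_seqlen(4096, cp_size=8)  # OK: 4096 % 16 == 0
```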
Hey @xrennvidia, yes, this assert looks correct for the bug I observed. Thanks, I believe we can close this issue.
TE attention breaks if the sequence length is equal to 1, e.g. here. This was not the case before this commit. If the seqlen==1 case is not supported, it would be nice to add a check with a meaningful error message.
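The suggested check could look roughly like the sketch below. The function name and wording are hypothetical, not TE's actual code:

```python
def validate_seqlen(seq_len: int) -> None:
    # Fail early with a clear message instead of a confusing downstream
    # error when an unsupported length-1 sequence reaches attention.
    if seq_len <= 1:
        raise ValueError(
            f"Attention received seq_len={seq_len}; sequences of length 1 "
            "are not supported. Pad the input to at least 2 tokens."
        )

validate_seqlen(128)  # passes silently for supported lengths
```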