Closed cowanmeg closed 2 days ago
Sequence parallel forward transformer layer and multi-headed attention tests.
!build
check DistributedTransformerTest.MultiheadAttention_SP/__half
!test