TypeError: UbufP2PCommOverlap(): incompatible function arguments.

NVIDIA / TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.

Apache License 2.0


Hello,

When I enable --sequence-parallel and --tp-comm-overlap and start training, I get the following error:

TypeError: UbufP2PCommOverlap(): incompatible function arguments. The following argument types are supported:

() -> None

Invoked with: tensor([[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]], device='cuda:3', dtype=torch.uint8), 3, 2, 16, 2, 0, 0, 3, 0, 0, tensor([])

How can I fix this? Thanks.
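An error of this shape is raised by a pybind11 binding when the arguments passed from Python match none of the signatures compiled into the extension. One possible cause (an assumption, not something confirmed in this report) is that the code calling UbufP2PCommOverlap and the installed TransformerEngine build come from different versions. A minimal sketch for collecting the installed versions to include in a report is below; the distribution names in the list are assumptions and may differ in your environment.

```python
from importlib import metadata


def installed_versions(packages):
    """Return a dict mapping each distribution name to its version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed under this name.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed distribution names; adjust them to match your setup.
    candidates = ["transformer_engine", "torch", "megatron-core"]
    for name, version in installed_versions(candidates).items():
        print(f"{name}: {version or 'not installed'}")
```

Posting this output together with the traceback makes it easier for others to check whether the caller and the compiled extension agree on the constructor signature.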


TypeError: UbufP2PCommOverlap(): incompatible function arguments. #870