Ascend / pytorch

Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch
https://ascend.github.io/docs/

Unsupported data type for HCCL process group #3

Closed: iiacobac closed this issue 11 months ago

iiacobac commented 1 year ago

Please apply the same fix that was done for NCCL to lift the at::kBool limitation, following this update:

https://github.com/pytorch/pytorch/commit/366c014a7799f0b7bbc258fd6c271dadb99d1de0#diff-43cb0f438d3eb35dec0a1680ddc2d01c3ae9277d91aca4c2119d0b9ea80adeb6
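
For anyone hitting this before the adapter ships a fix, a minimal user-side workaround sketch (my own suggestion, not part of the linked commit): transport the bool tensor as uint8 and reduce with MAX, which gives element-wise logical-OR semantics. The all_reduce_bool_or helper name is hypothetical.

import torch
import torch.distributed as dist

def all_reduce_bool_or(t: torch.Tensor) -> torch.Tensor:
    # HCCL rejects bool here, so ship the data as uint8;
    # MAX over {0, 1} implements an element-wise logical OR.
    buf = t.to(torch.uint8)
    dist.all_reduce(buf, op=dist.ReduceOp.MAX)
    return buf.to(torch.bool)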

sunchuhan-930 commented 11 months ago

Hello. We do not support this yet; it may be scheduled into the requirements plan later.

iiacobac commented 11 months ago

Hello, I am Ignacio.

The fix has been applied since my comment.

You can compare https://github.com/Ascend/pytorch/blob/v2.0.3/pytorch1.8.1/src/torch/lib/c10d/ProcessGroupHCCL.cpp, where only these data types are supported:

{at::kChar, HCCL_DATA_TYPE_INT8},
{at::kFloat, HCCL_DATA_TYPE_FP32},
{at::kInt, HCCL_DATA_TYPE_INT32},
{at::kHalf, HCCL_DATA_TYPE_FP16},
{at::kShort, HCCL_DATA_TYPE_INT16},
{at::kLong, HCCL_DATA_TYPE_INT64},

with https://github.com/Ascend/pytorch/blob/master/torch_npu/csrc/distributed/ProcessGroupHCCL.cpp, where at::kBool, among others, is included:

{at::kByte, HCCL_DATA_TYPE_UINT8},
{at::kChar, HCCL_DATA_TYPE_INT8},
{at::kShort, HCCL_DATA_TYPE_INT16},
{at::kInt, HCCL_DATA_TYPE_INT32},
{at::kLong, HCCL_DATA_TYPE_INT64},
{at::kHalf, HCCL_DATA_TYPE_FP16},
{at::kFloat, HCCL_DATA_TYPE_FP32},
{at::kDouble, HCCL_DATA_TYPE_FP64},
{at::kBool, HCCL_DATA_TYPE_UINT8},
{at::kBFloat16, HCCL_DATA_TYPE_BFP16},
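
With the newer mapping, at::kBool is transported as HCCL_DATA_TYPE_UINT8, so a bool collective that used to raise "Unsupported data type for HCCL process group" should now run. A minimal sketch (assuming a working torch_npu install with one process per NPU; the launcher/env setup is omitted):

import torch
import torch_npu  # registers the npu device and the hccl backend
import torch.distributed as dist

dist.init_process_group(backend="hccl")
rank = dist.get_rank()
torch.npu.set_device(rank)

# bool is carried as uint8 on the wire; MAX preserves logical-OR semantics.
flags = torch.tensor([rank == 0, True], dtype=torch.bool, device=f"npu:{rank}")
dist.all_reduce(flags, op=dist.ReduceOp.MAX)
print(rank, flags)
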
sunchuhan-930 commented 11 months ago

BF16 is not supported in the 1.8.1 version, but is supported in the current master version.
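
A quick way to see which dtypes your installed build accepts is to probe a collective per dtype, continuing the sketch above (rank and the initialized process group come from that snippet; that an unsupported dtype surfaces as a RuntimeError is an assumption based on the error in this issue's title):

# Probe dtype support in the installed torch_npu/HCCL build.
for dtype in (torch.bool, torch.bfloat16, torch.float64):
    try:
        t = torch.ones(1, dtype=dtype, device=f"npu:{rank}")
        dist.all_reduce(t)
        print(f"{dtype}: supported")
    except RuntimeError as err:  # assumed failure mode for unsupported dtypes
        print(f"{dtype}: unsupported ({err})")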