The issue comes from the backward computation of `aten.mul` for two complex numbers from DTensors: the result is b + ai when it should be a + bi, i.e. the real and imaginary parts of the gradient are swapped. It is not clear why this happens: by the time aten operations run, the input tensors have been de-sugared and should have nothing to do with DTensor.
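For reference, a minimal sketch (plain Python, no torch or DTensor) of what the correct backward of elementwise complex multiplication looks like, and what the swapped result described above would be. The helper names are hypothetical, chosen only for illustration:

```python
def mul_backward(grad_out: complex, other: complex) -> complex:
    # Correct backward of out = x * y with respect to x:
    # grad_x = grad_out * conj(y).
    return grad_out * other.conjugate()

def swap_real_imag(z: complex) -> complex:
    # The buggy behavior described above: a + bi comes back as b + ai.
    return complex(z.imag, z.real)

grad_out = 1 + 0j
y = 3 + 4j

correct = mul_backward(grad_out, y)  # 3 - 4j
buggy = swap_real_imag(correct)      # -4 + 3j, not equal to the correct gradient
```

A correct implementation would return `correct`; the reported bug produces something equivalent to `buggy`.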
To replicate, put the following code in `pytorch/test/distributed/tensor/parallel/test_tp_examples.py`: