microsoft / DirectML

DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.
MIT License

torch-directml: Tensor contents become 0 when a tensor is moved directly from one DML device to another. #443

Open lshqqytiger opened 1 year ago

lshqqytiger commented 1 year ago

Tested by @Em1tSan.

$ python
>>> import torch, torch_directml
>>> dev0 = torch_directml.device(0)
>>> dev1 = torch_directml.device(1)
>>> ten = torch.randn(1)
>>> ten = ten.to(dev0)
>>> ten
tensor([0.8580], device='privateuseone:0')
>>> ten = ten.to(dev1)
>>> ten
tensor([0.], device='privateuseone:1')

However, the values are preserved if the tensor is moved via the CPU first.

$ python
>>> import torch, torch_directml
>>> dev0 = torch_directml.device(0)
>>> dev1 = torch_directml.device(1)
>>> ten = torch.randn(1)
>>> ten = ten.to(dev0)
>>> ten
tensor([1.6672], device='privateuseone:0')
>>> ten = ten.cpu()
>>> ten
tensor([1.6672])
>>> ten = ten.to(dev1)
>>> ten
tensor([1.6672], device='privateuseone:1')
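Until this is fixed, the CPU round trip shown above can be wrapped in a small helper. This is only a sketch of the workaround, not an official torch-directml API: `move_via_cpu` and `survives_move` are hypothetical names, and the code relies only on the standard `Tensor.cpu()`, `Tensor.to()` and `Tensor.tolist()` methods.

```python
def move_via_cpu(tensor, device):
    """Move `tensor` to `device` by staging it through host memory.

    Hypothetical helper: it avoids the direct device-to-device copy
    that zeroes the data in the transcript above.
    """
    return tensor.cpu().to(device)


def survives_move(original, moved):
    """Return True if both tensors hold the same values.

    Both sides are copied to the CPU first, so the check itself never
    depends on a cross-device copy.
    """
    return original.cpu().tolist() == moved.cpu().tolist()
```

With the names from the sessions above, `ten = move_via_cpu(ten, dev1)` would replace `ten = ten.to(dev1)`. The extra host copy costs some bandwidth, so this is a stopgap rather than a substitute for a working direct transfer.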

P.S. Will torch-directml support data parallelism?

Adele101 commented 1 year ago

Hi @lshqqytiger, thank you for submitting this issue. While I can't provide a timeline for resolution at the moment, please know that your feedback is valuable to us. We will follow up once we have reviewed this issue.