torch-directml: Tensor contents become 0 when a tensor is moved directly from one DML device to another.

microsoft / DirectML

DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.

MIT License

2.21k stars 296 forks

Tested by @Em1tSan.

$ python
>>> import torch, torch_directml
>>> dev0 = torch_directml.device(0)
>>> dev1 = torch_directml.device(1)
>>> ten = torch.randn(1)
>>> ten = ten.to(dev0)
>>> ten
tensor([0.8580], device='privateuseone:0')
>>> ten = ten.to(dev1)
>>> ten
tensor([0.], device='privateuseone:1')

However, the values are preserved if the tensor is moved via the CPU.

$ python
>>> import torch, torch_directml
>>> dev0 = torch_directml.device(0)
>>> dev1 = torch_directml.device(1)
>>> ten = torch.randn(1)
>>> ten = ten.to(dev0)
>>> ten
tensor([1.6672], device='privateuseone:0')
>>> ten = ten.cpu()
>>> ten
tensor([1.6672])
>>> ten = ten.to(dev1)
>>> ten
tensor([1.6672], device='privateuseone:1')
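
Until direct device-to-device copies work, the CPU round trip shown in the second session can be wrapped in a small helper. This is only a sketch of the workaround: `move_via_cpu` is a hypothetical name, not part of torch or torch-directml, and it assumes nothing beyond the standard `Tensor.cpu()` and `Tensor.to()` methods used in the session.

```python
def move_via_cpu(tensor, target_device):
    """Move `tensor` to `target_device`, staging it in host memory first.

    Hypothetical helper (not a torch or torch-directml API). It avoids the
    direct DML-to-DML copy that zeroes the data in the first session.
    """
    # Copy to host memory first, as in the working session above.
    host_copy = tensor.cpu()
    # Then upload from the host to the target device.
    return host_copy.to(target_device)
```

With the names from the sessions above, `ten = move_via_cpu(ten, dev1)` would replace `ten = ten.to(dev1)`. The extra host copy costs time and memory bandwidth, so this is a stopgap rather than a fix.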

P.S. Will torch-directml support data parallelism?


torch-directml: Tensor contents become 0 when a tensor is moved directly from one DML device to another. #443