K-Wu / pytorch-direct_dgl

PyTorch-Direct code on top of PyTorch-1.8.0nightly (e152ca5) for Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture (accepted by PVLDB)
https://arxiv.org/abs/2103.03330

Hello, why is it displayed that unified cannot be recognized #2

Open SC1114 opened 1 year ago

SC1114 commented 1 year ago

Using backend: pytorch
Process SpawnProcess-1:
Traceback (most recent call last):
  File "/home/csarch/anaconda3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/csarch/anaconda3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/csarch/pytorch-direct/dgl/examples/pytorch/graphsage/train_sampling_pytorch_direct.py", line 124, in producer
    train_nfeat = train_nfeat.to(device="unified")
RuntimeError: Expected one of cpu, cuda, xpu, mkldnn, opengl, opencl, ideep, hip, msnpu, mlc, xla, vulkan, meta, hpu device type at start of device string: unified

K-Wu commented 1 year ago

Hello,

Could you please check if you have installed the modified PyTorch we provided as a submodule at https://github.com/K-Wu/pytorch-direct/tree/ec7bdb5389ed8c9724bf257267709e43bbb4325c? If you are not sure about that, please tell us what you see when you import PyTorch and print its version number in an interactive shell.
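For reference, a minimal sketch of that check. The helper name `describe_torch_build` is hypothetical, and the sketch assumes that the modified build accepts `torch.device("unified")` because the example script calls `.to(device="unified")`. A stock PyTorch build rejects that device string with the `RuntimeError` shown in the report above. The version string the modified build prints is not known here, so compare it against what you expect from your own build.

```python
import importlib.util


def describe_torch_build():
    """Report which PyTorch build is importable and whether it parses "unified".

    Hypothetical helper for illustration; it is not part of PyTorch-Direct.
    """
    # Avoid an ImportError if PyTorch is not installed in this environment.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False, "version": None, "accepts_unified": False}

    import torch

    try:
        # Stock PyTorch raises RuntimeError for an unknown device type.
        # Assumption: the modified build parses this string successfully.
        torch.device("unified")
        accepts_unified = True
    except RuntimeError:
        accepts_unified = False

    return {
        "installed": True,
        "version": torch.__version__,
        "accepts_unified": accepts_unified,
    }


if __name__ == "__main__":
    print(describe_torch_build())
```

If `accepts_unified` comes back `False`, the interpreter is most likely importing a stock PyTorch rather than the build from the submodule.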

Thank you.

SC1114 commented 1 year ago

Hi, I saw that UnifiedTensor can be used by passing uva for the --graph-device (for the graph structure, e.g. CSR) and --data-device (for the node feature tensor) arguments. What is the specific operation? Could you provide an example? Thank you so much!

K-Wu commented 1 year ago

If you are talking about the DGL UVA optimization available since v0.8, detailed at https://github.com/dmlc/dgl/releases/tag/0.8.0, please refer to the DGL documentation: that feature was implemented from scratch and is independent of the prototype we made available here. I personally haven't had a chance to use it.