Closed chaitjo closed 3 months ago
Thanks for bringing this up! That's a good point, I think we've been taking a lot of the dependencies for granted and we'll update the documentation.
Nominally, as of a few versions ago, a lot of PyG's core functionality has been upstreamed into PyTorch (e.g. the `torch_scatter` stuff), but not everything. That means that, for the most part, PyG by itself should work out of the box on XPUs; however, functionality that lives in the external packages (`torch_scatter`, `torch_cluster`, `torch_sparse`) isn't supported yet. So the error you're seeing is basically because the low-level implementation of `knn_graph` only exists for CUDA and for CPUs, and it's expecting a tensor that resides on the latter.
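Since the missing piece is a device-specific kernel, one possible stopgap is to build the kNN graph from core PyTorch ops only. The following is a rough sketch, not the `torch_cluster` implementation: the function name `knn_graph_dense` is made up for illustration, it materialises the full N x N distance matrix (so it only suits small point clouds), and it assumes that `torch.cdist`, `fill_diagonal_`, and `topk` are available on the XPU backend, which I have not verified.

```python
import torch


def knn_graph_dense(x, k, batch=None):
    """Dense O(N^2) kNN graph built from core PyTorch ops only.

    Hypothetical helper, not part of PyG or torch_cluster.
    Returns a [2, E] tensor: row 0 holds neighbour (source) indices,
    row 1 holds the centre (target) node each neighbour belongs to.
    """
    n = x.size(0)
    dist = torch.cdist(x, x)  # [N, N] pairwise Euclidean distances
    # Exclude self-loops: each node is infinitely far from itself.
    dist.fill_diagonal_(float("inf"))
    if batch is not None:
        # Forbid edges between nodes that belong to different graphs.
        different_graph = batch[:, None] != batch[None, :]
        dist = dist.masked_fill(different_graph, float("inf"))
    # A node cannot have more than N - 1 neighbours.
    k_eff = max(min(k, n - 1), 0)
    nbr_dist, nbr_idx = dist.topk(k_eff, dim=1, largest=False)
    centre = torch.arange(n, device=x.device).unsqueeze(1).expand_as(nbr_idx)
    # Drop slots that could only be filled by excluded (infinite) pairs,
    # e.g. when a graph in the batch has fewer than k + 1 nodes.
    valid = torch.isfinite(nbr_dist)
    return torch.stack([nbr_idx[valid], centre[valid]], dim=0)
```

Calling `knn_graph_dense(pos, k=16, batch=batch)` keeps everything on whatever device `pos` lives on. The edge ordering is meant to mirror the `source_to_target` convention as I understand it, so double-check it against your model before swapping it in.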
I'm not 100% sure what our plans are for supporting those supplementary libraries, and so they might need to be treated on a case-by-case basis. Please reach out to me via email or Slack and we can discuss this further (even if it's not `matsciml`-related). I'll keep this issue up still, since I agree we do need to update our PyG + XPU instructions.
Thanks!
What's the current recommended way to install PyG?
I'm currently using:
pip install torch_geometric
pip install torch-scatter torch-cluster
...and this seems fine unless I need some of the functions from torch-cluster to run on tensors located on XPUs. PyG's documentation also states that torch-scatter and torch-cluster 'come with their own CPU and GPU kernel implementations based on the PyTorch C++/CUDA/hip(ROCm) extension interface.' So I suppose there's no real fix yet for my particular use case apart from shifting my computation to the CPU.
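For reference, the CPU round trip could be wrapped in a small helper along these lines. This is only a sketch: `run_on_cpu` is a name I made up, it assumes at least one positional argument is a tensor, and it assumes the wrapped op returns a single tensor.

```python
import torch


def run_on_cpu(op, *args, **kwargs):
    """Call a CPU-only `op`, then move the result back to the input device.

    Hypothetical helper: assumes at least one positional argument is a
    tensor and that `op` returns a single tensor.
    """
    # Remember where the first tensor argument lives (e.g. an XPU).
    device = next(a.device for a in args if torch.is_tensor(a))

    def to_cpu(value):
        # Copy tensors to the CPU; leave ints, flags, etc. untouched.
        return value.cpu() if torch.is_tensor(value) else value

    cpu_args = [to_cpu(a) for a in args]
    cpu_kwargs = {name: to_cpu(v) for name, v in kwargs.items()}
    out = op(*cpu_args, **cpu_kwargs)
    return out.to(device)
```

With that in place, something like `run_on_cpu(torch_cluster.knn_graph, pos, k=16, batch=batch)` would compute the edges on the CPU and hand them back on the XPU. The cost is a device-to-host copy on every call, so it is a workaround rather than a fix.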
Those `pip` commands should work. If you are super paranoid, you can tack on `--no-cache-dir` to make sure you're not using a cached version, and also `--no-binary :all:` to make sure it's built from source. If you have issues, I'd suggest you step through those :)
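If it helps while stepping through the install, a quick way to see what actually ended up in the environment is to query the installed distributions from Python. This is a small sketch using only the standard library; the helper name `installed_versions` and the list of package names are my own choices for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            report[name] = None
    return report


if __name__ == "__main__":
    names = ["torch", "torch-geometric", "torch-scatter", "torch-cluster"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver or 'not installed'}")
```

This only tells you which versions are present, not which device kernels they were built with.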
I've brought up `torch_cluster` support internally, along with some things we can potentially do, but it will require some time. I'll send you an email separately.
@chaitjo do you think I can close this issue?
Yes please.
On Wed, 29 May 2024 at 4:30 PM, Kelvin Lee @.***> wrote:
@chaitjo do you think I can close this issue? #198 (https://github.com/IntelLabs/matsciml/pull/198) updated the README, and I think it should be pretty complete, within the bounds of the current status of broader framework support.
Feature/behavior summary
I'm trying to get PyG to install and work well with Intel XPUs, and was hoping to use this repository as a reference. At present, I see that PyG is not installed by default, and there are no instructions for setting it up with XPUs.
Request attributes
Related issues
No response
Solution description
Unknown.
Additional notes
At present, working with a different repository (https://github.com/a-r-j/ProteinWorkshop), I've been trying to integrate your code for the XPU as a new accelerator in PyTorch Lightning: https://github.com/IntelLabs/matsciml/blob/main/matsciml/lightning/xpu.py.
So far, I'm able to get my trainer to identify the XPU as a device, but it seems like some torch_cluster operations are not compatible with tensors stored on XPUs. I would like to perform torch_cluster operations such as kNN graph creation on XPU tensors so that I can do data processing in a batched manner or on the fly, as opposed to on the CPU.
Here is a minimal example which fails:
The resulting error is
RuntimeError: x.device().is_cpu() INTERNAL ASSERT FAILED at "csrc/cpu/knn_cpu.cpp":12, please report a bug to PyTorch. x must be CPU tensor
And here's a longer trace from the ProteinWorkshop codebase, which probably won't make any sense to MatSciML maintainers.