dmlc / dgl

Python package built to ease deep learning on graph, on top of existing DL frameworks.
http://dgl.ai
Apache License 2.0

Unable to install dgl==2.2.1 with pip #7433

Open YuanbinLiu opened 1 month ago

YuanbinLiu commented 1 month ago

Hi DGL Team,

I'm experiencing an issue when trying to install version 2.2.1 of DGL using pip. Here are the details of the issue:

ERROR: Could not find a version that satisfies the requirement dgl==2.2.1 (from versions: 1.1.0, 1.1.1, 1.1.2, 1.1.2.post1, 1.1.3, 2.0.0, 2.1.0, 2.2.0)
ERROR: No matching distribution found for dgl==2.2.1
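Because the two reporters in this thread see different version lists, it can help to compare them mechanically. The following is a minimal sketch, not part of DGL or pip: `available_versions` is a hypothetical helper that pulls the `from versions:` list out of pip's error text, assuming the message format quoted above.

```python
import re


def available_versions(pip_error: str) -> list[str]:
    """Extract the versions pip reports in its '(from versions: ...)' message.

    Hypothetical helper for illustration; it only parses text that has
    already been captured from pip's output.
    """
    match = re.search(r"\(from versions: ([^)]*)\)", pip_error)
    if match is None:
        return []
    items = [v.strip() for v in match.group(1).split(",") if v.strip()]
    # pip prints "none" when the index offers no candidates at all.
    return [] if items == ["none"] else items


message = (
    "ERROR: Could not find a version that satisfies the requirement dgl==2.2.1 "
    "(from versions: 1.1.0, 1.1.1, 1.1.2, 1.1.2.post1, 1.1.3, 2.0.0, 2.1.0, 2.2.0)"
)
versions = available_versions(message)
print("2.2.1" in versions)  # False: this index does not offer 2.2.1
print(versions[-1])         # 2.2.0
```

An empty or shorter list than expected suggests that the index or mirror pip is pointed at differs from the one the maintainers publish to.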

The version of pip is 24.0.

mfbalin commented 1 month ago

What command did you use to install dgl?

harrylee999 commented 1 month ago

Mine shows this: (from versions: 0.1.0, 0.1.2, 0.1.3, 0.6.0, 0.6.0.post1, 0.6.1, 0.9.0, 0.9.1, 1.0.0, 1.0.1, 1.0.4, 1.1.0, 1.1.1, 1.1.2, 1.1.3, 2.1.0). Why do you have 2.2.0? What is your PyPI source?

YuanbinLiu commented 1 month ago

I am using pip install dgl==2.2.1 @mfbalin

mfbalin commented 1 month ago

pip install dgl -f https://data.dgl.ai/wheels/torch-2.3/repo.html

mfbalin commented 1 month ago

https://www.dgl.ai/pages/start.html

Instructions are here.

YuanbinLiu commented 1 month ago

I used these instructions before, but what I got was still 2.2.0 (tested on macOS; it works well on Linux). And when I import dgl 2.2.0, I always get this error: ImportError: Cannot load Graphbolt C++ library. This is why I'd like to upgrade to 2.2.1.

harrylee999 commented 1 month ago

I have already installed 2.2.1, but I still get ImportError: Cannot load Graphbolt C++ library.

mfbalin commented 1 month ago

@Rhett-Ying What do you think the cause may be?

Rhett-Ying commented 1 month ago

@harrylee999 what's your platform? OS version? glibc version?

Rhett-Ying commented 1 month ago

@YuanbinLiu are you hitting the install issue on macOS? Please uninstall any previously installed versions first, then try this one: pip install dgl -f https://data.dgl.ai/wheels/repo.html

YuanbinLiu commented 1 month ago

With the command you suggested, 2.2.1 was installed, but I still hit the same error: ImportError: Cannot load Graphbolt C++ library.
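Since importing dgl fails here, the installed version cannot be read from `dgl.__version__`. A minimal sketch of an alternative, using the standard-library `importlib.metadata`, which reads the installed package metadata without importing the package (`installed_version` is a hypothetical helper name):

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Reads distribution metadata only, so it works even when importing
    the package itself raises an error.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed dgl version, or None when dgl is absent.
print(installed_version("dgl"))
```

This confirms which wheel pip actually left in the environment after an uninstall and reinstall.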

Rhett-Ying commented 1 month ago

What is your OS and OS version? Could you try building from source yourself by following https://docs.dgl.ai/install/index.html#install-from-source ?

BrunoLiegiBastonLiegi commented 1 month ago

I am experiencing the same problem, which seems similar to #7247. I am running Ubuntu 24.04 with glibc 2.39. I tried different versions of PyTorch (2.1.0, 2.2.0, 2.2.1, 2.3.0) with CUDA 12.1 and Python 3.10 and 3.12. I always install with the command suggested on the Get Started page. dgl apparently installs correctly, but simply trying to import it results in an error. In most cases the Cannot load Graphbolt C++ library error is triggered by an OSError: libnvrtc.so.12: cannot open shared object file: No such file or directory, even though cuda-nvrtc is installed. With torch 2.1.0, however, I get an OSError: libcusparse.so.12: cannot open shared object file: No such file or directory.

EDIT: yes, I can confirm that up to torch 2.1.2 I get the cusparse error, whereas from torch 2.2.0 on I get the nvrtc error.
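The two OSError messages above come from the dynamic loader, so they can be reproduced without dgl. A minimal sketch, assuming a Linux system and using only the standard-library `ctypes` (`check_shared_libraries` is a hypothetical helper name), that asks the loader for each library named in this report:

```python
import ctypes


def check_shared_libraries(names):
    """Try to load each shared library by name.

    Returns a dict mapping each name to None on success, or to the
    loader's error message when the library cannot be opened.
    """
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = None
        except OSError as exc:
            results[name] = str(exc)
    return results


# Library names taken from the error messages in this report.
report = check_shared_libraries(["libnvrtc.so.12", "libcusparse.so.12"])
for lib, error in report.items():
    print(lib, "OK" if error is None else f"FAILED: {error}")
```

A library that fails here is not on the loader's search path, regardless of whether a pip package containing it is installed.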

harrylee999 commented 1 month ago

Hey, it works fine for me when I use torch-2.3.0, dgl-2.2.1 and cuda-12.1.

Rhett-Ying commented 1 month ago

@BrunoLiegiBastonLiegi what's your installed cuda toolkit version?

BrunoLiegiBastonLiegi commented 1 month ago

I tried with torch 2.3.0 as well but still got the same nvrtc error; I'll try again just to double-check, though. I did not install the CUDA toolkit manually myself; I think the needed CUDA packages are installed indirectly by PyTorch. Anyway, this is what I have in the environment:

nvidia-cublas-cu12       12.1.3.1
nvidia-cuda-cupti-cu12   12.1.105
nvidia-cuda-nvrtc-cu12   12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12        8.9.2.26
nvidia-cufft-cu12        11.0.2.54
nvidia-curand-cu12       10.3.2.106
nvidia-cusolver-cu12     11.4.5.107
nvidia-cusparse-cu12     12.1.0.106
nvidia-nccl-cu12         2.19.3
nvidia-nvjitlink-cu12    12.1.105
nvidia-nvtx-cu12         12.1.105

Rhett-Ying commented 1 month ago

@BrunoLiegiBastonLiegi Is it possible for you to install cuda toolkit from https://developer.nvidia.com/cuda-12-1-0-download-archive instead of conda?

BrunoLiegiBastonLiegi commented 1 month ago

Thanks @Rhett-Ying, and sorry for the delay, I had a very hectic week. Manually installing the CUDA toolkit seems to have fixed it. I believe there is a problem with the latest Ubuntu, though: installing the toolkit system-wide through apt initially failed because of a package, nsight-systems-2023.1.2, that depends on libtinfo5, which has no candidate in Ubuntu 24.04 (it provides only libtinfo6). Manually downloading and installing the .deb worked, though, and got dgl running with torch 2.3.0.

Now I have to figure out how to deal with this on a cluster where I don't have root privileges...
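One avenue that needs no root access, untested in this thread, is to point the loader at the CUDA libraries that pip already placed in the environment. The sketch below rests on an assumption: that the nvidia-*-cu12 wheels listed earlier unpack their shared objects under `<site-packages>/nvidia/<component>/lib`. `find_nvidia_lib_dirs` is a hypothetical helper, and whether extending LD_LIBRARY_PATH this way resolves the Graphbolt import error is not confirmed here.

```python
import glob
import os
import sys


def find_nvidia_lib_dirs(search_roots=None):
    """Return directories matching <root>/nvidia/*/lib that contain shared objects.

    Assumption: pip-installed NVIDIA wheels use this layout. By default
    every existing directory on sys.path is searched.
    """
    if search_roots is None:
        search_roots = [p for p in sys.path if os.path.isdir(p)]
    found = []
    for root in search_roots:
        pattern = os.path.join(root, "nvidia", "*", "lib")
        for lib_dir in sorted(glob.glob(pattern)):
            has_libs = bool(glob.glob(os.path.join(lib_dir, "*.so*")))
            if has_libs and lib_dir not in found:
                found.append(lib_dir)
    return found


dirs = find_nvidia_lib_dirs()
if dirs:
    # Print a shell line that could be pasted before starting Python.
    print("export LD_LIBRARY_PATH=" + os.pathsep.join(dirs) + ":$LD_LIBRARY_PATH")
else:
    print("No pip-installed NVIDIA library directories found")
```

The variable has to be set before the Python process starts, because the loader reads it at process startup.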

github-actions[bot] commented 4 days ago

This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you