dmlc / dgl

Python package built to ease deep learning on graph, on top of existing DL frameworks.
http://dgl.ai
Apache License 2.0

Failed to import graphbolt due to `libgraphbolt_pytorch_2.3.0.post300.so` #7438

Open Rhett-Ying opened 6 months ago

Rhett-Ying commented 6 months ago

🐛 Bug

Somehow `post300` is appended to the target .so name, so the lookup fails because the expected name is libgraphbolt_pytorch_2.3.0.so. See more details in https://discuss.dgl.ai/t/filenotfounderror-cannot-find-dgl-c-graphbolt-library-in-dgl-2-2-1-and-pytorch-2-3-0/4419

To Reproduce

Steps to reproduce the behavior:

  1. Install the DGL package for torch 2.3.0 with conda.

Expected behavior

Environment

Additional context

alexbarghi-nv commented 5 months ago

I'm also seeing this bug - any update?

Rhett-Ying commented 5 months ago

This issue is not handled yet.

alexbarghi-nv commented 5 months ago

I think I've partly figured out the source of this bug - I tried installing again with PyTorch from the pytorch channel instead of conda-forge and that resolved the issue. There's probably a different version string or something similar in the conda-forge distribution which is causing this.

Davidxswang commented 5 months ago

I am also seeing this bug. I installed DGL from pip (2.2.1+cu121) for torch 2.2.2+cu121. I also tried torch 2.3.x, and it does not work either.

Rhett-Ying commented 5 months ago

Did you install torch with pip or conda from conda-forge?

Rhett-Ying commented 5 months ago

This seems to be a common issue. We could add a regex check when loading graphbolt.
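
A minimal sketch of what such a check might look like, assuming the shipped library is named after the bare major.minor.patch version (as in libgraphbolt_pytorch_2.3.0.so). The helper name `graphbolt_lib_name` is hypothetical and this is not DGL's actual loader code. It only illustrates stripping suffixes such as `.post300` or `+cu121` from the version string before building the file name:

```python
import re


def graphbolt_lib_name(torch_version: str) -> str:
    """Hypothetical helper: build the library file name from a torch version string."""
    # Keep only the leading major.minor.patch and drop any suffix,
    # e.g. ".post300" or a local label such as "+cu121".
    match = re.match(r"\d+\.\d+\.\d+", torch_version)
    if match is None:
        raise ValueError(f"Unrecognized torch version: {torch_version!r}")
    return f"libgraphbolt_pytorch_{match.group(0)}.so"


if __name__ == "__main__":
    for version in ("2.3.0", "2.3.0.post300", "2.2.2+cu121"):
        print(version, "->", graphbolt_lib_name(version))
```

With this normalization, "2.3.0" and "2.3.0.post300" map to the same file name, so a suffixed version string would no longer change which file is looked up.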

Davidxswang commented 5 months ago

I installed torch with pip.

Rhett-Ying commented 5 months ago

@Davidxswang could you share your pip install command? Also, what versions do `pip list | grep torch` and `torch.__version__` report in your case?

Davidxswang commented 5 months ago

pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cu121

pytorch-lightning            2.2.1
torch                        2.2.2+cu121
torch_geometric              2.5.2
torchaudio                   2.2.2+cu121
torchdata                    0.7.1
torchmetrics                 1.3.2
torchvision                  0.17.2+cu121
In [2]: torch.__version__
Out[2]: '2.2.2+cu121'

Silhouettes-of-U commented 5 months ago

I got a similar error, but without post300 appended:

  File "<stdin>", line 1, in <module>
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/__init__.py", line 16, in <module>
    from . import (
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/dataloading/__init__.py", line 13, in <module>
    from .dataloader import *
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/dataloading/dataloader.py", line 27, in <module>
    from ..distributed import DistGraph
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/distributed/__init__.py", line 5, in <module>
    from .dist_graph import DistGraph, DistGraphServer, edge_split, node_split
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/distributed/dist_graph.py", line 11, in <module>
    from .. import backend as F, graphbolt as gb, heterograph_index
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/graphbolt/__init__.py", line 36, in <module>
    load_graphbolt()
  File "/root/miniconda3/lib/python3.8/site-packages/dgl/graphbolt/__init__.py", line 26, in load_graphbolt
    raise FileNotFoundError(
FileNotFoundError: Cannot find DGL C++ graphbolt library at /root/miniconda3/lib/python3.8/site-packages/dgl/graphbolt/libgraphbolt_pytorch_2.3.1.so

rfrs commented 5 months ago

We are also facing the same issue with version 2.3.1. Any progress? Thank you.

Rhett-Ying commented 5 months ago

This is expected, as DGL 2.2 does not support torch 2.3.1 yet. The latest supported torch version is 2.3.0.

Rhett-Ying commented 5 months ago

@rfrs This is expected as DGL 2.2 does not support torch 2.3.1 yet. The latest supported torch version is 2.3.0

jbm-composer commented 5 months ago

I installed using the 2.3.x version, cuda 12.1, conda (from the DGL website), with torch 2.3.0, and I'm seeing:

>>> import dgl
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/james/src/jbm/dgl/python/dgl/__init__.py", line 16, in <module>
    from . import (
  File "/home/james/src/jbm/dgl/python/dgl/dataloading/__init__.py", line 13, in <module>
    from .dataloader import *
  File "/home/james/src/jbm/dgl/python/dgl/dataloading/dataloader.py", line 27, in <module>
    from ..distributed import DistGraph
  File "/home/james/src/jbm/dgl/python/dgl/distributed/__init__.py", line 5, in <module>
    from .dist_graph import DistGraph, DistGraphServer, edge_split, node_split
  File "/home/james/src/jbm/dgl/python/dgl/distributed/dist_graph.py", line 12, in <module>
    from .. import backend as F, graphbolt as gb, heterograph_index
  File "/home/james/src/jbm/dgl/python/dgl/graphbolt/__init__.py", line 36, in <module>
    load_graphbolt()
  File "/home/james/src/jbm/dgl/python/dgl/graphbolt/__init__.py", line 26, in load_graphbolt
    raise FileNotFoundError(
FileNotFoundError: Cannot find DGL C++ graphbolt library at /home/james/src/jbm/dgl/python/dgl/graphbolt/libgraphbolt_pytorch_2.3.0.so

I tried with conda sourcing from both conda-forge and pytorch, btw.

I also (just) tried uninstalling 2.3.x and installing 2.2.x, but got the same error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/james/src/jbm/dgl/python/dgl/__init__.py", line 16, in <module>
    from . import (
  File "/home/james/src/jbm/dgl/python/dgl/dataloading/__init__.py", line 13, in <module>
    from .dataloader import *
  File "/home/james/src/jbm/dgl/python/dgl/dataloading/dataloader.py", line 27, in <module>
    from ..distributed import DistGraph
  File "/home/james/src/jbm/dgl/python/dgl/distributed/__init__.py", line 5, in <module>
    from .dist_graph import DistGraph, DistGraphServer, edge_split, node_split
  File "/home/james/src/jbm/dgl/python/dgl/distributed/dist_graph.py", line 12, in <module>
    from .. import backend as F, graphbolt as gb, heterograph_index
  File "/home/james/src/jbm/dgl/python/dgl/graphbolt/__init__.py", line 36, in <module>
    load_graphbolt()
  File "/home/james/src/jbm/dgl/python/dgl/graphbolt/__init__.py", line 26, in load_graphbolt
    raise FileNotFoundError(
FileNotFoundError: Cannot find DGL C++ graphbolt library at /home/james/src/jbm/dgl/python/dgl/graphbolt/libgraphbolt_pytorch_2.3.0.so

Livvi commented 4 months ago

Is there any news on this? I'm also currently failing to set up the dgl library and always get the same DGL C++ graphbolt library error shown above.

Rhett-Ying commented 4 months ago

Could you list the files that exist under //dgl/graphbolt/ after installation?
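
One way to collect that listing without importing dgl (the import itself is what fails) is sketched below. It assumes the libraries sit in a `graphbolt` directory inside the installed package and follow the `libgraphbolt_pytorch_*` naming seen in the error messages. `list_graphbolt_libs` is a hypothetical helper name, not part of DGL:

```python
import importlib.util
from pathlib import Path
from typing import List


def list_graphbolt_libs(package: str = "dgl") -> List[str]:
    """Hypothetical helper: list graphbolt library files shipped with a package."""
    # find_spec locates a top-level package on disk without executing it,
    # so this works even when "import dgl" raises FileNotFoundError.
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return []
    names = []
    for location in spec.submodule_search_locations:
        graphbolt_dir = Path(location, "graphbolt")
        # glob yields nothing if the directory does not exist.
        names.extend(p.name for p in graphbolt_dir.glob("libgraphbolt_pytorch_*"))
    return sorted(names)


if __name__ == "__main__":
    print(list_graphbolt_libs())
```

Comparing the printed names with the file name in the FileNotFoundError message shows whether the library for the installed torch version was shipped at all or only named differently.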

Rhett-Ying commented 4 months ago

Is *.post300 a post-release version with additional bug fixes? https://discuss.pytorch.org/t/why-torch-version-returns-2-3-1-post300/206486

LZVSDY commented 1 month ago

Install by following https://www.dgl.ai/pages/start.html. The versions of CUDA, PyTorch, and DGL must be compatible with each other.