Open wehos opened 6 months ago
Installing dgl does not produce a working setup on HPCC systems. After installation, importing dance raises a dependency error:
from dance.datasets.multimodality import ModalityMatchingDataset
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/dance/datasets/__init__.py", line 1, in <module>
    from dance.datasets.multimodality import JointEmbeddingNIPSDataset, ModalityMatchingDataset, ModalityPredictionDataset
  File "/dance/datasets/multimodality.py", line 14, in <module>
    from dance.datasets.base import BaseDataset
  File "/dance/datasets/base.py", line 9, in <module>
    from dance.transforms.base import BaseTransform
  File "/dance/transforms/__init__.py", line 1, in <module>
    from dance.transforms import graph
  File "/dance/transforms/graph/__init__.py", line 1, in <module>
    from dance.transforms.graph.cell_feature_graph import CellFeatureBipartiteGraph, CellFeatureGraph, PCACellFeatureGraph
  File "/dance/transforms/graph/cell_feature_graph.py", line 1, in <module>
    import dgl
  File "/dance/lib/python3.11/site-packages/dgl/__init__.py", line 14, in <module>
    from .backend import backend_name, load_backend # usort: skip
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/dance/lib/python3.11/site-packages/dgl/backend/__init__.py", line 122, in <module>
    load_backend(get_preferred_backend())
  File "/dance/lib/python3.11/site-packages/dgl/backend/__init__.py", line 51, in load_backend
    from .._ffi.base import load_tensor_adapter # imports DGL C library
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/dance/lib/python3.11/site-packages/dgl/_ffi/base.py", line 50, in <module>
    _LIB, _LIB_NAME, _DIR_NAME = _load_lib()
                                 ^^^^^^^^^^^
  File "/dance/lib/python3.11/site-packages/dgl/_ffi/base.py", line 39, in _load_lib
    lib = ctypes.CDLL(lib_path[0])
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "//dance/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: libcusparse.so.11: cannot open shared object file: No such file or directory
The error is reproducible simply by running `from dance.datasets.multimodality import ModalityMatchingDataset`.
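To separate the linker problem from dance and dgl themselves, it may help to check directly whether the shared library named in the traceback can be opened. The following is a minimal sketch, not part of dance or dgl; the helper name `can_load` is hypothetical, and it only relies on `ctypes.CDLL` raising `OSError` when the library cannot be found.

```python
import ctypes


def can_load(lib_name: str) -> bool:
    """Return True if the dynamic linker can open the given shared library."""
    try:
        ctypes.CDLL(lib_name)
    except OSError:
        # Same failure mode as in the traceback: the library is not on the search path.
        return False
    return True


if __name__ == "__main__":
    name = "libcusparse.so.11"  # library named in the OSError above
    status = "found" if can_load(name) else "NOT found"
    print(f"{name}: {status}")
```

If this reports that the library is not found, the import of dgl is expected to fail in the same way until the library search path is fixed.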
Running `export LD_LIBRARY_PATH=~/anaconda3/lib` (replacing the path with the lib directory that contains the missing shared library) solves the issue above.
We need to remind HPCC users about this. HPC users may need to install cudatoolkit via conda (`conda install cudatoolkit=11.8 -c pytorch`) and set the `LD_LIBRARY_PATH` environment variable.
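As a rough aid for picking the right directory, the sketch below looks for the library in the `lib` folder of a few candidate prefixes and prints an `export` line for the dynamic linker's `LD_LIBRARY_PATH` variable. The helper names `find_lib_dirs` and `prepend_path` are hypothetical, and the candidate prefixes (`CONDA_PREFIX` and `~/anaconda3`) are assumptions that will differ between clusters.

```python
import os
from pathlib import Path


def find_lib_dirs(lib_name, roots):
    """Return the <root>/lib directories that contain lib_name, in order."""
    hits = []
    for root in roots:
        lib_dir = Path(root).expanduser() / "lib"
        if (lib_dir / lib_name).exists() and str(lib_dir) not in hits:
            hits.append(str(lib_dir))
    return hits


def prepend_path(new_dir, current):
    """Prepend new_dir to a colon-separated search path, dropping duplicates."""
    parts = [p for p in current.split(":") if p and p != new_dir]
    return ":".join([new_dir, *parts])


if __name__ == "__main__":
    # Assumed candidate prefixes; adjust to the conda installation on your system.
    roots = [r for r in (os.environ.get("CONDA_PREFIX", ""), "~/anaconda3") if r]
    dirs = find_lib_dirs("libcusparse.so.11", roots)
    if dirs:
        value = prepend_path(dirs[0], os.environ.get("LD_LIBRARY_PATH", ""))
        print(f'export LD_LIBRARY_PATH="{value}"')
    else:
        print("libcusparse.so.11 not found; cudatoolkit may need to be installed via conda")
```

The script prints the `export` line instead of modifying `os.environ`, because on Linux the dynamic linker reads the variable when the process starts, so it has to be set in the shell before Python is launched.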
https://github.com/OmicsML/dance/blob/1d94be91625a352ad8510c16d5d23b9ce5e02b53/install.sh#L58
Fixed by latest PR
`torchvision=` should be replaced with `torchvision==`, otherwise 'install.sh' raises an error.