DeepGraphLearning / graphvite

GraphVite: A General and High-performance Graph Embedding System
https://graphvite.io
Apache License 2.0

undefined symbol: _ZN3fLS13FLAGS_log_dirE #23

Closed husong998 closed 4 years ago

husong998 commented 5 years ago

Hi,

While installing from source, I got an error when running python setup.py install.

The message is as follows:

Traceback (most recent call last):
  File "setup.py", line 22, in <module>
    from graphvite import __version__, lib_path, lib_file
  File "/home/husong/graphvite/python/graphvite/__init__.py", line 36, in <module>
    lib = imp.load_dynamic("libgraphvite", lib_file)
ImportError: /home/husong/graphvite/python/graphvite/../../build/lib/libgraphvite.so: undefined symbol: _ZN3fLS13FLAGS_log_dirE

Any idea why this is happening? How can I get around this problem?

KiddoZhu commented 5 years ago

Maybe the library can't find gflags at runtime. Could you try to export the paths of gflags and glog into LD_LIBRARY_PATH? These paths can be found in the output of cmake.

husong998 commented 5 years ago

Thanks for the prompt answer! But my LD_LIBRARY_PATH environment variable has already included the path to libglog.so, as can be seen in the terminal output:

(base) [husong@xtrap100 graphvite]$ env |grep PATH
LD_LIBRARY_PATH=/opt/gcc-4.9/lib64::/usr/local/lib:/usr/local/cuda/lib64:/opt/lib/
PATH=/home/husong/miniconda2/bin:/home/husong/miniconda2/condabin:/opt/gcc-4.9/bin:/usr/local/cuda/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/openssl/bin:/usr/local/php/bin:/usr/local/mysql/bin:/opt/bin:/opt/bin:/home/husong/.local/bin:/home/husong/bin
(base) [husong@xtrap100 graphvite]$ whereis libglog
libglog: /usr/local/lib/libglog.la /usr/local/lib/libglog.so /usr/local/lib/libglog.a

Is there any other possible reason for the error? Or could you share more details about the environment in which you built it successfully?

KiddoZhu commented 5 years ago

My gflags and glog are installed by conda and I export the path of env/env-name/lib.

I guess that you might have multiple versions of gflags or glog installed. Check the output of cmake to see which one it uses. Because cmake doesn't use LD_LIBRARY_PATH for its search, it sometimes picks a different copy from the one found at runtime.
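
One way to check for multiple copies is to list every libglog and libgflags shared object in the directories the loader and cmake are likely to look at. The following is a minimal sketch, not part of GraphVite; the function name and the extra directories in the demo are assumptions and should be adjusted to the machine at hand.

```python
import os
from pathlib import Path


def find_library_copies(stems, search_dirs):
    """Map each library stem (e.g. 'libglog') to every matching .so file found."""
    found = {stem: [] for stem in stems}
    seen_dirs = set()
    for raw in search_dirs:
        if not raw:  # skip empty entries such as the "::" in a path list
            continue
        directory = Path(raw)
        key = os.path.realpath(raw)
        if key in seen_dirs or not directory.is_dir():
            continue  # already scanned, or not a directory
        seen_dirs.add(key)
        for stem in stems:
            for candidate in sorted(directory.glob(stem + ".so*")):
                found[stem].append(str(candidate))
    return found


if __name__ == "__main__":
    dirs = os.environ.get("LD_LIBRARY_PATH", "").split(":")
    # Assumed extra locations (taken from paths mentioned in this thread).
    dirs += ["/usr/local/lib", "/usr/lib/x86_64-linux-gnu"]
    if os.environ.get("CONDA_PREFIX"):
        dirs.append(os.path.join(os.environ["CONDA_PREFIX"], "lib"))
    for stem, paths in find_library_copies(["libglog", "libgflags"], dirs).items():
        print(stem, "->", paths or "not found")
```

If more than one directory shows up for the same library, the copy cmake linked against and the copy loaded at runtime may differ.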

shiqiaodeng commented 4 years ago

I have the same problem. I exported the path of env/env-name/lib, but that didn't solve it.

(graphvite) [deng@node2 python]$ echo $LD_LIBRARY_PATH
/home/deng/usr/local/faiss_gpu/lib:/home/deng/usr/local/OpenBLAS_install/lib:/home/deng/usr/local/lapack/lib:/home/deng/anaconda3/envs/graphvite/lib/:/home/deng/usr/local/faiss_gpu/lib:/home/deng/usr/local/OpenBLAS_install/lib:/home/deng/usr/local/lapack/lib:/home/deng/anaconda3/envs/graphvite/lib/:/home/deng/usr/local/faiss_gpu/lib:/home/deng/usr/local/OpenBLAS_install/lib:/home/deng/usr/local/lapack/lib:/home/deng/anaconda3/envs/graphvite/lib/:/usr/local/cuda/lib64:/home/deng/anaconda3/envs/graphvite/lib:/home/deng/anaconda3/envs/graphvite/lib:/home/deng/anaconda3/envs/graphvite/lib

Then I used the nm command to view the undefined symbols:

(graphvite) [deng@node2 python]$ nm -u /home/deng/project/graphvite-0.1.0/build/lib/libgraphvite.so

                 U _ZN3fLB16FLAGS_log_prefixE
                 U _ZN3fLB17FLAGS_logtostderrE
                 U _ZN3fLI17FLAGS_minloglevelE
                 U _ZN3fLS13FLAGS_log_dirE
                 U _ZN5faiss3gpu14GpuIndexFlatL2C1EPNS0_12GpuResourcesEiNS0_18GpuIndexFlatConfigE
                 U _ZN5faiss3gpu20StandardGpuResourcesC1Ev
                 U _ZN5faiss3gpu20StandardGpuResourcesD1Ev
                 U _ZN6google10LogMessage6streamEv
                 U _ZN6google10LogMessage9SendToLogEv
                 U _ZN6google10LogMessageC1EPKci
                 U _ZN6google10LogMessageC1EPKcii
                 U _ZN6google10LogMessageC1EPKciiiMS0_FvvE
                 U _ZN6google10LogMessageD1Ev
                 U _ZN6google15LogMessageFatalC1EPKci
                 U _ZN6google15LogMessageFatalD1Ev
                 U _ZN6google17InitGoogleLoggingEPKc
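
The mangled names above are easier to read once decoded. The sketch below handles only the simplest Itanium C++ ABI case (a nested name made of length-prefixed identifiers); it is an illustration, not a replacement for c++filt, and the function name is made up for this example. The fLS, fLB and fLI namespaces are where gflags-style macros place string, bool and int32 flags.

```python
import re


def demangle_nested_name(symbol):
    """Decode simple nested names such as _ZN3fLS13FLAGS_log_dirE.

    Returns the qualified name, or None when the symbol uses features this
    sketch does not handle (constructors, templates, substitutions, ...).
    """
    if not symbol.startswith("_ZN"):
        return None
    pos, parts = 3, []
    while pos < len(symbol) and symbol[pos] != "E":
        match = re.match(r"\d+", symbol[pos:])
        if match is None:
            return None  # e.g. "C1" (a constructor) is not handled here
        length = int(match.group())
        start = pos + match.end()
        name = symbol[start:start + length]
        if len(name) != length:
            return None  # truncated symbol
        parts.append(name)
        pos = start + length
    if pos >= len(symbol) or not parts:
        return None  # no terminating "E"
    # Anything after "E" (parameter types of a function) is ignored.
    return "::".join(parts)


for sym in ["_ZN3fLS13FLAGS_log_dirE",
            "_ZN3fLB17FLAGS_logtostderrE",
            "_ZN6google10LogMessage6streamEv",
            "_ZN6google10LogMessageC1EPKci"]:
    print(sym, "->", demangle_nested_name(sym))
```

Note that nm -u lists every symbol the library expects another library to provide, so a long list is normal; only the symbol named in the ImportError failed to resolve.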

Could you give me some advice? Thanks! @KiddoZhu

KiddoZhu commented 4 years ago

Could you check the runtime path in libgraphvite.so? My output is something like

readelf -d libgraphvite.so | grep PATH
0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN:/path/to/env/lib]

These are the paths that take effect when the binary is loaded. $ORIGIN stands for the directory that contains libgraphvite.so.

If you have a different path, you can change it with chrpath -r '$ORIGIN:/path/to/env/lib' libgraphvite.so (the single quotes keep the shell from expanding $ORIGIN).

Tell me if it helps.
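
To compare search paths across machines, the readelf output can be parsed into a list of directories. This is a small sketch under the assumption that the output has the same shape as the lines quoted in this thread; the function name is hypothetical. It also records whether the entry is RPATH or RUNPATH, which matters because the glibc loader consults DT_RPATH before LD_LIBRARY_PATH but DT_RUNPATH after it.

```python
import re

# Matches lines such as:
#  0x000000000000001d (RUNPATH)   Library runpath: [$ORIGIN:/path/to/env/lib]
_DYN_PATH = re.compile(r"\((RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*)\]")


def parse_search_paths(readelf_output, origin):
    """Return [(tag, [directories])] with $ORIGIN replaced by `origin`."""
    entries = []
    for line in readelf_output.splitlines():
        match = _DYN_PATH.search(line)
        if match is None:
            continue
        tag, raw = match.groups()
        paths = [
            p.replace("${ORIGIN}", origin).replace("$ORIGIN", origin)
            for p in raw.split(":")
            if p
        ]
        entries.append((tag, paths))
    return entries


sample = " 0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN:/path/to/env/lib]"
for tag, paths in parse_search_paths(sample, "/path/to/build/lib"):
    print(tag, paths)
```

Feeding it the output of readelf -d libgraphvite.so, with `origin` set to the directory holding the library, shows exactly which directories the loader will try.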

shiqiaodeng commented 4 years ago

Thanks for your reply.

(graphvite) [deng@node2 lib]$ readelf -d libgraphvite.so | grep PATH
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN:/home/deng/anaconda3/envs/graphvite/lib:/home/deng/usr/local/faiss_gpu]

The path does not seem to be wrong, but it still didn't help. Could there be another cause? @KiddoZhu

KiddoZhu commented 4 years ago

I am not sure how to solve this problem.

Maybe you can try adding soft links to libglog.so, libgflags.so and libfaiss.so in build/lib? The soft link to faiss should have already been generated automatically.
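
The soft links could be created by hand with ln -s; a scripted version might look like the sketch below. The function name is hypothetical, and the paths in the usage comment are placeholders for wherever the libraries actually live.

```python
import os


def link_libraries(lib_paths, target_dir):
    """Create a symlink in target_dir for each library; skip names that already exist."""
    created = []
    os.makedirs(target_dir, exist_ok=True)
    for src in lib_paths:
        dst = os.path.join(target_dir, os.path.basename(src))
        if os.path.lexists(dst):  # also true for an existing (even dangling) symlink
            continue
        os.symlink(os.path.abspath(src), dst)
        created.append(dst)
    return created


# Example usage (placeholder paths, adjust before running):
# link_libraries(["/path/to/env/lib/libglog.so", "/path/to/env/lib/libgflags.so"],
#                "build/lib")
```

Because the RPATH/RUNPATH of libgraphvite.so contains $ORIGIN, libraries linked into build/lib sit in a directory the loader already searches.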

shiqiaodeng commented 4 years ago

Thanks for your reply! I tried a variety of methods but couldn't solve the problem. I plan to work on other things first and come back to this issue later. If I solve it, I will share my method right away. Here are my two blog posts on installing graphvite: https://blog.csdn.net/Bridge3/article/details/101564323 https://blog.csdn.net/Bridge3/article/details/101938392

If you spot any errors, you are welcome to point them out! Thanks!

shiqiaodeng commented 4 years ago

Hi! I just solved this problem. Details as follows: open io.h (vi io.h), then comment out this part of the code:

void init_logging(int threshold = google::INFO, std::string dir = "", bool verbose = false) {
    static bool initialized = false;

    FLAGS_minloglevel = threshold;
    if (dir == "")
        FLAGS_logtostderr = true;
/**    else
        FLAGS_log_dir = dir;**/        // Comment out this part
    FLAGS_log_prefix = verbose;
    if (!initialized) {
        google::InitGoogleLogging("graphvite");
        initialized = true;
    }
}

We think this symbol (FLAGS_log_dir) is only used in io.h. Although we don't know what it does, we decided to try commenting it out. After that, the installation succeeded.

But we ran into a new problem when running the quick start. The error from graphvite baseline quick start:


loading graph from /home/deng/.graphvite/dataset/blogcatalog/blogcatalog_train.txt
0.00018387%
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Graph<uint32>
------------------ Graph -------------------
#vertex: 10312, #edge: 333983
as undirected: yes, normalization: no
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[time] GraphApplication.load: 0.282339 s
[time] GraphApplication.build: 0.598882 s
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
GraphSolver<128, float32, uint32>
----------------- Resource -----------------
#worker: 1, #sampler: 7, #partition: 1
tied weights: no, episode size: 500
gpu memory limit: 15.3 GiB
gpu memory cost: 51.5 MiB
----------------- Sampling -----------------
augmentation step: 2, shuffle base: 2
random walk length: 40
random walk batch size: 100
#negative: 1, negative sample exponent: 0.75
----------------- Training -----------------
model: LINE
optimizer: SGD
learning rate: 0.025, lr schedule: linear
weight decay: 0.005
#epoch: 2000, batch size: 100000
resume: no
positive reuse: 1, negative weight: 5
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Batch id: 0 / 6679
loss = 0
Batch id: 1000 / 6679
loss = 0.388631
Batch id: 2000 / 6679
loss = 0.383216
Batch id: 3000 / 6679
loss = 0.380334
Batch id: 4000 / 6679
loss = 0.376892
Batch id: 5000 / 6679
loss = 0.373871
Batch id: 6000 / 6679
loss = 0.372043
[time] GraphApplication.train: 11.2109 s
evaluate on node classification
effective labels: 14476 / 14476
OMP: Error #13: Assertion failure at z_Linux_util.cpp(2361).
OMP: Hint Please submit a bug report with this message, compile and run commands used, and machine configuration info including native compiler and operating system versions. Faster response will be obtained by including all program sources. For information on submitting this issue, please see http://www.intel.com/software/products/support/.

KiddoZhu commented 4 years ago

FLAGS_log_dir is a gflags-style flag defined by glog. If it is set, glog output is also written to files in that directory. According to your findings, I guess the error is due to a compatibility issue between different glog and gflags versions. You can remove the line if you don't need to log to files.
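
Whether a given libglog actually provides the symbol can be tested without rebuilding anything. The sketch below uses ctypes, which resolves names through dlsym(); the function name is an assumption made for this example. A True result may also come from one of the library's own dependencies, so treat it as a hint rather than proof.

```python
import ctypes


def exports_symbol(library, symbol):
    """Return True if `symbol` can be resolved through `library`.

    `library` is a path or soname; None means the current process.
    """
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        return False  # the library itself could not be loaded
    try:
        getattr(handle, symbol)  # ctypes looks the name up via dlsym()
    except AttributeError:
        return False
    return True


# Example usage (the path is the one reported earlier in this thread):
# exports_symbol("/usr/local/lib/libglog.so", "_ZN3fLS13FLAGS_log_dirE")
```

If the call returns False for the copy of libglog found at runtime but True for another copy, the two builds are not interchangeable.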

For the OMP problem, I never encountered that before. GraphVite doesn't directly depend on OMP, so it might be a bug when invoking PyTorch on multi-GPU for evaluation. This might be a fix for that.

Also, you may skip the evaluation stage with graphvite baseline quick start --no-eval.

shiqiaodeng commented 4 years ago

Thanks for your advice! I have solved the OMP problem. It was a bug caused by an unstable version. I reinstalled intel-openmp=2019.4, and the problem is solved.

nm-narasimha commented 4 years ago

This issue persists even after installing intel-openmp=2019.4. Can someone please consolidate these requirements and update the requirements file, or add the steps a user needs to run?

suamin commented 2 years ago

The issue persisted for me as well (like #89). I solved it as follows:

conda create -n graphvite python=3.7
conda activate graphvite
conda install -c pytorch faiss-gpu=1.6.3 cudatoolkit=10.1 # note this will also install openmp (cf. intel-openmp-2021.3.0)
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 -c pytorch
pip install pybind11
git clone https://github.com/DeepGraphLearning/graphvite.git
cd graphvite
nano conda/requirements.txt 

Update as follows (some requirements commented out):

# cmake
cmake >=3.12
gxx_linux-64 >=5.4
glog
gflags
#cudatoolkit >=9.2
#python
#pybind11

# make
#mkl >=2018

# run
#numpy >=1.11
pyyaml
conda-forge::easydict
six
future
imageio
psutil
scipy
matplotlib
#pytorch
#torchvision
nltk
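
The manual edit of conda/requirements.txt above can also be scripted. This is a sketch that assumes one package spec per line, as in the listing; the function name is hypothetical, and the package set mirrors the entries commented out above.

```python
import re


def comment_out(lines, packages):
    """Prefix '#' to every requirement line whose package name is in `packages`."""
    result = []
    for line in lines:
        stripped = line.strip()
        match = re.match(r"[A-Za-z0-9_.:\-]+", stripped)
        # Drop an optional "channel::" prefix to get the bare package name.
        name = match.group().split("::")[-1] if match else None
        if name in packages and not stripped.startswith("#"):
            result.append("#" + line)
        else:
            result.append(line)  # section comments and other packages stay as-is
    return result


skip = {"cudatoolkit", "python", "pybind11", "mkl", "numpy", "pytorch", "torchvision"}
print("\n".join(comment_out(["cmake >=3.12", "cudatoolkit >=9.2", "python", "pyyaml"], skip)))
```

Reading the file with splitlines(), passing the lines through comment_out and writing them back gives the same result as the nano edit.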

Install the remaining dependencies and then install graphvite from conda:

conda install -y --file conda/requirements.txt # might take a bit of time to complete
conda install -c milagraph graphvite

Please check if you see the following files:

ls /path/to/envs/graphvite/lib/python3.7/site-packages/graphvite
libfaiss.so  libgraphvite.so

At this point, open python:

>>> import graphvite
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/netscratch/samin/dev/miniconda3/envs/graphvite/lib/python3.7/site-packages/graphvite/__init__.py", line 36, in <module>
    lib = imp.load_dynamic("libgraphvite", lib_file)
  File "/netscratch/samin/dev/miniconda3/envs/graphvite/lib/python3.7/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: /netscratch/samin/dev/miniconda3/envs/graphvite/lib/python3.7/site-packages/graphvite/lib/libgraphvite.so: undefined symbol: _ZN6google10LogMessageC1EPKciiiMS0_FvvE

And check: echo $LD_LIBRARY_PATH

/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64

Find libglog:

whereis libglog
libglog: /usr/lib/x86_64-linux-gnu/libglog.so /usr/lib/x86_64-linux-gnu/libglog.a
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/x86_64-linux-gnu
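
Repeated exports tend to pile up duplicate entries, as in the LD_LIBRARY_PATH shown earlier in this thread. A small helper can build the new value without duplicates; this is a sketch with a hypothetical function name. Note that the dynamic loader reads LD_LIBRARY_PATH when a process starts, so the value has to be exported in the shell before launching Python; changing os.environ inside a running interpreter does not affect later imports in that process.

```python
def append_search_dir(current, new_dir):
    """Append new_dir to a colon-separated path list, dropping duplicates.

    Empty entries are dropped as well (the loader would treat them as the
    current directory, which is rarely intended).
    """
    seen, ordered = set(), []
    for entry in current.split(":") + [new_dir]:
        if not entry:
            continue
        key = entry.rstrip("/") or entry  # "/a/lib/" and "/a/lib" are the same directory
        if key in seen:
            continue
        seen.add(key)
        ordered.append(entry)
    return ":".join(ordered)


print(append_search_dir(
    "/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64",
    "/usr/lib/x86_64-linux-gnu",
))
```

The printed string can be pasted after export LD_LIBRARY_PATH= in the shell.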

Open python

>>> import graphvite
>>> import graphvite.application as gap

From command line:

graphvite baseline quick start

running baseline: demo/quick_start.yaml
downloading https://www.dropbox.com/s/cf21ouuzd563cqx/BlogCatalog-dataset.zip?dl=1 to BlogCatalog-dataset.zip
extracting BlogCatalog-dataset/data/edges.csv from BlogCatalog-dataset.zip to edges.csv
converting edges.csv to blogcatalog_graph.txt
splitting graph blogcatalog_graph.txt into blogcatalog_train.txt, blogcatalog_valid.txt, blogcatalog_test.txt
extracting BlogCatalog-dataset/data/group-edges.csv from BlogCatalog-dataset.zip to group-edges.csv
converting group-edges.csv to blogcatalog_label.txt
loading graph from /root/.graphvite/dataset/blogcatalog/blogcatalog_train.txt
0.00018755%
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Graph<uint32>
------------------ Graph -------------------
#vertex: 10308, #edge: 327429
as undirected: yes, normalization: no
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[time] GraphApplication.load: 0.0584445 s
[time] GraphApplication.build: 1.36667 s
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
GraphSolver<128, float32, uint32>
----------------- Resource -----------------
#worker: 1, #sampler: 7, #partition: 1
tied weights: no, episode size: 500
gpu memory limit: 23.5 GiB
gpu memory cost: 51.5 MiB
----------------- Sampling -----------------
augmentation step: 2, shuffle base: 2
random walk length: 40
random walk batch size: 100
#negative: 1, negative sample exponent: 0.75
----------------- Training -----------------
model: LINE
optimizer: SGD
learning rate: 0.025, lr schedule: linear
weight decay: 0.005
#epoch: 2000, batch size: 100000
resume: no
positive reuse: 1, negative weight: 5
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Batch id: 0 / 6548
loss = 0
Batch id: 1000 / 6548
loss = 0.387899
Batch id: 2000 / 6548
loss = 0.383534
Batch id: 3000 / 6548
loss = 0.37939
Batch id: 4000 / 6548
loss = 0.375876
Batch id: 5000 / 6548
loss = 0.372967
Batch id: 6000 / 6548
loss = 0.37103
[time] GraphApplication.train: 19.9742 s
------------- link prediction --------------
effective edges: 6646 / 6650
effective filter edges: 327429 / 327429
remaining edges: 6646 / 6646
AUC: 0.904191
[time] GraphApplication.evaluate: 26.4254 s
----------- node classification ------------
effective labels: 14472 / 14476
macro-F1@20%: 0.243524
micro-F1@20%: 0.392344
[time] GraphApplication.evaluate: 32.8746 s
save model to `line_blogcatalog.pkl`

I did not have to comment out any part of the code or change the RUNPATH. However, I noticed one issue at this point: I cannot import faiss and graphvite in the same session. I presume the conflict arises because both the faiss-gpu and graphvite conda packages ship their own libfaiss.so, and whichever module is imported first determines which copy gets loaded:

Python 3.7.11 (default, Jul 27 2021, 14:32:16)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import graphvite
>>> import faiss
Traceback (most recent call last):
  File "/netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/site-packages/faiss/swigfaiss_avx2.py", line 14, in swig_import_helper
    return importlib.import_module(mname)
  File "/netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 670, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 583, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1043, in create_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: /netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/site-packages/faiss/_swigfaiss.so: undefined symbol: _ZNK5faiss5Index6assignElPKfPll

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/site-packages/faiss/loader.py", line 31, in <module>
    from .swigfaiss_avx2 import *
  File "/netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/site-packages/faiss/swigfaiss_avx2.py", line 17, in <module>
    _swigfaiss = swig_import_helper()
  File "/netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/site-packages/faiss/swigfaiss_avx2.py", line 16, in swig_import_helper
    return importlib.import_module('_swigfaiss')
  File "/netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named '_swigfaiss'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/site-packages/faiss/swigfaiss.py", line 14, in swig_import_helper
    return importlib.import_module(mname)
  File "/netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 670, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 583, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1043, in create_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: /netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/site-packages/faiss/_swigfaiss.so: undefined symbol: _ZNK5faiss5Index6assignElPKfPll

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/site-packages/faiss/__init__.py", line 17, in <module>
    from .loader import *
  File "/netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/site-packages/faiss/loader.py", line 39, in <module>
    from .swigfaiss import *
  File "/netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/site-packages/faiss/swigfaiss.py", line 17, in <module>
    _swigfaiss = swig_import_helper()
  File "/netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/site-packages/faiss/swigfaiss.py", line 16, in swig_import_helper
    return importlib.import_module('_swigfaiss')
  File "/netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named '_swigfaiss'
>>>

And the other way around:

Python 3.7.11 (default, Jul 27 2021, 14:32:16)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import faiss
>>> import graphvite
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/site-packages/graphvite/__init__.py", line 36, in <module>
    lib = imp.load_dynamic("libgraphvite", lib_file)
  File "/netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: /netscratch/samin/dev/miniconda3/envs/snomed_kge/lib/python3.7/site-packages/graphvite/lib/libgraphvite.so: undefined symbol: _ZN5faiss3gpu14GpuIndexFlatL2C1EPNS0_12GpuResourcesEiNS0_18GpuIndexFlatConfigE
>>>
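
To confirm which libfaiss.so a session has actually loaded, the memory map of the process can be inspected on Linux. The sketch below parses text in the /proc/self/maps format; both function names are made up for this example, and the check is only a way to test the presumption above, not a fix.

```python
def loaded_objects(maps_text, needle):
    """Return the distinct mapped file paths that contain `needle`."""
    paths = []
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode pathname
        fields = line.split(None, 5)
        if len(fields) == 6 and needle in fields[5] and fields[5] not in paths:
            paths.append(fields[5])
    return paths


def loaded_in_this_process(needle):
    """Linux only: look for `needle` among the files mapped into this process."""
    with open("/proc/self/maps") as handle:
        return loaded_objects(handle.read(), needle)


# Example usage inside the interpreter, after the first import has succeeded:
# >>> import graphvite
# >>> loaded_in_this_process("libfaiss")
```

If the path printed after import graphvite points into site-packages/graphvite, a later import faiss will find that copy already loaded instead of its own.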

gohjiayi commented 2 years ago

Hey @suamin, thanks for your help but unfortunately the solution did not work for me. I'm documenting some things that I have tried which others might be interested in.

Initially I was using Python 3.8, but I realised I had to downgrade to Python 3.7 in order to install graphvite; otherwise you might hit the error below.

(line) jiayi@cdas1:~/graphvite$ conda install -c milagraph graphvite
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: - 
Found conflicts! Looking for incompatible packages.                                                                                                       failed                                                                                                                                                        

UnsatisfiableError: The following specifications were found
to be incompatible with the existing python installation in your environment:

Specifications:

  - graphvite -> python[version='>=2.7,<2.8.0a0|>=3.7,<3.8.0a0|>=3.6,<3.7.0a0|>=3.5,<3.6.0a0']

Your python: python=3.8

If python is on the left-most side of the chain, that's the version you've asked for.
When python appears to the right, that indicates that the thing on the left is somehow
not available for the python version you are constrained to. Note that conda will not
change your python version to a different minor version unless you explicitly specify
that.

The following specifications were found to be incompatible with your system:

  - feature:/linux-64::__glibc==2.27=0
  - feature:|@/linux-64::__glibc==2.27=0
  - graphvite -> libgcc-ng[version='>=5.4.0'] -> __glibc[version='>=2.17|>=2.17,<3.0.a0']

Your installed version is: 2.27
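
The python constraint in the message above can be read as a list of alternatives separated by "|", each a comma-separated set of bounds. The sketch below evaluates such a spec in a simplified way (pre-release suffixes like "a0" are ignored and versions are compared on three numeric components); it is not conda's actual matcher, and all names are assumptions for illustration.

```python
import operator

_OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
        ">": operator.gt, "<": operator.lt}


def _key(version):
    """'3.8.0a0' -> (3, 8, 0); missing components are padded with zeros."""
    numbers = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        numbers.append(int(digits) if digits else 0)
    return tuple(numbers + [0] * (3 - len(numbers)))


def _check(candidate, clause):
    for symbol in (">=", "<=", "==", ">", "<"):  # two-character operators first
        if clause.startswith(symbol):
            return _OPS[symbol](candidate, _key(clause[len(symbol):]))
    raise ValueError("unsupported clause: " + clause)


def satisfies(version, spec):
    """True if `version` meets at least one '|'-separated alternative in `spec`."""
    candidate = _key(version)
    return any(
        all(_check(candidate, clause) for clause in alternative.split(","))
        for alternative in spec.split("|")
    )


spec = ">=2.7,<2.8.0a0|>=3.7,<3.8.0a0|>=3.6,<3.7.0a0|>=3.5,<3.6.0a0"
for version in ("3.8", "3.7.11"):
    print(version, satisfies(version, spec))
```

Under this reading the package is built for Python 2.7 and 3.5 to 3.7 only, which matches the need to downgrade from 3.8.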

I was able to follow what @suamin suggested above up to the Python import step, where importing either faiss or graphvite fails, as seen below.

>>> import graphvite
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jiayi/anaconda3/envs/line/lib/python3.7/site-packages/graphvite/__init__.py", line 36, in <module>
    lib = imp.load_dynamic("libgraphvite", lib_file)
  File "/home/jiayi/anaconda3/envs/line/lib/python3.7/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: /home/jiayi/anaconda3/envs/line/lib/python3.7/site-packages/graphvite/lib/libgraphvite.so: undefined symbol: _ZN6google10LogMessageC1EPKciiiMS0_FvvE

>>> import faiss
Traceback (most recent call last):
  File "/home/jiayi/anaconda3/envs/line/lib/python3.7/site-packages/faiss/swigfaiss_avx2.py", line 14, in swig_import_helper
    return importlib.import_module(mname)
  File "/home/jiayi/anaconda3/envs/line/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 670, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 583, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1043, in create_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: libcudart.so.10.0: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jiayi/anaconda3/envs/line/lib/python3.7/site-packages/faiss/__init__.py", line 41, in <module>
    from .swigfaiss_avx2 import *
  File "/home/jiayi/anaconda3/envs/line/lib/python3.7/site-packages/faiss/swigfaiss_avx2.py", line 17, in <module>
    _swigfaiss_avx2 = swig_import_helper()
  File "/home/jiayi/anaconda3/envs/line/lib/python3.7/site-packages/faiss/swigfaiss_avx2.py", line 16, in swig_import_helper
    return importlib.import_module('_swigfaiss_avx2')
  File "/home/jiayi/anaconda3/envs/line/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named '_swigfaiss_avx2'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jiayi/anaconda3/envs/line/lib/python3.7/site-packages/faiss/swigfaiss.py", line 14, in swig_import_helper
    return importlib.import_module(mname)
  File "/home/jiayi/anaconda3/envs/line/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 670, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 583, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1043, in create_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: libcudart.so.10.0: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jiayi/anaconda3/envs/line/lib/python3.7/site-packages/faiss/__init__.py", line 49, in <module>
    from .swigfaiss import *
  File "/home/jiayi/anaconda3/envs/line/lib/python3.7/site-packages/faiss/swigfaiss.py", line 17, in <module>
    _swigfaiss = swig_import_helper()
  File "/home/jiayi/anaconda3/envs/line/lib/python3.7/site-packages/faiss/swigfaiss.py", line 16, in swig_import_helper
    return importlib.import_module('_swigfaiss')
  File "/home/jiayi/anaconda3/envs/line/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named '_swigfaiss'

I checked the environment variables and found that some were unset, unlike what others in this thread had. However, I have no answer yet on how to fix this, so I will be looking at alternative packages.

(line) jiayi@cdas1:~/graphvite$ env |grep PATH
CMAKE_PREFIX_PATH=/home/jiayi/anaconda3/envs/line:/home/jiayi/anaconda3/envs/line/x86_64-conda-linux-gnu/sysroot/usr
CONDA_BACKUP_CMAKE_PREFIX_PATH=/home/jiayi/anaconda3/envs/line:/home/jiayi/anaconda3/envs/line/x86_64-conda-linux-gnu/sysroot/usr
PATH=/home/jiayi/anaconda3/envs/line/bin:/home/jiayi/.vscode-server/bin/899d46d82c4c95423fb7e10e68eba52050e30ba3/bin:/home/jiayi/.local/bin:/home/jiayi/anaconda3/bin:/home/jiayi/anaconda3/condabin:/home/jiayi/.vscode-server/bin/899d46d82c4c95423fb7e10e68eba52050e30ba3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/cuda/bin
(line) jiayi@cdas1:~/graphvite$ whereis libglog
libglog:
(line) jiayi@cdas1:~/graphvite$ echo $LD_LIBRARY_PATH

suamin commented 2 years ago

@gohjiayi I think there is some problem with your CUDA installation. Can you paste results for:

$ nvidia-smi
$ nvcc -V

Check for compatibility of your system. Try installing faiss with cudatoolkit 10.0. Also check this SO for CUDA issues. You can go through the list of conda faiss-gpu packages here. Try downgrading to 1.5.3.

gohjiayi commented 2 years ago

Thanks for your help @suamin. Output as seen below. I didn't look into my CUDA installation in particular. (PS: removed some information from my nvidia-smi as I have a few GPUs running)

(line) jiayi@cdas1:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
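
The faiss ImportError earlier in this comment asks for libcudart.so.10.0, while nvcc reports release 10.1. A quick way to spot such a mismatch is to extract both versions and compare them; the sketch below does this with two hypothetical helper functions. It only looks at the system toolkit that nvcc belongs to; a cudatoolkit package installed inside the conda environment is not covered.

```python
import re


def nvcc_release(nvcc_output):
    """Extract the toolkit release (e.g. '10.1') from `nvcc -V` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def required_cudart(error_message):
    """Extract the version from a 'libcudart.so.X.Y' name in an error message."""
    match = re.search(r"libcudart\.so\.(\d+(?:\.\d+)*)", error_message)
    return match.group(1) if match else None


nvcc_text = "Cuda compilation tools, release 10.1, V10.1.243"
error_text = ("ImportError: libcudart.so.10.0: cannot open shared object file: "
              "No such file or directory")

installed, wanted = nvcc_release(nvcc_text), required_cudart(error_text)
if installed != wanted:
    print(f"faiss wants CUDA runtime {wanted}, but the toolkit is {installed}")
```

The "CUDA Version" printed by nvidia-smi is the highest version the driver supports, so it does not by itself tell which runtime library is installed.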

(line) jiayi@cdas1:~$ nvidia-smi
Fri Dec 24 14:48:05 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.23.05    Driver Version: 455.23.05    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  On   | 00000000:3B:00.0 Off |                    0 |
| N/A   32C    P0    25W / 250W |      2MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    2   N/A  N/A      6832      C   ...ansact_figures/bin/python     1261MiB |
|    2   N/A  N/A     10166      C   ...hna/miniconda3/bin/python     3313MiB |
+-----------------------------------------------------------------------------+

suamin commented 2 years ago

Thanks. This seems fine, it could be worth trying:

$ sudo apt-get install libomp-dev # Ref: https://stackoverflow.com/a/65909488/16183953

but as @KiddoZhu mentioned above, GraphVite doesn't directly depend on OMP. This issue is most likely rooted in the PyTorch and/or FAISS installation. Can you try importing only faiss and torch? If importing FAISS causes errors, it might also be worth reinstalling FAISS as described in their installation file (try these in order, and import faiss after each):

$ conda install -c pytorch faiss-gpu cudatoolkit=10.2 # for CUDA 10.2
or
$ conda install -c conda-forge faiss-gpu
or
$ conda install -c pytorch/label/nightly faiss-gpu
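
Since import order changes the outcome in this thread, it helps to test each module in a fresh interpreter after every reinstall. The sketch below does that with a subprocess; the function name is hypothetical, and the module names in the loop are simply the ones discussed here.

```python
import subprocess
import sys


def check_import(module_name):
    """Import `module_name` in a fresh interpreter; return (ok, last stderr line)."""
    proc = subprocess.run(
        [sys.executable, "-c",
         "import importlib, sys; importlib.import_module(sys.argv[1])",
         module_name],
        capture_output=True,
        text=True,
    )
    if proc.returncode == 0:
        return True, ""
    lines = proc.stderr.strip().splitlines()
    return False, lines[-1] if lines else "unknown error"


for name in ("faiss", "torch", "graphvite"):
    ok, message = check_import(name)
    print(name, "OK" if ok else "FAILED: " + message)
```

Each module gets its own process, so a library loaded by one import cannot mask or break the next one.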

Edit: please also share your conda environment (list of packages).