flashlight / wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit
https://github.com/facebookresearch/wav2letter/wiki

error while loading shared libraries: libnvrtc.so.10.0 #338

Closed massimobernava closed 5 years ago

massimobernava commented 5 years ago

Hi,

I am getting an error when I am trying to train tutorial model:

./Train: error while loading shared libraries: libnvrtc.so.10.0: cannot open shared object file: No such file or directory

and:

>ldd ./Train
...
    libnvrtc.so.10.0 => not found 
    libcublas.so.10.0 => not found 
    libcufft.so.10.0 => not found 
    libcusolver.so.10.0 => not found 
    libcusparse.so.10.0 => not found   
...

Inside /usr/local/cuda-10.1/lib64 I have:

...

libnvrtc.so
libnvrtc.so.10.1
libnvrtc.so.10.1.168
libcufft.so
libcufft.so.10
libcufft.so.10.1.168

...

etc...

Is wav2letter++ not compatible with CUDA 10.1?

Thanks.
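
The ldd output above asks for sonames ending in .10.0, while the directory listing shows .10.1 files. A minimal sketch for pulling the unresolved names out of ldd output and reading the version suffix follows; the helper names missing_libs and soname_version are hypothetical, and the version parsing assumes the usual libNAME.so.X.Y naming.

```shell
# Print the sonames that ldd reports as unresolved.
# Reads ldd-style text on stdin, so it can be tried without the binary.
missing_libs() {
    # Matching lines look like: "    libnvrtc.so.10.0 => not found"
    awk '/=> not found/ { print $1 }'
}

# Print the version suffix of a soname, e.g. libnvrtc.so.10.0 -> 10.0.
# Prints nothing for an unversioned name such as libnvrtc.so.
soname_version() {
    printf '%s\n' "$1" | sed -n 's/^.*\.so\.\([0-9][0-9.]*\)$/\1/p'
}

# Example using two lines from the output quoted above.
sample='    libnvrtc.so.10.0 => not found
    libcublas.so.10.0 => not found'
printf '%s\n' "$sample" | missing_libs
```

On a real system the call would be `ldd ./Train | missing_libs`. Comparing the printed versions with the files under the CUDA lib64 directory shows whether the loader is looking for a release other than the one installed.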

tlikhomanenko commented 5 years ago

Hi @massimobernava,

Could you check that you have a symlink /usr/local/cuda that points to /usr/local/cuda-10.1?

massimobernava commented 5 years ago

Yes, I have:

...
lrwxrwxrwx  1 root root    9 giu 25 15:08 cuda -> cuda-10.1
...
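
The symlink check requested above can also be scripted. This is only a sketch: link_target and check_cuda_link are hypothetical helper names, and it assumes a readlink that supports -f (GNU coreutils).

```shell
# Print the final target of a symlink; return 1 if the path is not a symlink.
link_target() {
    [ -L "$1" ] || return 1
    readlink -f "$1"
}

# Report whether a generic path resolves to the expected versioned directory.
# $1 = symlink (e.g. /usr/local/cuda), $2 = expected directory name (e.g. cuda-10.1)
check_cuda_link() {
    resolved=$(link_target "$1") || { echo "$1 is not a symlink"; return 1; }
    if [ "$(basename "$resolved")" = "$2" ]; then
        echo "ok: $1 -> $resolved"
    else
        echo "mismatch: $1 -> $resolved (expected $2)"
        return 1
    fi
}

# Usage (not executed here, since the paths depend on the machine):
#   check_cuda_link /usr/local/cuda cuda-10.1
```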

tlikhomanenko commented 5 years ago

Hi @massimobernava,

What version of ArrayFire do you have? Could you have a look at https://github.com/facebookresearch/wav2letter/issues/335? Installing the latest versions might help.

massimobernava commented 5 years ago

Hi @tlikhomanenko, thanks for your help. I have ArrayFire version 3.6.4, which I installed using the .sh file from the site. After your message I checked the ArrayFire installation directory, and the required library files are there. I added this path to LD_LIBRARY_PATH, but now Train crashes:

Could not read file '~/wav2letter/tutorials/1-librispeech_clean/network.arch'

SOLUTION:

It works with the full path (without ~).
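
For the LD_LIBRARY_PATH step mentioned in this comment, a small sketch of adding a directory without duplicating it each time a shell starts. prepend_path is a hypothetical helper, and the directory in the usage comment is a placeholder, not a path taken from this thread.

```shell
# Prepend a directory to a colon-separated list unless it is already present.
# $1 = directory to add, $2 = current list (may be empty)
prepend_path() {
    case ":$2:" in
        *":$1:"*)
            # Already listed: return the list unchanged.
            printf '%s\n' "$2"
            ;;
        *)
            if [ -n "$2" ]; then
                printf '%s\n' "$1:$2"
            else
                printf '%s\n' "$1"
            fi
            ;;
    esac
}

# Usage (placeholder directory; substitute your own ArrayFire lib directory):
#   LD_LIBRARY_PATH=$(prepend_path /path/to/arrayfire/lib "${LD_LIBRARY_PATH:-}")
#   export LD_LIBRARY_PATH
```

Afterwards, `ldd ./Train` should no longer list the libraries found in that directory as "not found".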

tlikhomanenko commented 5 years ago

Hi @massimobernava,

Yep, if you are running in a Docker container you need to use full paths. We will investigate why this happens; right now we don't know where the problem with ~ paths is.
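
One possible explanation, not confirmed in this thread: ~ is expanded by the shell rather than by the program, and only when it appears unquoted at the start of a word. A path that is quoted or read from a configuration file (an assumption about how the path was supplied here) reaches the program with a literal ~. The sketch below uses a hypothetical helper, expand_tilde, to expand such a path before it is written to a config file.

```shell
# Replace a leading "~" or "~/" with $HOME; other paths pass through unchanged.
# The "~user" form is not handled in this sketch.
expand_tilde() {
    case "$1" in
        "~")
            printf '%s\n' "$HOME"
            ;;
        "~/"*)
            # Strip the literal "~/" prefix, then prepend $HOME.
            rest=${1#\~/}
            printf '%s\n' "$HOME/$rest"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}

# The single quotes keep the shell from expanding ~ itself,
# which mimics a path read from a file.
expand_tilde '~/wav2letter/tutorials/1-librispeech_clean/network.arch'
```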

nofreewill42 commented 3 years ago

I encountered this error in PyTorch with a new RTX 3090. (It replaced a 2080 Ti, with no other hardware changes.)

  1. Install the newest NVIDIA driver (455.32.00).
  2. Install the newest CUDA Toolkit (11.1.105). The .deb with instructions for Ubuntu 18.04 went smoothly.
  3. Install the newest cuDNN (8.0.5). The installation guide is lengthy, but I recommend skimming through it; the instructions are somewhat clear. The tar file installation was smooth.
  4. Install pytorch-nightly: pip install -U --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu110/torch_nightly.html

After that, I ran my training script with no changes at all and got the error mentioned above:

Error in dlopen or dlsym: libnvrtc.so.11.0: cannot open shared object file: No such file or directory

Then:

  1. I changed PATH and LD_LIBRARY_PATH in ~/.bashrc from /usr/local/cuda-10.1/... to my actual CUDA version path, /usr/local/cuda-11.1/...
  2. I created a symlink: ln -s /usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc.so.11.1 ~/miniconda2/lib/python3.6/site-packages/torch/lib/libnvrtc.so.11.0

After that, my system worked like a charm.
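
The symlink step above can be made a little safer with a guard that refuses to link to a missing file and leaves an existing entry alone. This is only a sketch; link_if_missing is a hypothetical helper. Exposing an 11.1 library under the 11.0 name worked for this commenter, but the thread does not establish that it is safe in general.

```shell
# Create link_path -> target only when target exists and link_path is free.
# $1 = existing library file, $2 = name the loader is asking for
link_if_missing() {
    target=$1
    link_path=$2
    if [ ! -e "$target" ]; then
        echo "missing target: $target" >&2
        return 1
    fi
    # -L also catches a dangling symlink, which -e alone would miss.
    if [ -e "$link_path" ] || [ -L "$link_path" ]; then
        echo "already present: $link_path" >&2
        return 0
    fi
    ln -s "$target" "$link_path"
}

# Usage with the paths from the comment above (not executed here):
#   link_if_missing /usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc.so.11.1 \
#       ~/miniconda2/lib/python3.6/site-packages/torch/lib/libnvrtc.so.11.0
```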