massimobernava closed this issue 5 years ago
Hi @massimobernava,
Could you check that you have a symlink /usr/local/cuda
that points to /usr/local/cuda-10.1?
Yes, I have:
...
lrwxrwxrwx 1 root root 9 giu 25 15:08 cuda -> cuda-10.1
...
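For anyone who prefers to script the symlink check discussed above, here is a minimal Python sketch. The helper name `describe_symlink` is made up for illustration; it only reports whether a path is a symlink and where it finally resolves, and it assumes nothing about how wav2letter itself locates CUDA.

```python
import os

def describe_symlink(path):
    """Return (is_symlink, resolved_target) for the given path."""
    # islink() is True only when the path itself is a symbolic link.
    # realpath() follows every link in the chain and returns the final target.
    return os.path.islink(path), os.path.realpath(path)

if __name__ == "__main__":
    # Path taken from the question in this thread.
    is_link, target = describe_symlink("/usr/local/cuda")
    print(f"symlink: {is_link}, resolves to: {target}")
```

On a machine set up like the one in this thread, the resolved target would be expected to end in cuda-10.1.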
Hi @massimobernava,
Which version of ArrayFire do you have? Could you have a look at https://github.com/facebookresearch/wav2letter/issues/335? Installing the latest versions might help.
Hi @tlikhomanenko, thanks for your help. I have ArrayFire 3.6.4, installed with the .sh file from the site. After your message I checked the ArrayFire installation directory, and the required library files are there. I added this path to LD_LIBRARY_PATH, but now Train crashes:
Could not read file '~/wav2letter/tutorials/1-librispeech_clean/network.arch'
SOLUTION:
with the full path (without ~) it works.
Hi @massimobernava,
Yep, if you are running in a Docker container you need to use full paths (we will investigate why this happens, but right now we don't know where the problem with ~ paths is).
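One possible explanation, offered as an assumption and not as a confirmed diagnosis for wav2letter: `~` is normally expanded by the shell, so a program that receives the literal string `~/...` (for example from a flags file) will look for a directory actually named `~`. The Python sketch below shows how to turn such a path into a full path before passing it on; `resolve_user_path` is a hypothetical helper, not part of wav2letter.

```python
import os

def resolve_user_path(path):
    """Expand a leading '~' and return an absolute path."""
    # expanduser() replaces a leading '~' with the current user's home
    # directory (taken from the HOME environment variable on POSIX systems).
    # abspath() then makes the result absolute and normalizes it.
    return os.path.abspath(os.path.expanduser(path))

if __name__ == "__main__":
    # Path taken from the error message in this thread.
    print(resolve_user_path("~/wav2letter/tutorials/1-librispeech_clean/network.arch"))
```

The printed full path can then be used in place of the `~` form, which matches the workaround reported above.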
I've encountered this error in PyTorch with a new RTX 3090. (I replaced a 2080 Ti, with no other hardware changes.)
pip install -U --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu110/torch_nightly.html
After that, I tried to run my training script with no changes at all, but had the mentioned error.
Error in dlopen or dlsym: libnvrtc.so.11.0: cannot open shared object file: No such file or directory
Then I ran:
ln -s /usr/local/cuda-11.1/targets/x86_64-linux/lib/libnvrtc.so.11.1 ~/miniconda2/lib/python3.6/site-packages/torch/lib/libnvrtc.so.11.0
After that, my system worked like a charm.
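Before creating a symlink like the one above, it may help to see which libnvrtc versions are actually present in a directory. The following is a minimal sketch; `find_nvrtc_libraries` is a hypothetical helper, and the directory in the usage lines is simply the one quoted in this comment. Note that pointing a 11.0 name at a 11.1 library is this poster's workaround and is not guaranteed to be compatible in general.

```python
import glob
import os

def find_nvrtc_libraries(lib_dir):
    """Return the sorted names of files matching libnvrtc.so.* in lib_dir."""
    pattern = os.path.join(lib_dir, "libnvrtc.so.*")
    # Only file names are returned, so versions are easy to compare by eye.
    return sorted(os.path.basename(p) for p in glob.glob(pattern))

if __name__ == "__main__":
    # Directory taken from the ln -s command in this comment.
    for name in find_nvrtc_libraries("/usr/local/cuda-11.1/targets/x86_64-linux/lib"):
        print(name)
```

If the version named in the dlopen error does not appear in the list, the library is either missing or installed under a different version number.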
Hi,
I am getting an error when I try to train the tutorial model:
and:
Inside /usr/local/cuda-10.1/lib64 I have:
...
...
etc...
Is wav2letter++ not compatible with CUDA 10.1?
Thanks.