Open behinger opened 1 year ago
Try running with the LD_DEBUG environment variable set and see if that helps
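For anyone wanting to reproduce this, here is a minimal sketch of capturing the loader trace. `LD_DEBUG` and `LD_DEBUG_OUTPUT` are glibc `ld.so` environment variables; the helper name `trace_loader` and the `/tmp/ld_debug` prefix are made up for illustration, and the example command assumes `vglrun` and `julia` are on `PATH`.

```shell
# Run a command with dynamic-loader tracing enabled.
# $1: log file prefix (glibc appends ".<pid>"); remaining args: the command to run.
trace_loader() {
    prefix="$1"
    shift
    # "libs" traces the library search path, "files" traces each file that is opened.
    LD_DEBUG=libs,files LD_DEBUG_OUTPUT="$prefix" "$@"
}

# Example (hypothetical paths, not run here):
#   trace_loader /tmp/ld_debug vglrun julia -e 'using LinearAlgebra'
#   grep -n 'libopenblas' /tmp/ld_debug.*
```

Writing the trace to a file rather than stderr keeps it separate from Julia's own error output, which makes it easier to grep for the failing library.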
1145047:
1145047: file=libopenblas64_.so [0]; dynamically loaded by /lib/libvglfaker.so [0]
1145047: find library=libopenblas64_.so [0]; searching
1145047: search cache=/etc/ld.so.cache
1145047: search path=/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/lib:/usr/lib (system search path)
1145047: trying file=/lib/x86_64-linux-gnu/libopenblas64_.so
1145047: trying file=/usr/lib/x86_64-linux-gnu/libopenblas64_.so
1145047: trying file=/lib/libopenblas64_.so
1145047: trying file=/usr/lib/libopenblas64_.so
1145047:
fatal: error thrown and no exception handler available.
InitError(mod=:OpenBLAS_jll, error=ErrorException("could not load library "libopenblas64_.so"
libopenblas64_.so: cannot open shared object file: No such file or directory"))
The output above is what I get immediately before the crash, whereas this is what happens in Julia 1.8.3:
1148742: file=/opt/julia-1.8.3/bin/../lib/julia/libopenblas64_.so [0]; dynamically loaded by /lib/libvglfaker.so [0]
1148742: file=/opt/julia-1.8.3/bin/../lib/julia/libopenblas64_.so [0]; generating link map
1148742: dynamic: 0x00007f33c795da80 base: 0x00007f33c5b85000 size: 0x0000000001e7a2a8
1148742: entry: 0x00007f33c5cb5000 phdr: 0x00007f33c5b85040 phnum: 11
It looks like libvglfaker.so may be dynamically replacing dlopen with a broken version. I am not sure we can do much about that. You might be able to get something mostly working by setting LD_LOAD_PATH.
OK, I added the lib/julia folder to LD_LIBRARY_PATH (not LD_LOAD_PATH, probably a mix-up with JULIA_LOAD_PATH?), which fixed this, and I can start Julia 1.9 :)
But I still wonder why this is necessary in Julia 1.9 but not in Julia 1.8.3.
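For reference, the workaround can be sketched as below. In the two traces, Julia 1.9 asks for the bare name `libopenblas64_.so` while 1.8.3 asks for an absolute path under `lib/julia`, so adding that directory to `LD_LIBRARY_PATH` gives the loader somewhere to find the bare name. The helper name `with_lib_dir` and the `/opt/julia-1.9` install location are assumptions for illustration; substitute the real install path.

```shell
# Print an LD_LIBRARY_PATH value with directory $1 prepended.
# The ${VAR:+...} form avoids a trailing colon when LD_LIBRARY_PATH is unset or empty.
with_lib_dir() {
    printf '%s\n' "$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Hypothetical install location; adjust to where Julia 1.9 actually lives.
JULIA_LIB_DIR="/opt/julia-1.9/lib/julia"

# Example (assumes vglrun and julia are on PATH, not run here):
#   LD_LIBRARY_PATH="$(with_lib_dir "$JULIA_LIB_DIR")" vglrun julia
```

Setting the variable only for the one command, rather than exporting it for the whole session, limits the change to the `vglrun` invocation.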
I experienced the same issue with PyTorch when using vglrun. Everything worked fine if I didn't use vglrun. I was running in a conda environment and saw a similar issue during debugging. I got something like this initially:
return torch.linalg.cholesky_ex(value).info.eq(0)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Error in dlopen: libtorch_cuda_linalg.so: cannot open shared object file: No such file or directory
I fixed it like this:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/my/virtual/env/lib/python3.11/site-packages/torch/lib/
I couldn't find any other way around this.
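One way to double-check a fix like this is to confirm that some `LD_LIBRARY_PATH` entry really contains the file that dlopen reports as missing. The sketch below is only an approximation: the helper name `find_in_ld_path` is made up, and it looks only at `LD_LIBRARY_PATH`, not at rpaths or `/etc/ld.so.cache`, which the real loader also consults.

```shell
# Print the first LD_LIBRARY_PATH entry that contains file $1.
# Returns non-zero (and prints nothing) if no entry contains it.
find_in_ld_path() {
    lib="$1"
    old_ifs="$IFS"
    IFS=:
    for dir in $LD_LIBRARY_PATH; do
        # An empty entry is treated here as the current directory.
        if [ -e "${dir:-.}/$lib" ]; then
            IFS="$old_ifs"
            printf '%s\n' "${dir:-.}"
            return 0
        fi
    done
    IFS="$old_ifs"
    return 1
}

# Example (not run here):
#   find_in_ld_path libtorch_cuda_linalg.so || echo "not on LD_LIBRARY_PATH"
```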
When I try to run Julia with vglrun (via VirtualGL on a headless server, virtual displays with NoMachine), I get a crash only in Julia 1.9rc1/2, but not in Julia 1.8.3. This is on an Ubuntu 22 installation. Without VirtualGL everything works as intended. It throws:
could not load library "libopenblas64_.so"
I don't know how to diagnose this further.
versioninfo()