FluxML / Torch.jl

Sensible extensions for exposing torch in Julia.

could not load library "libdoeye_caml" on windows #32

Open bionicinnovations opened 4 years ago

bionicinnovations commented 4 years ago

Torch.jl won't precompile on my Windows machine. I have also updated CUDA to version 11.0:

julia> versioninfo()
Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = "C:\Users\jason\AppData\Local\atom\app-1.49.0\atom.exe"  -a
  JULIA_NUM_THREADS = 6

I get the following error:

julia> using Torch
[ Info: Precompiling Torch [6a2ea274-3061-11ea-0d63-ff850051a295]
ERROR: LoadError: LoadError: could not load library "libdoeye_caml"
The specified module could not be found. 
Stacktrace:
 [1] macro expansion at C:\Users\jason\.julia\packages\Torch\o9HpT\src\error.jl:12 [inlined]
 [2] at_grad_set_enabled(::Int64) at C:\Users\jason\.julia\packages\Torch\o9HpT\src\wrap\libdoeye_caml_generated.jl:70
 [3] top-level scope at C:\Users\jason\.julia\packages\Torch\o9HpT\src\tensor.jl:6
 [4] include(::Module, ::String) at .\Base.jl:377
 [5] include(::String) at C:\Users\jason\.julia\packages\Torch\o9HpT\src\Torch.jl:1
 [6] top-level scope at C:\Users\jason\.julia\packages\Torch\o9HpT\src\Torch.jl:26
 [7] include(::Module, ::String) at .\Base.jl:377
 [8] top-level scope at none:2
 [9] eval at .\boot.jl:331 [inlined]
 [10] eval(::Expr) at .\client.jl:449
 [11] top-level scope at .\none:3
in expression starting at C:\Users\jason\.julia\packages\Torch\o9HpT\src\tensor.jl:6
in expression starting at C:\Users\jason\.julia\packages\Torch\o9HpT\src\Torch.jl:26
ERROR: Failed to precompile Torch [6a2ea274-3061-11ea-0d63-ff850051a295] to C:\Users\jason\.julia\compiled\v1.4\Torch\2cR1S_GtrXI.ji.
Stacktrace:
 [1] error(::String) at .\error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at .\loading.jl:1272
 [3] _require(::Base.PkgId) at .\loading.jl:1029
 [4] require(::Base.PkgId) at .\loading.jl:927
 [5] require(::Module, ::Symbol) at .\loading.jl:922

When I search my hard drive for "libdoeye_caml", I only find this one instance:

C:\Users\jason\.julia\packages\Torch\o9HpT\src\wrap\libdoeye_caml_generated.jl

How can I get this to work? What have I done wrong?
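
In case it helps with diagnosing this: my understanding is that the native library ships through the Torch_jll artifact rather than the package source, so searching the package directory only turns up the generated wrapper. A check like the following might show whether the artifact actually contains a Windows build (a sketch, assuming Torch_jll follows the usual JLL convention of exporting a libdoeye_caml_path variable):

julia> using Torch_jll                  # binary dependency that Torch.jl dlopens

julia> Torch_jll.libdoeye_caml_path     # full path to the artifact's shared library, if one was installed for this platform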

truedichotomy commented 4 years ago

I have the same issue on macOS.

otobrzo commented 4 years ago

I have the same issue on Ubuntu 20.04.

jondeuce commented 4 years ago

Same issue on Ubuntu 18.04.4

DhairyaLGandhi commented 4 years ago

The torch version we currently wrap needs CUDA 10.1 for torch 1.14, so maybe that is why.

PyTorch 1.15 can support newer CUDA versions, so we will need to update to that.
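
To compare against what is installed locally, a check like this should show which CUDA version Julia actually sees (a sketch, assuming CUDA.jl is installed):

julia> using CUDA

julia> CUDA.versioninfo()    # reports the CUDA driver/runtime versions CUDA.jl detects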

freddycct commented 4 years ago

@DhairyaLGandhi You mean Torch 1.4 and PyTorch 1.5? The latest Libtorch is 1.6.0 btw.

julia> using Torch
[ Info: Precompiling Torch [6a2ea274-3061-11ea-0d63-ff850051a295]
ERROR: LoadError: InitError: could not load library "/data/home/fchua/.julia/artifacts/d6ce2ca09ab00964151aaeae71179deb8f9800d1/lib/libdoeye_caml.so"
libcufft.so.10: cannot open shared object file: No such file or directory

Yet I have it here /usr/local/cuda-10.0/targets/x86_64-linux/lib/libcufft.so.10.0
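
For what it's worth, libcufft.so.10 and libcufft.so.10.0 appear to be different sonames: CUDA 10.1 and later install the former, while CUDA 10.0 only provides the latter, so the dynamic loader will not pick up the 10.0 copy. A quick way to check what the loader can actually resolve (library names taken from the error messages in this thread):

julia> using Libdl

julia> Libdl.dlopen("libcufft.so.10"; throw_error=false) === nothing    # true means the loader cannot find it

julia> Libdl.dlopen("libcublas.so.10"; throw_error=false) === nothing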

DhairyaLGandhi commented 4 years ago

You mean Torch 1.4 and PyTorch 1.5? The latest Libtorch is 1.6.0 btw

Yes, that's right.

sidml commented 4 years ago

I get a similar error on Ubuntu 20.04. The full error trace looks like this:

┌ Info: Precompiling Torch [6a2ea274-3061-11ea-0d63-ff850051a295]
└ @ Base loading.jl:1278
ERROR: LoadError: InitError: could not load library "/home/sid/.julia/artifacts/d6ce2ca09ab00964151aaeae71179deb8f9800d1/lib/libdoeye_caml.so"
libcublas.so.10: cannot open shared object file: No such file or directory
Stacktrace:
 [1] dlopen(::String, ::UInt32; throw_error::Bool) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Libdl/src/Libdl.jl:109
 [2] dlopen at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Libdl/src/Libdl.jl:109 [inlined] (repeats 2 times)
 [3] __init__() at /home/sid/.julia/packages/Torch_jll/sFQc0/src/wrappers/x86_64-linux-gnu-cxx11.jl:57
 [4] _include_from_serialized(::String, ::Array{Any,1}) at ./loading.jl:697
 [5] _require_search_from_serialized(::Base.PkgId, ::String) at ./loading.jl:782
 [6] _require(::Base.PkgId) at ./loading.jl:1007
 [7] require(::Base.PkgId) at ./loading.jl:928
 [8] require(::Module, ::Symbol) at ./loading.jl:923
 [9] include(::Function, ::Module, ::String) at ./Base.jl:380
 [10] include(::Module, ::String) at ./Base.jl:368
 [11] top-level scope at none:2
 [12] eval at ./boot.jl:331 [inlined]
 [13] eval(::Expr) at ./client.jl:467
 [14] top-level scope at ./none:3
during initialization of module Torch_jll
in expression starting at /home/sid/.julia/packages/Torch/fIKJf/src/Torch.jl:3
Failed to precompile Torch [6a2ea274-3061-11ea-0d63-ff850051a295] to /home/sid/.julia/compiled/v1.5/Torch/2cR1S_j5y1b.ji.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1305
 [3] _require(::Base.PkgId) at ./loading.jl:1030
 [4] require(::Base.PkgId) at ./loading.jl:928
 [5] require(::Module, ::Symbol) at ./loading.jl:923
 [6] include_string(::Function, ::Module, ::String, ::String) at ./loading.jl:1091

DhairyaLGandhi commented 4 years ago

You need CUDA installed on the system for that, which seems to be missing, @sidml.
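
A quick way to confirm whether a usable CUDA installation is visible from Julia at all (assuming CUDA.jl is installed):

julia> using CUDA

julia> CUDA.functional()    # false means no usable CUDA driver/runtime was found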

jondeuce commented 4 years ago

How difficult is it to upgrade libtorch to v1.6? I would be happy to make a PR myself if someone could point me in the right direction.

DhairyaLGandhi commented 4 years ago

Shouldn't be too hard. We need to change the URLs to the binaries with the correct CUDA and cuDNN here: https://github.com/JuliaPackaging/Yggdrasil/blob/master/T/Torch/build_tarballs.jl

After that, hopefully, the wrappers should build easily.
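
For anyone picking this up, the part of that recipe to touch is the sources list. A rough sketch of the kind of change, with a placeholder URL and hash rather than verified values:

# In Yggdrasil/T/Torch/build_tarballs.jl: point the source at a libtorch 1.6
# archive built against the desired CUDA version (URL and sha256 below are
# illustrative placeholders only).
sources = [
    ArchiveSource("https://download.pytorch.org/libtorch/cu102/libtorch-cxx11-abi-shared-with-deps-1.6.0.zip",
                  "sha256 of that archive goes here"),
]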

DhairyaLGandhi commented 4 years ago

It's already present in the readme :)

NiklasGustafsson commented 3 years ago

This 13-month-old blog post: https://fluxml.ai/blog/2020/06/29/acclerating-flux-torch.html mentions that Torch.jl assumes Linux. Is that still the case? I'm seeing this problem when I try Torch.jl on Windows.

ToucheSir commented 3 years ago

Yes. Windows support seems to be blocked on https://github.com/JuliaPackaging/Yggdrasil/pull/1529, so CMake expertise would be welcome.