FatemehTahavori closed this 1 year ago
@FatemehTahavori this PR does not fix anything. I got an M1 Mac to test with.
Let me first start by stating what the issue is. The linker searches for libraries to load, but there is a constraint: all libraries loaded need to be for the same architecture as the running host program.
You'll notice here that it says this Julia binary is Intel architecture. That means the following:
Next, I note that brew installs packages in ARM mode. Finally, I note that the upcoming Julia 1.8 ships with an (experimental) natively compiled ARM binary.
So now I can test a hypothesis. If what I stated above is true, then when I have ARM libomp and ARM lightgbm, loading will fail with Intel Julia and succeed with ARM Julia.
claire.watson@GBY0JFYDQ6Q6 local % DYLD_LIBRARY_PATH=/opt/homebrew/lib/ /Applications/Julia-1.8\ 2.app/Contents/Resources/julia/bin/julia -e "import Pkg; Pkg.status();import LightGBM"
Status `~/.julia/environments/v1.8/Project.toml`
[7acf609c] LightGBM v0.5.2
[ Info: lib_lightgbm found in system dirs!
claire.watson@GBY0JFYDQ6Q6 local %
claire.watson@GBY0JFYDQ6Q6 local % DYLD_LIBRARY_PATH=/opt/homebrew/lib/ /Applications/Julia-1.8.app/Contents/Resources/julia/bin/julia -e "import Pkg; Pkg.status();import LightGBM"
Status `~/.julia/environments/v1.8/Project.toml`
[7acf609c] LightGBM v0.5.2
[ Info: lib_lightgbm not found in system dirs, trying fallback
ERROR: InitError: LightGBM.LibraryNotFoundError("lib_lightgbm not found. Please ensure this library is either in system dirs or the dedicated paths: [\"/Users/claire.watson/.julia/packages/LightGBM/A7zVd/src\"]")
Stacktrace:
[1] find_library(library_name::String, custom_paths::Vector{String})
@ LightGBM ~/.julia/packages/LightGBM/A7zVd/src/LightGBM.jl:32
[2] __init__()
@ LightGBM ~/.julia/packages/LightGBM/A7zVd/src/LightGBM.jl:41
[3] _include_from_serialized(pkg::Base.PkgId, path::String, depmods::Vector{Any})
@ Base ./loading.jl:831
[4] _require_search_from_serialized(pkg::Base.PkgId, sourcepath::String, build_id::UInt64)
@ Base ./loading.jl:1039
[5] _require(pkg::Base.PkgId)
@ Base ./loading.jl:1315
[6] _require_prelocked(uuidkey::Base.PkgId)
@ Base ./loading.jl:1200
[7] macro expansion
@ ./loading.jl:1180 [inlined]
[8] macro expansion
@ ./lock.jl:223 [inlined]
[9] require(into::Module, mod::Symbol)
@ Base ./loading.jl:1144
during initialization of module LightGBM
claire.watson@GBY0JFYDQ6Q6 local %
The Julia install at the path containing "2" (Julia-1.8 2.app, the first command) is the ARM one, and the Julia install without the 2 in the path (Julia-1.8.app, the second command) is the Intel one. I also printed the LightGBM package information to show that this in fact works correctly with a released version of LightGBM.jl (i.e. without this change).
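To make the architecture comparison explicit, here is a minimal sketch that compares the architecture reported for two files. It assumes the macOS `file` command prints the architecture as the last word of a Mach-O line (for example "Mach-O 64-bit executable x86_64"); the helper names `macho_arch` and `same_arch` are hypothetical, and universal (multi-architecture) binaries are not handled.

```shell
# Print the architecture token from one line of macOS `file` output.
# Assumption: the architecture is the last whitespace-separated word, e.g.
#   "julia: Mach-O 64-bit executable x86_64"
macho_arch() {
  printf '%s\n' "$1" | awk '{ print $NF }'
}

# Succeed only when both `file` output lines report the same architecture.
same_arch() {
  [ "$(macho_arch "$1")" = "$(macho_arch "$2")" ]
}

# Example usage on a Mac (the library filename is an assumption):
#   same_arch "$(file /Applications/Julia-1.8.app/Contents/Resources/julia/bin/julia)" \
#             "$(file /opt/homebrew/lib/lib_lightgbm.dylib)" || echo "architecture mismatch"
```

If the two architectures differ, the loader will refuse the library regardless of how the package searches for it.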
Now, to explain how I made this work:
brew install libomp
brew install lightgbm
DYLD_LIBRARY_PATH=/opt/homebrew/lib
(which is where brew installs the libraries)
In light of this, I have to recommend that this PR be closed, since it doesn't actually do anything or fix the issue (this PR can't possibly fix cross-architecture linking, nor is it the place of this specific project to fix that problem).
@danielsoutar please see the above. I recommend the closure of this PR
Also @FatemehTahavori, in light of the above it seems the advice given in #122 is incorrect; the correct advice should be to install the ARM build of Julia 1.8 on an M1 Mac, to either compile your own lightgbm binary or use the brew binary, and to set DYLD_LIBRARY_PATH to the correct locations.
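As a rough sketch of that last step, the snippet below prepends the brew library directory to DYLD_LIBRARY_PATH without leaving a stray colon when the variable was previously unset. The helper name `prepend_path` is hypothetical, and /opt/homebrew/lib is assumed to be where brew placed the ARM libraries, as in the commands above.

```shell
# Prepend a directory to a colon-separated search path, avoiding a trailing
# colon when the existing value is empty.
prepend_path() {
  dir=$1
  current=$2
  printf '%s\n' "${dir}${current:+:${current}}"
}

# Assumption: /opt/homebrew/lib holds the arm64 libomp and lightgbm libraries.
DYLD_LIBRARY_PATH=$(prepend_path /opt/homebrew/lib "${DYLD_LIBRARY_PATH:-}")
export DYLD_LIBRARY_PATH
```

Julia started from the same shell session then inherits the variable, which is equivalent to prefixing the command with it as in the transcripts above.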
@yaxxie we could reproduce the issue in docker:
docker run -d -it --platform=linux/x86_64 --name lgbm -v "$(pwd)":/home/julia/ julia:1.6.3
cd ~
wget -O /tmp/lgbm.tar https://github.com/microsoft/LightGBM/archive/v3.2.0.tar.gz
tar -xf /tmp/lgbm.tar -C /tmp/
export LIGHTGBM_EXAMPLES_PATH=/tmp/LightGBM-3.2.0
import Pkg
Pkg.add("LightGBM")
If you run tests they are failing:
Pkg.test("LightGBM")
(@v1.6) pkg> add LightGBM#Fix_lgbm_libNotFound
With that branch, tests are passing.
[ Info: ["libcrypt"] not found in `DL_LOAD_PATH`, or system library paths, trying fallback
find_library finds system lib: Error During Test at /root/.julia/packages/LightGBM/fXI7r/test/basic/test_lightgbm.jl:77
Got exception outside of a @test
LightGBM.LibraryNotFoundError("[\"libcrypt\"] not found. Please check this library using Libdl.dlopen(l; throw_error=true) where l = joinpath(custom_paths, lib)")
Stacktrace:
[1] find_library(library_names::Vector{String}, custom_paths::Vector{String})
@ LightGBM ~/.julia/packages/LightGBM/fXI7r/src/LightGBM.jl:36
[2] macro expansion
@ ~/.julia/packages/LightGBM/fXI7r/test/basic/test_lightgbm.jl:84 [inlined]
[3] macro expansion
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1151 [inlined]
[4] macro expansion
@ ~/.julia/packages/LightGBM/fXI7r/test/basic/test_lightgbm.jl:80 [inlined]
[5] macro expansion
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1151 [inlined]
[6] top-level scope
@ ~/.julia/packages/LightGBM/fXI7r/test/basic/test_lightgbm.jl:44
[7] include(fname::String)
@ Base.MainInclude ./client.jl:444
[8] macro expansion
@ ~/.julia/packages/LightGBM/fXI7r/test/runtests.jl:84 [inlined]
[9] macro expansion
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1151 [inlined]
[10] macro expansion
@ ~/.julia/packages/LightGBM/fXI7r/test/runtests.jl:84 [inlined]
[11] macro expansion
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1151 [inlined]
[12] top-level scope
@ ~/.julia/packages/LightGBM/fXI7r/test/runtests.jl:59
[13] include(fname::String)
@ Base.MainInclude ./client.jl:444
[14] top-level scope
@ none:6
[15] eval
@ ./boot.jl:360 [inlined]
[16] exec_options(opts::Base.JLOptions)
@ Base ./client.jl:261
[17] _start()
@ Base ./client.jl:485
[ Info: ["lib_that_simply_doesnt_exist"] not found in `DL_LOAD_PATH`, or system library paths, trying fallback
Test Summary:                                   | Pass  Error  Broken  Total
Basic tests                                     |  111      1       2    114
  Estimator parameters                          |   20              2     22
  Estimator parameters                          |   15                    15
  Utils                                         |    5                     5
  Fit                                           |   52                    52
  CV                                            |    6                     6
  Search CV                                     |   10                    10
  LightGBM                                      |    3      1              4
    find_library                                |    3      1              4
      find_library works with no system lib     |    1                     1
      find_library finds system lib first       |    1                     1
      find_library finds system lib             |           1              1
      find_library returns empty and logs error |    1                     1
ERROR: LoadError: Some tests did not pass: 111 passed, 0 failed, 1 errored, 2 broken.
in expression starting at /root/.julia/packages/LightGBM/fXI7r/test/runtests.jl:57
ERROR: Package LightGBM errored during testing
(@v1.6) pkg> st
Status `~/.julia/environments/v1.6/Project.toml`
[7acf609c] LightGBM v0.5.2 `https://github.com/IQVIA-ML/LightGBM.jl.git#Fix_lgbm_libNotFound`
(@v1.6) pkg>
With that branch installed, it still fails that particular test when I check it. Furthermore, I suspect there is something not quite right about that docker image because:
julia> import Libdl
julia> Libdl.find_library("libcrypt")
""
julia> Libdl.find_library("libpcprofile")
"libpcprofile"
julia>
and when I check the system linker paths (in that docker image) I find this:
root@5e2eaf1c5f7f:~# cat /etc/ld.so.conf.d/* | grep -v '#' | xargs find | grep libcrypt
find: '/usr/local/lib/x86_64-linux-gnu': No such file or directory
/lib/x86_64-linux-gnu/libcrypt-2.28.so
/lib/x86_64-linux-gnu/libcrypt.so.1
/usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
root@5e2eaf1c5f7f:~# cat /etc/ld.so.conf.d/* | grep -v '#' | xargs find | grep libpcprofile
find: '/usr/local/lib/x86_64-linux-gnu': No such file or directory
/lib/x86_64-linux-gnu/libpcprofile.so
You can see that for libpcprofile, which was found, there was a file ending with .so only; for libcrypt, which was not found, there was no lib with only .so at the end, just .so.1 etc. This is normal, as libs can be versioned. But systems usually ship with short-form symlinks to the longer names:
[yaxattax@fedora ~]$ find / 2>/dev/null| grep libcrypt.so$
/home/yaxattax/.local/share/containers/storage/overlay/4723f6643c4df3f617b017f7063d23aa50c7562bed4dc3074578ce896a385972/diff/usr/lib/x86_64-linux-gnu/libcrypt.so
/home/yaxattax/.local/share/containers/storage/overlay/1d5b529db9abb046582d40bfac011d80fc021907f3bfc31dda6870a52ee5e3da/diff/usr/lib/x86_64-linux-gnu/libcrypt.so
/usr/lib64/libcrypt.so
[yaxattax@fedora ~]$ ls -l /usr/lib64/libcrypt.so
lrwxrwxrwx. 1 root root 17 Feb 1 2022 /usr/lib64/libcrypt.so -> libcrypt.so.2.0.0
[yaxattax@fedora ~]$
and then if you launch julia on this machine and try to load libcrypt:
julia> import Libdl
julia> Libdl.find_library("libcrypt")
"libcrypt"
julia>
You can see it works.
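For illustration, the short-form symlink arrangement can be reproduced in a scratch directory. This is only a sketch of the file layout: the library file is an empty stand-in, and nothing here installs or modifies a real library.

```shell
# Work in a throwaway directory so nothing on the real system is touched.
demo_dir=$(mktemp -d)

# Stand-in for a versioned library file (empty file, illustration only).
touch "$demo_dir/libcrypt.so.1"

# Short-form symlink pointing at the versioned name, mirroring the Fedora listing.
ln -s libcrypt.so.1 "$demo_dir/libcrypt.so"

# Show the link and its target.
ls -l "$demo_dir/libcrypt.so"
```

A lookup by the bare name libcrypt relies on that unversioned libcrypt.so entry being present.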
As a bonus: inside the docker image, I draw your attention to this:
julia> Libdl.find_library("libcrypt-2.28")
"libcrypt-2.28"
julia>
and recall what we found when looking for libcrypt:
root@5e2eaf1c5f7f:~# cat /etc/ld.so.conf.d/* | grep -v '#' | xargs find | grep libcrypt
find: '/usr/local/lib/x86_64-linux-gnu': No such file or directory
/lib/x86_64-linux-gnu/libcrypt-2.28.so
/lib/x86_64-linux-gnu/libcrypt.so.1
/usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
root@5e2eaf1c5f7f:~#
Notice this one: /lib/x86_64-linux-gnu/libcrypt-2.28.so, with a .so at the end, and we found it when we asked for libcrypt-2.28.
So once again, I put it to you that this PR fixes nothing.
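The pattern in these lookups can be sketched with a small simulation. It assumes, as a simplification, that a lookup by bare name only succeeds when a file named exactly "<name>.so" exists in the searched directory; the helper `has_plain_so` is hypothetical and the files are empty stand-ins with the same names as in the docker image.

```shell
# Report whether "<dir>/<name>.so" exists, which is the filename a lookup by
# bare name relies on in this simplified model.
has_plain_so() {
  [ -e "$1/$2.so" ]
}

# Recreate the relevant filenames from the docker image in a temp directory.
lib_dir=$(mktemp -d)
touch "$lib_dir/libcrypt-2.28.so" "$lib_dir/libcrypt.so.1" "$lib_dir/libpcprofile.so"

# Try the three names that were looked up in the Julia sessions above.
for name in libcrypt libcrypt-2.28 libpcprofile; do
  if has_plain_so "$lib_dir" "$name"; then
    echo "$name: found"
  else
    echo "$name: not found"
  fi
done
```

Under that assumption, the simulation reports libcrypt as not found and the other two as found, matching what Libdl.find_library returned in the docker image.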
Since
I think that this PR needs to be closed. Please close it @FatemehTahavori
@FatemehTahavori would you mind closing this PR? Unless I am mistaken, I don't believe we require it.
It would be good if you could explain how the fix works, or what was going wrong before that this changes.