Open staticfloat opened 4 years ago
I fear that opening in the right order may not be sufficient.
This problem first popped up in SuiteSparse_jll, whose latest releases are failing to load on Windows, with an error like the following:
ERROR: LoadError: InitError: could not load library "C:\Users\appveyor\.julia\artifacts\bc235f981126068329fbf3f9f58f57ae63827269\bin\libspqr.dll"
The specified procedure could not be found.
This likely happens because libspqr depends on:
% objdump -p libspqr.dll|grep "DLL Name"
DLL Name: libgcc_s_seh-1.dll
DLL Name: KERNEL32.dll
DLL Name: msvcrt.dll
DLL Name: libopenblas64_.dll
DLL Name: libsuitesparseconfig.dll
DLL Name: libcholmod.dll
and the last two dependencies can't be found automatically by the linker.
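For anyone who wants to collect these dependency lists programmatically instead of grepping by hand, here is a minimal sketch in Python. It only parses text that has already been produced by `objdump -p`; the helper name `parse_dll_names` is made up for illustration and is not part of any existing tool.

```python
def parse_dll_names(objdump_output):
    """Return the names on 'DLL Name:' lines of `objdump -p` output, in order."""
    names = []
    for line in objdump_output.splitlines():
        line = line.strip()
        if line.startswith("DLL Name:"):
            # Everything after the first colon is the library name.
            names.append(line.split(":", 1)[1].strip())
    return names


# Sample text copied from the objdump output quoted above.
sample = """\
DLL Name: libgcc_s_seh-1.dll
DLL Name: KERNEL32.dll
DLL Name: msvcrt.dll
DLL Name: libopenblas64_.dll
DLL Name: libsuitesparseconfig.dll
DLL Name: libcholmod.dll
"""

deps = parse_dll_names(sample)
print(deps)
```

The last two entries of `deps` are the ones that live in the artifact directory rather than next to julia.exe or in a system directory.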
In these commits I moved the opening of libspqr to the end of the init, but this is still failing. However, I'm also printing the value of dllist() before trying to dlopen the library:
dllist() = ["C:\\julia\\bin\\julia.exe", "C:\\windows\\SYSTEM32\\ntdll.dll", "C:\\windows\\system32\\KERNEL32.DLL", "C:\\windows\\system32\\KERNELBASE.dll", "C:\\julia\\bin\\libjulia.dll", "C:\\julia\\bin\\libgcc_s_seh-1.dll", "C:\\windows\\system32\\msvcrt.dll", "C:\\julia\\bin\\libssp-0.dll", "C:\\julia\\bin\\libstdc++-6.dll", "C:\\windows\\system32\\ADVAPI32.dll", "C:\\windows\\SYSTEM32\\dbghelp.dll", "C:\\windows\\SYSTEM32\\IPHLPAPI.DLL", "C:\\windows\\system32\\PSAPI.DLL", "C:\\windows\\SYSTEM32\\Secur32.dll", "C:\\windows\\system32\\USER32.dll", "C:\\windows\\SYSTEM32\\USERENV.dll", "C:\\windows\\SYSTEM32\\WINMM.dll", "C:\\windows\\system32\\WS2_32.dll", "C:\\julia\\bin\\LLVM.dll", "C:\\julia\\bin\\libwinpthread-1.dll", "C:\\windows\\SYSTEM32\\sechost.dll", "C:\\windows\\system32\\RPCRT4.dll", "C:\\windows\\system32\\NSI.dll", "C:\\windows\\SYSTEM32\\WINNSI.DLL", "C:\\windows\\system32\\GDI32.dll", "C:\\windows\\SYSTEM32\\profapi.dll", "C:\\windows\\SYSTEM32\\WINMMBASE.dll", "C:\\windows\\system32\\ole32.dll", "C:\\windows\\system32\\SHELL32.dll", "C:\\windows\\system32\\SspiCli.dll", "C:\\windows\\SYSTEM32\\cfgmgr32.dll", "C:\\windows\\SYSTEM32\\DEVOBJ.dll", "C:\\windows\\SYSTEM32\\combase.dll", "C:\\windows\\system32\\SHLWAPI.dll", "C:\\windows\\SYSTEM32\\CRYPTSP.dll", "C:\\windows\\system32\\rsaenh.dll", "C:\\windows\\SYSTEM32\\bcrypt.dll", "C:\\windows\\SYSTEM32\\CRYPTBASE.dll", "C:\\windows\\SYSTEM32\\bcryptPrimitives.dll", "C:\\windows\\system32\\IMM32.DLL", "C:\\windows\\system32\\MSCTF.dll", "C:\\windows\\SYSTEM32\\powrprof.dll", "C:\\windows\\system32\\uxtheme.dll", "C:\\windows\\system32\\mswsock.dll", "C:\\julia\\lib\\julia\\sys.dll", "C:\\julia\\bin\\libpcre2-8.DLL", "C:\\julia\\bin\\libgmp.DLL", "C:\\julia\\bin\\libmpfr.DLL", "C:\\julia\\bin\\libgmp-10.dll", "C:\\julia\\bin\\libopenblas64_.DLL", "C:\\julia\\bin\\libgfortran-4.dll", "C:\\julia\\bin\\libquadmath-0.dll", "C:\\julia\\bin\\libcholmod.DLL", "C:\\julia\\bin\\libcamd.dll", 
"C:\\julia\\bin\\libccolamd.dll", "C:\\julia\\bin\\libsuitesparseconfig.dll", "C:\\julia\\bin\\libcolamd.dll", "C:\\julia\\bin\\libamd.dll", "C:\\julia\\bin\\libsuitesparse_wrapper.DLL", "C:\\Users\\appveyor\\.julia\\artifacts\\3bc52a8ecc2836c9a93eb0a83425d2cb3871b08b\\bin\\libmetis.dll", "C:\\Users\\appveyor\\.julia\\artifacts\\87f297367bb2527a7dc3df599e3cb5ffd459a59f\\bin\\libopenblas64_.dll", "C:\\Users\\appveyor\\.julia\\artifacts\\bc235f981126068329fbf3f9f58f57ae63827269\\bin\\libklu.dll", "C:\\Users\\appveyor\\.julia\\artifacts\\bc235f981126068329fbf3f9f58f57ae63827269\\bin\\libbtf.dll", "C:\\Users\\appveyor\\.julia\\artifacts\\bc235f981126068329fbf3f9f58f57ae63827269\\bin\\libumfpack.dll", "C:\\Users\\appveyor\\.julia\\artifacts\\bc235f981126068329fbf3f9f58f57ae63827269\\bin\\libamd.dll", "C:\\Users\\appveyor\\.julia\\artifacts\\bc235f981126068329fbf3f9f58f57ae63827269\\bin\\libldl.dll", "C:\\Users\\appveyor\\.julia\\artifacts\\bc235f981126068329fbf3f9f58f57ae63827269\\bin\\libcolamd.dll", "C:\\Users\\appveyor\\.julia\\artifacts\\bc235f981126068329fbf3f9f58f57ae63827269\\bin\\libccolamd.dll", "C:\\Users\\appveyor\\.julia\\artifacts\\bc235f981126068329fbf3f9f58f57ae63827269\\bin\\libsuitesparseconfig.dll", "C:\\Users\\appveyor\\.julia\\artifacts\\bc235f981126068329fbf3f9f58f57ae63827269\\bin\\librbio.dll", "C:\\Users\\appveyor\\.julia\\artifacts\\bc235f981126068329fbf3f9f58f57ae63827269\\bin\\libcamd.dll", "C:\\Users\\appveyor\\.julia\\artifacts\\bc235f981126068329fbf3f9f58f57ae63827269\\bin\\libsuitesparse_wrapper.dll", "C:\\Users\\appveyor\\.julia\\artifacts\\bc235f981126068329fbf3f9f58f57ae63827269\\bin\\libcholmod.dll"]
All the needed libraries are already there :confused:
As far as I understand, the problem is that C:\julia\bin\libcholmod.DLL is normally shadowing C:\Users\appveyor\.julia\artifacts\bc235f981126068329fbf3f9f58f57ae63827269\bin\libcholmod.dll, unless we cd to C:\Users\appveyor\.julia\artifacts\bc235f981126068329fbf3f9f58f57ae63827269\bin to open the library, as pointed out on Slack by @KristofferC.
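To make this kind of shadowing easier to spot in a long dllist() dump, here is a rough Python sketch that groups loaded paths by case-insensitive file name and reports names loaded from more than one location. The function name `find_shadowed` is hypothetical, and the assumption that the first-loaded copy is the one that satisfies later imports reflects my understanding above, not something this code verifies.

```python
from collections import defaultdict
from pathlib import PureWindowsPath


def find_shadowed(loaded_paths):
    """Map lowercase file name -> paths (in load order) for names seen more than once."""
    by_name = defaultdict(list)
    for path in loaded_paths:
        # PureWindowsPath parses backslash paths on any host OS.
        by_name[PureWindowsPath(path).name.lower()].append(path)
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}


# A few entries taken from the dllist() output above, in the same relative order.
artifact_bin = r"C:\Users\appveyor\.julia\artifacts\bc235f981126068329fbf3f9f58f57ae63827269\bin"
loaded = [
    r"C:\julia\bin\libcholmod.DLL",
    r"C:\julia\bin\libsuitesparseconfig.dll",
    artifact_bin + r"\libklu.dll",
    artifact_bin + r"\libsuitesparseconfig.dll",
    artifact_bin + r"\libcholmod.dll",
]

shadowed = find_shadowed(loaded)
for name, paths in shadowed.items():
    print(name, "loaded first from", paths[0])
```

Here both libcholmod.dll and libsuitesparseconfig.dll show up twice, with the copy in C:\julia\bin loaded first.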
The latest libspqr.dll in Yggdrasil, built with support for Metis, expects the following symbols in libcholmod.dll:
% objdump -p libspqr-new.dll
[...]
DLL Name: libcholmod.dll
vma: Hint/Ord Member-Name Bound-To
34938 65 cholmod_l_allocate_dense
34954 67 cholmod_l_allocate_sparse
34970 69 cholmod_l_allocate_work
3498c 70 cholmod_l_amd
3499c 74 cholmod_l_analyze_p2
349b4 78 cholmod_l_calloc
349c8 91 cholmod_l_colamd
349dc 102 cholmod_l_dense_to_sparse
349f8 107 cholmod_l_error
34a0c 115 cholmod_l_free
34a20 116 cholmod_l_free_dense
34a38 117 cholmod_l_free_factor
34a50 118 cholmod_l_free_sparse
34a68 125 cholmod_l_malloc
34a7c 127 cholmod_l_metis
34a90 131 cholmod_l_nnz
34aa0 136 cholmod_l_postorder
34ab8 151 cholmod_l_realloc
34acc 155 cholmod_l_reallocate_sparse
34aec 177 cholmod_l_sparse_to_dense
34b08 180 cholmod_l_speye
34b1c 192 cholmod_l_transpose
[...]
For comparison, libspqr.dll shipped with Julia and built without METIS support expects the following symbols:
% objdump -p libspqr-old.dll
[...]
DLL Name: libcholmod.dll
vma: Hint/Ord Member-Name Bound-To
34928 63 cholmod_l_allocate_dense
34944 65 cholmod_l_allocate_sparse
34960 67 cholmod_l_allocate_work
3497c 68 cholmod_l_amd
3498c 72 cholmod_l_analyze_p2
349a4 75 cholmod_l_calloc
349b8 88 cholmod_l_colamd
349cc 98 cholmod_l_dense_to_sparse
349e8 103 cholmod_l_error
349fc 111 cholmod_l_free
34a10 112 cholmod_l_free_dense
34a28 113 cholmod_l_free_factor
34a40 114 cholmod_l_free_sparse
34a58 121 cholmod_l_malloc
34a6c 124 cholmod_l_nnz
34a7c 129 cholmod_l_postorder
34a94 144 cholmod_l_realloc
34aa8 148 cholmod_l_reallocate_sparse
34ac8 170 cholmod_l_sparse_to_dense
34ae4 173 cholmod_l_speye
34af8 185 cholmod_l_transpose
[...]
We can check if the symbols expected by the latest libspqr.dll are present in libcholmod.dll shipped with Julia (with the METIS-related symbol as a potential culprit):
% for symbol in $(objdump -p libspqr-new.dll|grep cholmod_l|awk '{print $3}'); do nm libcholmod-old.dll|grep -w "${symbol}" || echo "---> ${symbol} not found"; done
000000006ba71130 T cholmod_l_allocate_dense
000000006ba76240 T cholmod_l_allocate_sparse
000000006ba6e8f0 T cholmod_l_allocate_work
000000006ba8b570 T cholmod_l_amd
000000006ba8cd90 T cholmod_l_analyze_p2
000000006ba75750 T cholmod_l_calloc
000000006ba8dc10 T cholmod_l_colamd
000000006ba72cc0 T cholmod_l_dense_to_sparse
000000006ba73840 T cholmod_l_error
000000006ba756e0 T cholmod_l_free
000000006ba71010 T cholmod_l_free_dense
000000006ba73a90 T cholmod_l_free_factor
000000006ba760d0 T cholmod_l_free_sparse
000000006ba755a0 T cholmod_l_malloc
---> cholmod_l_metis not found
000000006ba773b0 T cholmod_l_nnz
000000006ba8f460 T cholmod_l_postorder
000000006ba75890 T cholmod_l_realloc
000000006ba76540 T cholmod_l_reallocate_sparse
000000006ba722f0 T cholmod_l_sparse_to_dense
000000006ba766a0 T cholmod_l_speye
000000006ba7a5a0 T cholmod_l_transpose
So libcholmod.dll shipped with Julia doesn't provide cholmod_l_metis as expected, which also explains the error message "The specified procedure could not be found."
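The same check can be written as a set difference between what one library imports and what the other exports. Below is a minimal Python sketch that works on saved objdump and nm text; the function names are made up for illustration, and the parsing assumes the three-column row layouts shown in the listings above.

```python
def imported_symbols(objdump_rows):
    """Member names from import-table rows of the form '<vma> <hint> <name>'."""
    symbols = []
    for line in objdump_rows.splitlines():
        fields = line.split()
        # Header and '[...]' lines do not have exactly three fields with a numeric hint.
        if len(fields) == 3 and fields[1].isdigit():
            symbols.append(fields[2])
    return symbols


def exported_symbols(nm_output):
    """Names that nm reports with type 'T' (defined in the text section)."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[1] == "T":
            symbols.add(fields[2])
    return symbols


def missing_symbols(imports, exports):
    """Imported names that the other library does not export, in import order."""
    return [name for name in imports if name not in exports]


# Excerpts copied from the listings above.
new_spqr_imports = """\
vma: Hint/Ord Member-Name Bound-To
34a68 125 cholmod_l_malloc
34a7c 127 cholmod_l_metis
34a90 131 cholmod_l_nnz
"""
old_cholmod_nm = """\
000000006ba755a0 T cholmod_l_malloc
000000006ba773b0 T cholmod_l_nnz
"""

missing = missing_symbols(
    imported_symbols(new_spqr_imports), exported_symbols(old_cholmod_nm)
)
print(missing)
```

On this excerpt the only missing name is cholmod_l_metis, matching the shell loop's result.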
The issue is that in Julia base we ship a SuiteSparse that builds CHOLMOD without METIS support, in order to avoid an extra dependency. The proposed solution was to build SuiteSparse_jll with METIS support, which we could then use in the package ecosystem.
Clearly, these two are clashing. It would be much simpler to have one SuiteSparse - and just ship metis in base Julia until a point where we can move SuiteSparse out altogether.
My branch for https://github.com/JuliaLang/julia/issues/33973 will solve this particular issue.
For the record, a couple of days ago a case was reported where the order of dlopening seems to be important: https://discourse.julialang.org/t/http-get-crashes-julia-completely/43506/17.
We need to inspect all LibraryProducts within a build, look at their dependencies, then do a breadth-first walk of those LibraryProducts when dlopen()'ing them in the __init__() method of JLL packages. This is important because on platforms such as Windows, where we don't have RPATHs, we may accidentally attempt to open a library that needs something else in its same directory, and it can't find it. :/
Alternative solutions are to push that directory onto the PATH or cd() to the lib directory before dlopen()ing (unsatisfactory, as some libraries may not be in that same directory) or to embed XML manifests into the .dll's as a part of the BB audit process (still not sure how to do this properly).
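As a rough illustration of the ordering idea, here is a Python sketch that sorts libraries so each one comes after the libraries it depends on. It uses Kahn's algorithm, a queue-based topological sort, which is my reading of the "breadth-first walk" described above rather than what any JLL tooling actually does. The function name `dlopen_order` is hypothetical, and the libcholmod -> libsuitesparseconfig edge in the example is assumed for illustration; only the libspqr dependencies come from the objdump output in this thread.

```python
from collections import deque


def dlopen_order(deps):
    """Return libraries ordered so every library follows its in-build dependencies.

    `deps` maps each library in the build to the names it depends on.
    Names that are not keys of `deps` (system DLLs etc.) are ignored.
    """
    pending = {lib: 0 for lib in deps}       # unresolved dependency count
    dependents = {lib: [] for lib in deps}   # reverse edges
    for lib, needed in deps.items():
        for dep in needed:
            if dep in deps:
                pending[lib] += 1
                dependents[dep].append(lib)

    # Start with libraries that need nothing else from this build.
    queue = deque(lib for lib, count in pending.items() if count == 0)
    order = []
    while queue:
        lib = queue.popleft()
        order.append(lib)
        for child in dependents[lib]:
            pending[child] -= 1
            if pending[child] == 0:
                queue.append(child)

    if len(order) != len(deps):
        raise ValueError("dependency cycle among libraries")
    return order


deps = {
    "libspqr": ["libsuitesparseconfig", "libcholmod", "KERNEL32"],
    "libcholmod": ["libsuitesparseconfig"],  # assumed edge, for illustration
    "libsuitesparseconfig": [],
}
order = dlopen_order(deps)
print(order)
```

Note that, as the rest of this thread shows, a correct order alone does not help when a same-named library from another directory has already been loaded.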