Closed mcmillan03 closed 1 year ago
I've seen another issue where the "FindGraphBLAS" properly locates the right libgraphblas.so and the right GraphBLAS.h include file, but when I looked at the built demo (tc_gpu_demo in this case, in the cudanew branch), the program had been linked against a different copy of libgraphblas.so (one that was actually on my LD_LIBRARY_PATH, but it wasn't the one I wanted). The FindGraphBLAS.cmake differs slightly on this branch but perhaps it's a symptom of the same problem. When I deleted the other libgraphblas.so, the correct libgraphblas.so was found automatically.
When I define GRAPHBLAS_ROOT it does not work. But when I define GraphBLAS_ROOT it does. I think we should be using all caps but I am not sure why the latter works.
When I build with the latter and then define LD_LIBRARY_PATH to point to the wrong library, it does NOT cause the tests to fail.
Also, this aspect of the build configuration options should be in the README.md file
I modified the FindGraphBLAS.cmake script to add the following two hints before the other hints:
HINTS ${GRAPHBLAS_ROOT}
HINTS $ENV{GRAPHBLAS_ROOT}
to both find_path and find_library calls and it worked for me. The former handles this invocation
cmake -DGRAPHBLAS_ROOT=/path/to/gb ../
and the latter handles this invocation
GRAPHBLAS_ROOT=/path/to/gb cmake ../
This should now be fixed on the fix_findGraphBLAS branch, which I just pushed to the dev branch.
Caveat: the only thing that doesn't seem to work is the LD_LIBRARY_PATH override. I have a personal folder, ~/bin, that I add to my LD_LIBRARY_PATH. If libgraphblas.so appears there, it gets linked in instead of the one listed in the GRAPHBLAS_ROOT. This happens even though cmake reports that the GraphBLAS library specified in GRAPHBLAS_ROOT is found. The same thing happens with both GraphBLAS_ROOT and GRAPHBLAS_ROOT.
Otherwise, the GRAPHBLAS_ROOT setting now takes precedence over any other library, say in LAGraph/../GraphBLAS/build for example.
For this example, both GraphBLAS libraries are v7.3.3, but the dates differ. I have a GxB method to query the date, and this tells me which library gets linked in. The LAGraph test ./build/src/test_Init thus fails. I can also see the wrong library (from ~/bin) when I do ldd ./build/liblagraph.so.
This is on a Linux machine, with Ubuntu 20.04, gcc 9.4.0, and cmake 3.24.2.
output of cmake (now printing both GraphBLAS_ROOT and GRAPHBLAS_ROOT, on the fix_GraphBLAS branch). So far so good:
...
-- GraphBLAS_ROOT: /home/faculty/d/davis/dev2/SuiteSparse/GraphBLAS/
-- GRAPHBLAS_ROOT:
-- Found GraphBLAS: /home/faculty/d/davis/dev2/SuiteSparse/GraphBLAS/build/libgraphblas.so.7.3.3 (found suitable version "7.3.3", minimum required is "7.0.1")
-- GraphBLAS version: 7.3.3
-- GraphBLAS include: /home/faculty/d/davis/dev2/SuiteSparse/GraphBLAS/Include
-- GraphBLAS library: /home/faculty/d/davis/dev2/SuiteSparse/GraphBLAS/build/libgraphblas.so.7.3.3
-- GraphBLAS static: /home/faculty/d/davis/dev2/SuiteSparse/GraphBLAS/build/libgraphblas.so
...
output of ./build/src/test_Init
Test Init...
library: SuiteSparse:GraphBLAS 7.3.3 (Dec 1, 2022)
include: SuiteSparse:GraphBLAS 7.3.3 (Dec 9, 2022)
[ FAILED ]
test_Init.c:51: Check strcmp (date, "Dec 9, 2022") == 0... failed
GraphBLAS compiled with: GNU gcc 9.4.0 v9.4.0
LAGraph version 1.0.1 (Oct 28, 2022) from LAGraph.h
LAGraph version 1.0.1 (Oct 28, 2022) from LAGraph_Version
FAILED: 1 of 1 unit tests has failed.
output of ldd build/liblagraph.so:
linux-vdso.so.1 (0x00007ffd7a55d000)
libgraphblas.so.7 => /home/faculty/d/davis/bin/libgraphblas.so.7 (0x00007fce0781c000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fce076af000)
libgomp.so.1 => /lib/x86_64-linux-gnu/libgomp.so.1 (0x00007fce0766b000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fce07479000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fce07456000)
/lib64/ld-linux-x86-64.so.2 (0x00007fce11cc6000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fce07450000)
Note that my /home/faculty/d/davis/bin folder is at the end of my LD_LIBRARY_PATH, and this folder contains the Dec 1st version of GraphBLAS v7.3.3.
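To pull just the resolved path out of ldd output like the listing above, something along these lines may help. It is only a sketch: resolved_lib is a made-up helper name, and it assumes the "name => path (address)" layout shown in that listing. Keep in mind that ldd resolves libraries using the current environment, so the answer depends on the LD_LIBRARY_PATH in effect when it runs.

```shell
# resolved_lib SONAME
# Hypothetical helper: reads `ldd` output on stdin and prints the path
# that SONAME resolved to (the third field of "name => path (addr)").
resolved_lib() {
    awk -v lib="$1" '$1 == lib && $2 == "=>" { print $3 }'
}
```

For example, `ldd build/liblagraph.so | resolved_lib libgraphblas.so.7` would print the copy of the library that the loader picks in the current shell.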
According to the cmake description of how the environment variables are used, the PackageName_ROOT variable should take precedence over everything, including LD_LIBRARY_PATH. So I'm puzzled.
Regarding the case sensitive nature of the variable: the cmake documentation says that the order is:
(1) PackageName_ROOT (variable or env. variable). This would be GraphBLAS_ROOT. (2) ... (3) ... (4) the HINTS, with GRAPHBLAS_ROOT.
Here's what readelf reports, which has the correct runpath. But if /home/faculty/d/davis/bin/libgraphblas.so exists, it gets linked in instead of the copy in the dev2/SuiteSparse/GraphBLAS/build folder (from GraphBLAS_ROOT).
hypersparse $ readelf -d build/liblagraph.so
Dynamic section at offset 0x376d0 contains 29 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libgraphblas.so.7]
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
0x0000000000000001 (NEEDED) Shared library: [libgomp.so.1]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x000000000000000e (SONAME) Library soname: [liblagraph.so.1]
0x000000000000001d (RUNPATH) Library runpath: [/home/faculty/d/davis/v733/LAGraph/build:/home/faculty/d/davis/dev2/SuiteSparse/GraphBLAS/build:]
From "man ld" it looks like LD_LIBRARY_PATH takes precedence over the RUNPATH in the *.so file, unless you are running in "secure mode". That mode may be set at the system level. So perhaps you're not seeing the effect of LD_LIBRARY_PATH because your system is running in secure mode. From the man page:
...
o Using the environment variable LD_LIBRARY_PATH, unless the executable is being run in secure-execution mode (see below), in which case this variable is ignored.
o Using the directories specified in the DT_RUNPATH dynamic section attribute of the binary if present.
...
If I use LD_DEBUG=libs ; export LD_DEBUG and then run ./build/src/test_Init, I see the libgraphblas.so.7 being loaded from the path listed in the LD_LIBRARY_PATH. For other libraries, I see LD_LIBRARY_PATH being searched before the RUNPATH. The "secure mode" ignores the LD_LIBRARY_PATH entirely.
So it seems to me that LD_LIBRARY_PATH is supposed to override any RUNPATH in the liblagraph.so. You must not be seeing it because you're running in 'secure mode'.
I think this means the current FindGraphBLAS.cmake works fine. On my side, I use LD_LIBRARY_PATH to define libraries for MATLAB to find (I use this for libgraphblas_matlab.so). However, I can do that in a script that sets LD_LIBRARY_PATH just for MATLAB, so it doesn't interfere with the rest of my shell commands.
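A wrapper along these lines is one way to keep LD_LIBRARY_PATH scoped to a single command. This is a sketch only: with_libpath is a hypothetical name, and the matlab command and ~/bin directory in the usage note are assumptions about the local setup.

```shell
# with_libpath DIR COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND with DIR prepended to LD_LIBRARY_PATH
# for that one command only; the calling shell's value is left unchanged.
with_libpath() {
    local dir="$1"
    shift
    LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" "$@"
}
```

Usage would look like `with_libpath "$HOME/bin" matlab`, so the extra directory is visible to MATLAB but not to later cmake or test runs in the same shell.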
Scott and I just figured out what's going on. The behavior of ld has changed. There are two kinds of paths in an ELF binary: the RPATH and the RUNPATH. See https://akkadia.org/drepper/dsohowto.pdf .
The search order is: (1) RPATH (in the binary), (2) LD_LIBRARY_PATH, (3) RUNPATH (in the binary), (among other things).
The new behavior of ld is to set the RUNPATH, not RPATH, in the binary. My ld is set to the new behavior, and so for me, my LD_LIBRARY_PATH overrides my RUNPATH. As a result, CMake says it finds a specific libgraphblas.so, but when a test is run, it links against the version pointed to by LD_LIBRARY_PATH, which is different.
Scott's ld is using the old behavior, so it sets RPATH in the ELF binary, not RUNPATH. As a result, his ld finds the version of libgraphblas.so found by CMake, and set in the RPATH. It ignores the LD_LIBRARY_PATH.
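The difference between the two cases can be modeled with a small shell function. This is a simplified sketch, not the loader's actual code: resolve_lib is a made-up name, it skips empty path entries, and it ignores the ld.so cache, the default system directories, and secure-execution mode. It also assumes that RPATH is not consulted when a RUNPATH is present, which is what the ld.so man page describes.

```shell
# resolve_lib NAME RPATH LD_LIBRARY_PATH RUNPATH
# Hypothetical model of the search order described above. Each path
# argument is a colon-separated directory list and may be empty.
# Prints the first DIR/NAME that exists, or returns 1 if none does.
resolve_lib() {
    local name="$1" rpath="$2" ldpath="$3" runpath="$4" order dir
    if [ -n "$runpath" ]; then
        # New behavior: RUNPATH present, so LD_LIBRARY_PATH is searched first.
        order="$ldpath:$runpath"
    else
        # Old behavior: RPATH is searched before LD_LIBRARY_PATH.
        order="$rpath:$ldpath"
    fi
    local IFS=:
    for dir in $order; do
        if [ -n "$dir" ] && [ -e "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}
```

With a copy of libgraphblas.so.7 in both ~/bin and the GraphBLAS build folder, passing the build folder as RUNPATH yields the ~/bin copy, while passing it as RPATH yields the build copy, matching what each of us sees.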
To force the old behavior (setting RPATH not RUNPATH), this can be added to the LAGraph CMakeLists.txt, but this is Linux specific:
SET(CMAKE_EXE_LINKER_FLAGS "-Wl,--disable-new-dtags")
To force the new behavior of ld (setting RUNPATH not RPATH), which is also Linux-specific:
SET(CMAKE_EXE_LINKER_FLAGS "-Wl,--enable-new-dtags")
I have no idea how the Mac handles this.
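On Linux, a quick way to tell which of the two tags a binary ended up with is to filter the readelf -d output shown earlier. A minimal sketch, assuming the "(RPATH)" and "(RUNPATH)" labels that readelf prints; dyn_path_kind is a hypothetical helper name:

```shell
# dyn_path_kind
# Hypothetical helper: reads `readelf -d` output on stdin and prints
# RUNPATH, RPATH, or none. RUNPATH is reported if both tags appear,
# on the assumption that the loader then ignores RPATH.
dyn_path_kind() {
    awk '
        /\(RUNPATH\)/ { kind = "RUNPATH" }
        /\(RPATH\)/   { if (kind == "") kind = "RPATH" }
        END           { print (kind == "" ? "none" : kind) }
    '
}
```

For example, `readelf -d build/liblagraph.so | dyn_path_kind` should print RUNPATH for the library listed above.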
The problem with this is that it's possible to link against, say, GraphBLAS v7.3.2 in LAGraph, but the #include file found was v7.3.3 (say), as found by FindGraphBLAS.cmake. Normally that's not a problem, since any v7.x should be compatible. However, as a test to ensure the right GraphBLAS library is found, the LAGraph/src/test/test_Init asserts that the GraphBLAS version found in the #include "GraphBLAS.h" is identical to the version found in the library itself, at run time (via GxB_Global_Option_get (GxB_LIBRARY_VERSION, ver)).
With the new behavior (using RUNPATH, where LD_LIBRARY_PATH takes precedence), these versions don't match, so the test fails.
We won't try to fix this: the ld behavior will act as 'new' or 'old' depending on the user's system.
The first issue is a documentation issue:
The second issue is that when I follow the instructions in the top-level CMakeLists.txt I get the following error:
When I export the location as an environment variable (again referencing the softlink) I get the same error.
When I use specifically the 7.2.0 version of the directory I still get the same error.
The only way that seems to work is to remove all other versions of SuiteSparse from that directory (I put them in a subdirectory called ‘old’).
I point out a third minor issue where the GraphBLAS root (red message below) is never printed out during cmake even when it is set as an environment variable: