Closed ViralBShah closed 1 year ago
Caused by the tests for the thread attributes here: https://github.com/DrTimothyAldenDavis/SuiteSparse/blob/75cb8b4be44227880388454dba022bc7442de61e/SuiteSparse_config/cmake_modules/SuiteSparse__thread.cmake#L46 The CMake command check_c_source_runs
will both compile and run the code, so that is what is failing here.
There is an alternate check_c_source_compiles()
(https://cmake.org/cmake/help/latest/module/CheckCSourceCompiles.html) CMake macro that will just check to see if the code can compile, and not run it (so that only tests the compiler), and based on the short snippets being used in the test, I think it would be safe to do the compile-only test instead of the full run test.
I have several of the check_c_source_runs: three in the SuiteSparse__thread.cmake
file, and another here:
That one for getenv does require "runs" not "compiles" since I must test not only the existence of getenv, but that it also recognizes the "HOME" string. I cannot just test that the code compiles. This may be a problem for cross-compilation, however.
I'd like to keep the check_c_source_runs
in both files. Is there a way in cmake to determine if cross-compilation is in use?
Is there a way to check if getenv("HOME")
will both compile and run on a cross-compilation scenario? Perhaps I could add a cmake option that skips these tests entirely and just asserts their results. Then you could use that to avoid any check_c_source_*
entirely.
Pretty sure that nothing can run in a cross compilation scenario, but I am quite sure that we have getenv
on all our platforms (including windows).
If you can set up avoiding these checks on a branch of 7.2, we can test that particular commit and report back.
Getenv works on windows but I think "HOME" doesn't work. I use "LOCALAPPDATA" (or something like that) on windows.
I'd like to keep the check_c_source_runs in both files. Is there a way in cmake to determine if cross-compilation is in use?
CMake has the variable CMAKE_CROSSCOMPILING
that will be set when CMake detects it is cross compiling the code, so that can be used to detect if the executables can be run. Right now, I think only the thread checks are blocking stuff in Yggdrasil for Julia version updates. Is the getenv
check only used in the GraphBlas source right now?
I ran some experiments on our compilation system, and the compilation passes on our platform by replacing the check_c_source_runs
with check_c_source_compiles
in the thread checks (with one other minor patch I'll make a PR for), although we apparently do see the bug described in https://sourceforge.net/p/mingw-w64/bugs/327/, where the mingw build for Windows actually reports that both the __thread
and __declspec(thread)
syntax are supported in the configure check when it actually doesn't support the declspec version because mingw doesn't error on the latter and only gives a warning it isn't supported. But luckily, the __thread
version is a higher priority in the code, so that doesn't actually break anything.
If you don't want to change to compile-only checks, another option that might work for us is a flag to force the use of the C11 standard and not do any checks. This would guarantee the compiler would have the _Thread_local
attribute and remove the need for the configure-time check for us. GCC 4.9+ and Clang 3.3+ have _Thread_local
, so we can use versions after those on all our platforms to get the C11 support.
The flag to force C11 would need to do the following:
set( CMAKE_C_STANDARD 11 )
set( CMAKE_C_STANDARD_REQUIRED ON )
to force C11 mode on all targets that included the threads module. If this method is more palatable, I can put together a patch for it and do a test in Yggdrasil before making a PR for it.
GCC 4.9+ and Clang 3.3+ have _Thread_local
Except apparently Apple Clang doesn't have _Thread_local
... so that won't work for us actually. it does have __thread
, so that's at least something. I'm running a test to see if we could just force __thread
for all our platforms instead.
Ok, my compilation tests completed, and all platforms compiled with no errors when I force HAVE_KEYWORD__THREAD
to be on in the CMake config. So from the Julia/Yggdrasil perspective, the ability to force the use of the __thread
keyword will allow us to cross compile.
See, e.g., here for how that is handled for cross-building SuiteSparse for Octave for Windows: https://hg.octave.org/mxe-octave/file/tip/src/suitesparse.mk#l36
Essentially, add -DHAVE_GETENV_HOME=ON
to the configuration flags.
Regarding the __thread test: IIUC, that is only run if OpenMP wasn't found or was deactivated. Afaict, it is highly recommended to build GraphBLAS with OpenMP.
I can certainly give a try for the threads issue in cholmod, and replace try-run with try-compile. I put that in myself, not because of the changes in the cmake build system.
Still unsure about try-run vs try-compile for the getenv issue in GraphBLAS, because it is runtime behavior that I need to test. An option to set it on/off/auto (where auto means try run) would seem to be the best solution.
I'm still mostly away from my keyboard though, recovering ok just slowly. As a result I haven't even been able to review PRs yet.
Still unsure about try-run vs try-compile for the getenv issue in GraphBLAS, because it is runtime behavior that I need to test. An option to set it on/off/auto (where auto means try run) would seem to be the best solution.
If this depends so much on the system that GraphBLAS runs on, shouldn't that be a runtime check instead of a build time check?
It looks like the only consequential thing that test does is adding the preprocessor definition NGETENV_HOME
if that test didn't execute correctly.
But I can find nowhere where that definition would be used. Can that test just be removed?
Edit: There is a runtime check with fallback here: https://github.com/DrTimothyAldenDavis/SuiteSparse/blob/75cb8b4be44227880388454dba022bc7442de61e/GraphBLAS/Source/GB_jitifyer.c#L257-L264 But not here: https://github.com/DrTimothyAldenDavis/SuiteSparse/blob/75cb8b4be44227880388454dba022bc7442de61e/GraphBLAS/CUDA/GB_jit_cache.cu#L84
Maybe the latter should also use the fallback (for Windows)?
See #430 which should avoid the issue with HAVE_GETENV_HOME
when cross-compiling (by removing that test).
Thanks for all the feedback, everyone. Slowly getting up to speed here ...
Regarding GraphBLAS:
The GraphBLAS/CUDA code is not yet in production, and it will need to use the same JIT cache as the latest CPU JIT. The CUDA kernels will not work on Windows anyway (ever) because I need to use the Rapids Memory Manager.
GraphBLAS assumes ANSI C11, and even requires it. getenv is in ANSI C11 but the input strings are implementation defined. So I must use a run time test to see if getenv("HOME") can be used. However, it does look like the cmake test for getenv("HOME") is not actually required since I also do the run-time test in GraphBLAS/GB_jitifyer.c. I guess I don't need the cmake test at compile time after all. I think I added the test in cmake, then after getting the JIT to work without the cmake test, I never went back to delete the cmake test. I think for a while I thought I needed to know the result from getenv("HOME") during compilation but it turns out that I don't. Sorry for the confusion. This was just code draft/rewrite/get-it-to-work/then-oops-I-had-a-leftover-bit-I-didn't-need. I guess
So it looks like I can entirely delete the GraphBLAS/cmake_modules/SuiteSparse_getenv.cmake file (and its use) entirely.
That's what #430 does, from @mmuetzel , except the CUDA/GB_jit_cache.cu file does not need to be fixed right now. The correct fix will be to merge the latest CPU JIT with the older CUDA JIT, so the CUDA JIT gets its cache folder from the GraphBLAS/GB_jitifyer.c functions, rather then getting the cache folder on its own.
Regarding CHOLMOD:
I can probably safely go with a compile time check. I added the check for __threads to fix a METIS bug, where parallel calls to CHOLMOD (then calling METIS) was hammering the system rand() function and getting horrible performance. There's a closed issue in SuiteSparse about that. See https://github.com/DrTimothyAldenDavis/SuiteSparse/issues/375 and https://github.com/DrTimothyAldenDavis/SuiteSparse/commit/f3ac3eafa6531569f03768af7def36d487c494db . That fix was done on Sept 4, just recently, so @Wimmerer wouldn't have caught it.
... as you can tell I back at the keyboard, but laying flat so my code updates / reviews will be slow.
I will try switching the check for __thread from "try run" to "try compile". However, I saw this document that @imciner2 referred to when fixing the ordering of the keywords: https://gcc.gnu.org/onlinedocs/gcc/Thread-Local.html
6.64 Thread-Local Storage Thread-local storage (TLS) is a mechanism by which variables are allocated such that there is one instance of the variable per extant thread. The runtime model GCC uses to implement this originates in the IA-64 processor-specific ABI, but has since been migrated to other processors as well. It requires significant support from the linker (ld), dynamic linker (ld.so), and system libraries (libc.so and libpthread.so), so it is not available everywhere.
The last sentence has me worried just a bit. Could a "try compile" for __thread
succeed, but then fail at run time because the libc.so or libpthread.so doesn't support it? If so, then "try run" would catch that and thus not use __thread
, but "try compile" would not work.
The safest thing, I think, would be to use "try compile" only when cross compiling, and stick with "try run" otherwise.
Regarding the use of the ANSI C11 _Thread_local
: this is not a good option to place before __thread
, unfortunately. I've never gotten _Thread_local to actually work, even when forcing ANSI C11 compliance. This is why I place it last in the list of methods for creating thread local storage. It's safe there because the "try run" in the SuiteSparse__thread.cmake script will safely detect if it's available or not. It just never gets used. From what I can tell, no compiler supports it.
try_compile
in CMake is a bit of a misnomer because it not only compiles the source. But it also tries to link an executable (or a static library). The linker should be failing if there are undefined symbols when trying to link an executable.
So, try_compile
should be fine for the purpose here.
Fwiw, this is also true for check_c_source_compiles
:
https://cmake.org/cmake/help/latest/module/CheckCSourceCompiles.html
OK. I'll go with check_c_source_compiles throughout. I just pushed a change to the dev2 branch, with this change.
Can you try the dev2 branch to see if this fixes the cross compile issue?
Yes, the dev2
branch fixes the cross compile issue. Thank you for the quick fix. If it is possible to have a new release with these updates in the near future, it will be nice to ship Julia with an official released version of SuiteSparse.
I'm in the process of posting a v7.2.1.beta2 (see https://github.com/DrTimothyAldenDavis/SuiteSparse/pull/433).
v7.2.1.beta2 is now posted: https://github.com/DrTimothyAldenDavis/SuiteSparse/releases/tag/v7.2.1.beta2 .
Once I get some feedback from it from distro maintainers, I'll post it as the stable v7.2.1 release.
On 2nd thought, I should leave this issue open until v7.2.1 is posted as a stable release, in case I've missed something.
https://github.com/DrTimothyAldenDavis/SuiteSparse/releases/tag/v7.2.1 now posted as a stable release.
With the SuiteSparse 7.2 release, it seems that some compile time tests are being carried out, which were not an issue earlier. If possible, it would be great to avoid
try_run()
in Cmake so that SuiteSparse can be cross-compiled easily. In Julia's BinaryBuilder, we cross compile binaries for a large number of architectures on linux.https://github.com/JuliaPackaging/Yggdrasil/pull/7479