DrTimothyAldenDavis / SuiteSparse

The official SuiteSparse library: a suite of sparse matrix algorithms authored or co-authored by Tim Davis, Texas A&M University.
https://people.engr.tamu.edu/davis/suitesparse.html
Other
1.19k stars 266 forks source link

Issues regarding the new cmake-based build system for SuiteSparse 6.0.0 #149

Closed kiwifb closed 2 years ago

kiwifb commented 2 years ago

For starters I tried to build SuiteSparse_config individually with cmake. No matter what I do, a local installation is always enabled. Even if

When I run cmake for SuiteSparse_config I always get the lines -- Installation in ../lib and ../include, -- with 'make local ; make install'. No 'sudo' required.

Adding a line

message ( STATUS "value of INSIDE_SUITESPARSE: ${INSIDE_SUITESPARSE}" )

after the section setting INSIDE_SUITESPARSE in SuiteSparsePolicy.cmake reveals the problem. When you run cmake it displays

-- value of INSIDE_SUITESPARSE: (;(;EXISTS;/home/portage/sci-libs/suitesparseconfig-6.0.0_beta1/work/SuiteSparse-6.0.0-beta1/SuiteSparse_config/../lib;);AND;(;EXISTS;/home/portage/sci-libs/suitesparseconfig-6.0.0_beta1/work/SuiteSparse-6.0.0-beta1/SuiteSparse_config/../include;);)

which means that set ( INSIDE_SUITESPARSE ...) is certainly not behaving as the author intends it. It sets the variable to a string rather than a boolean expression. If I understand cmake's manual EXISTS statements should not be used outside of a if statement. A working as intended statement would probably be of the form

if ( ( ( EXISTS ${CMAKE_SOURCE_DIR}/../../lib     ) AND
            (   EXISTS ${CMAKE_SOURCE_DIR}/../../include ) ) )
  set ( INSIDE_SUITESPARSE TRUE )
endif
DrTimothyAldenDavis commented 2 years ago

Thanks for catching that! Good thing I called this a beta release. Yes, the INSIDE_SUITESPARSE variable should be false if ../lib and ../include don't exist. I'll fix this shortly and post a v6.0.0.beta.

DrTimothyAldenDavis commented 2 years ago

In looking over my entire build system, I think it would be better if the default install process be the same as the typical cmake install: "make ; make install" should only install in CMAKE_INSTALL_PREFIX. That's more conventional.

So I've revised my top-level Makefile, and changed my cmake variables.

cmake -DGLOBAL_INSTALL=x -DLOCAL_INSTALL=y ..

where x defaults to 1 and y defaults to 0. Then I'll have top-level shortcuts via a Makefile so the end user doesn't have to type in such a long command:

"make" does x=1, y=0 (global only, in CMAKE_INSTALL_PREFIX) "make global" same as "make" "make local" does x=0, y=1. Good if you don't have sudo priveldge. Installs in SuiteSparse/lib. "make both" does both

kiwifb commented 2 years ago

I would have thought you would have make default to make local, which I would think is your usual target. But as a linux packager, I will not complain.

DrTimothyAldenDavis commented 2 years ago

Well, I would like "make local" to be the default, but it isn't the cmake default, which is to install in CMAKE_INSTALL_PREFIX, and that defaults to /usr/local (on linux and Mac). So I should keep that as the default.

But it's also nice to add extra helper methods, for someone who doesn't have sudo priviledge, and who doesn't want to mess with CMAKE_INSTALL_PREFIX.

This should fix it:

https://github.com/DrTimothyAldenDavis/SuiteSparse/releases/tag/v6.0.0-beta2

Thanks again for taking a look at this beta1 release. Creating a whole new build system for SuiteSparse, in cmake, has been a huge effort and I'm using cmake features I haven't used before.

DrTimothyAldenDavis commented 2 years ago

Let me know if this works for you as linux package maintainer. I want to make your life as easy as possible, while at the same time making it easy for the non-sudo, non-cmake expert (I can tell them "just type 'make local ; make install' and then add your /home/me/mystuff/SuiteSparse/lib to your LD_LIBRARY_PATH, and use -I/home/me/mystuff/SuiteSparse/include, and then you don't have to pester your sysadmin for help installing all this stuff").

DrTimothyAldenDavis commented 2 years ago

I also want to make life easier for spack, brew, and so on.

kiwifb commented 2 years ago

I hear you. While I have an academic background my current day job is helping researchers with their computer problems at large. Do I need to say more.

kiwifb commented 2 years ago

It is behaving a lot better. I had a lot of issues originating from the value of INSIDE_SUITESPARSE in beta1 and now they are gone. Their are a number of things that would be nice but that I am expecting will come in time:

An example from the later can be found in netlib's lapack sources

install(FILES
  ${LAPACK_BINARY_DIR}/CMakeFiles/${LAPACKLIB}-config.cmake
  ${LAPACK_BINARY_DIR}/${LAPACKLIB}-config-version.cmake
  DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake/${LAPACKLIB}-${LAPACK_VERSION}
  COMPONENT Development
  )
DrTimothyAldenDavis commented 2 years ago

Those are great ideas. I can easily add the "install" option. For now, I could make it a part of "make install" for SuiteSparse_config. That would install all the files in SuiteSparse_config/cmake_modules in the destination (say /usr/local/lib/cmake/SuiteSparse). That is a simple change to the SuiteSparse_config/CMakeLists.txt

Each of the packages (except CSparse and GraphBLAS) relies on SuiteSparse_config, and GraphBLAS has its own copies of its own FindGraphBLAS.cmake, but the GraphBLAS/cmake_modules is just a subset of SuiteSparse_config/cmake_modules. That's mostly historical, since I developed GraphBLAS as a stand-alone package that didn't require the rest of SuiteSparse. I might change that in the future (GraphBLAS could rely on the dense BLAS).

For the disabling of the static library builds: Yes, I could add a flag, say NSTATIC, which defaults to false. If true, the static library would not be built. The default for most packages would be false, except for GraphBLAS (looong time to compile). For Mongoose, I'm not sure how to turn off the static library since the mongoose program relies on it.

I'll add these in a v6.0.0-beta3 shortly.

DrTimothyAldenDavis commented 2 years ago

Give this a try: https://github.com/DrTimothyAldenDavis/SuiteSparse/releases/tag/v6.0.0-beta3

kiwifb commented 2 years ago

I am impressed with the speed you turn those features in. Those are real improvements. But now we get to the nitty gritty. You should only ship the cmake modules that are used to find suitesparse packages, not ones that are meant for internal use to find stuff. They may have to be split in a separate folder.

FindGMP.cmake
FindMPFR.cmake
FindSuiteSparse_metis.cmake
SuiteSparseBLAS*.cmake
SuiteSparsePolicy.cmake
SuiteSparseReport.cmake

have no business in being shipped. For the record cmake ships with a FindBLAS.cmake and FindLAPACK.cmake modules (and yes they can find MKL). Migrating to them, rather using your own is a potential future job.

Now, the next bit, stuff like FindAMD.cmake for example should be found inside the AMD package and be installed by it. If you are doing a local install, the top makefile should set CMAKE_MODULE_PATH (https://cmake.org/cmake/help/latest/variable/CMAKE_MODULE_PATH.html) to ${src_path}/lib/cmake/SuiteSparse. That way cholmod can find the installed FindAMD.cmake when it needs it. And so on and so forth.

I am expecting that relocating and rewiring the files could take some time and you may not want to delay 6.0.0 for it.

DrTimothyAldenDavis commented 2 years ago

Thanks for the feedback. I'd rather get this right in v6.0.0 so I'll fix this before posting a stable v6.0.0. I'll take out those 6 from the install into lib/cmake/SuiteSparse.

I'm already using FindBLAS.cmake and FindLAPACK.cmake. Those are inside SuiteSparseBLAS.cmake. I use them by setting BLA_VENDOR and BLA_INTEGER_SIZE, first. That way I know which BLAS and which integer size is being used. The FindBLAS.cmake and FindLAPACK.cmake don't report that information back to the package using them, unfortunately.

kiwifb commented 2 years ago

I must say I didn't think about it but there are precedents about a shipping a collection of cmake files separate from their packages. So, as long as the 6 in questions are not installed, the rest should be OK.

DrTimothyAldenDavis commented 2 years ago

OK ... I already changed it to place each FindStuff.cmake in the Stuff/cmake_modules folder, for each package "Stuff". Then I added "install ( ... )" commands for both the global and local install (in CMAKE_INSTALL_PREFIX/lib/cmake/SuiteSparse for the 1st, and ./SuiteSparse/lib/cmake/SuiteSparse for the 2nd. It's all working now in my draft, yet to be pushed. I'll post 6.0.0-beta4 with this change shortly.

The only difference with your suggestion is that packages don't look in SuiteSparse/lib/cmake/SuiteSparse as I'm building the set of packages. Instead, I added ../AMD/cmake_modules to the module path in the UMFPACK CMakeLIsts.txt (for example), so that UMFPACK can use "find_package (AMD)". That way, I don't have to install each package before I build the next. I can do all the builds together then do all the installs. That fits nicely with my simple top-level Makefile. At SuiteSparse/ I can do (say):

make ; sudo make install

or for a purely-local build/install:

make local ; make install

and so on.

Regarding the FindSuiteSparse_metis.cmake. That library is needed if cholmod is used with my revised METIS 5.1.0. It differs slightly from the standard METIS 5.1.0, and SuiteSparse cannot use the standard libmetis.so.

So shouldn't I place FindSuiteSparse_metis.cmake where a user package can find it?

DrTimothyAldenDavis commented 2 years ago

Or perhaps I should add the libsuitesparse_metis.so to the list of CHOLOMD_LIBRARIES, and then FindCHOLMOD.cmake could look for libsuitesparse_metis, and if it finds it, it could add it to the list of libraries needed to use CHOLMOD.

I'm not intending on placing my SuiteSparse_metis.h in /usr/local/include or even the local ./SuiteSparse/include.

DrTimothyAldenDavis commented 2 years ago

OK, I think I got it to work. I won't place FindSuiteSparse_metis.cmake in /usr/local/lib/cmake/SuiteSparse.

Instead, FindCHOLMOD.cmake will append the libsuitesparse_metis.so libraries to its list of libraries (CHOLMOD_LIBRARIES), if it finds it. So there's no way to do "find_package ( SuiteSparse_metis )". It won't be necessary. CHOLMOD is the only thing that depends on it, and it will be found if needed.

kiwifb commented 2 years ago

That sounds good. I will do a larger scale testing of the next beta to see how all the packages work. That may take a bit longer, I will need to update some packaging code on my side. I only did suitesparse_config so far.

DrTimothyAldenDavis commented 2 years ago

Sounds good. I'd like to wait until you've had a chance to try out all of my next beta release, before I call it an official v6.0.0. No rush.

DrTimothyAldenDavis commented 2 years ago

Here it is: https://github.com/DrTimothyAldenDavis/SuiteSparse/releases/tag/v6.0.0-beta4

DrTimothyAldenDavis commented 2 years ago

See also the new NSTATIC option.

kiwifb commented 2 years ago

See also the new NSTATIC option.

I have already tested that in the last beta. Works well.

kiwifb commented 2 years ago

Question: amd historically has a fortran interface (and probably some other do but I am looking at amd right now). The code is still present but it you cannot enable it or build it with the provided CMakeLists.txt. Is it something that is just abandoned?

kiwifb commented 2 years ago

I am now looking at cholmod and after spending some time to understand the configuration logic, I see that the cuda and nocuda targets in https://github.com/DrTimothyAldenDavis/SuiteSparse/blob/master/CHOLMOD/Makefile are obsolete and should go.

kiwifb commented 2 years ago

Moved on umfpack. I believe the logic for including metis or not is broken:

-- Configuring done
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
/home/portage/sci-libs/umfpack-6.0.0_beta4/work/SuiteSparse-6.0.0-beta4/UMFPACK/SUITESPARSE_METIS_LIBRARY
    linked by target "umfpack" in directory /home/portage/sci-libs/umfpack-6.0.0_beta4/work/SuiteSparse-6.0.0-beta4/UMFPACK

-- Generating done
CMake Generate step failed.  Build files cannot be regenerated correctly.

I have installed cholmod without metis. By the way, is the modified metis absolutely necessary? I read what you said about it, does that means compilation or runtime will break if you use regular metis?

kiwifb commented 2 years ago

I should be more precise, the issue has to be in FindCholmod.cmake but I haven't really tracked things down any further today.

DrTimothyAldenDavis commented 2 years ago

Thanks for the feedback. I see the glitch in FindCHOLMOD.cmake when METIS is not being compiled. I'll fix that shortly.

For the Fortran codes in AMD: that is the only package that has any Fortran code. In my prior Makefile system, I had an option to build a libamdf77.a. I could add that to the cmake build system. I had overlooked it. I guess I should build libamdf77 like I did before?

cuda and nocuda in the top-level Makefiles: They do work. Those are shorthand for "cmake -DENABLE_CUDA=1 .." or 0 for nocuda. The ENABLE_CUDA is a cmake option to enable/disable the cmake build of the CUDA parts of CHOLMOD, SPQR, and GraphBLAS. I could remove them and let the installer do the cmake themselves ("cd build ; cmake -DENABLE_CUDA=0 .. ; make"). If the "make cuda" or "make nocuda" are confusing, I could delete them from the top-level Makefile. The SuiteSparse/Makefile is completely optional; it's just there for convenience for someone who's used to going into SuiteSparse and typing "make ; make install".

Regarding METIS: I've had significant problems linking to the unmodified metis. First, I need to know how METIS was compiled (the size of idx_t). It's easy to get the wrong size because the metis.h file and libmetis.so might not agree. METIS could be compiled with one size of idx_t but a calling program could # include the wrong metis.h. That's a segfault... The instructions for METIS invite the user to edit metis.h to edit the metis.h file to change idx_t. I've done that, but that means I have to compile METIS myself. If I then call the result "libmetis.so" it would badly break other installations of METIS (say in a Linux distro), as a library name collision. If I called it "libmetis64.so" then it might look like an official version of METIS for others to use. So I call my compiled library "libsuitesparse_metis.so". See: https://github.com/DrTimothyAldenDavis/SuiteSparse/blob/4da880591be7fb369b4df2f19fc49eab5000f6ed/SuiteSparse_metis/include/SuiteSparse_metis.h#L7 https://github.com/DrTimothyAldenDavis/SuiteSparse/blob/4da880591be7fb369b4df2f19fc49eab5000f6ed/SuiteSparse_metis/include/SuiteSparse_metis.h#L65 https://github.com/DrTimothyAldenDavis/SuiteSparse/blob/4da880591be7fb369b4df2f19fc49eab5000f6ed/SuiteSparse_metis/include/SuiteSparse_metis.h#L84 and note the change I've made to IDXTYPEWIDTH. I can't distribute that modified file as "metis.h" since that would cause confusion.

I also need control over malloc/calloc/realloc/free. I have many cases where SuiteSparse is used in applications that require me to use another set of those 4 functions (MATLAB, Python, Redis, ...). If I use the standard METIS, I can't do that, and I break some of the applications that use SuiteSparse.

I suppose one option would be to compile libcholmod.so with my modified metis embedded inside. I could even do that by placing the metis functions in a different name space, so they can't collide with a libmetis.so in a Linux distro, for example. Say there is a function METIS_some_function_here. Then I could do:

#define METIS_some_function_here CHOLMOD_metis_some_function_here
...
#include "metis_source_file_here.c"

That would be a bit tedious but perhaps safest. Then I can full control over my changes to METIS, but I don't expose any libsuitesparse_metis.so to the world.

DrTimothyAldenDavis commented 2 years ago

I've posted a beta5 that fixes the FindCHOLMOD problem when metis is not being installed.

I think I will take the plunge and embed the METIS functions I need inside libcholmod.so, renamed into a "SuiteSparse_*" namespace. This will solve several problems. It will avoid the need for end users to link with -lcholmod -lsuitesparse_metis (which is new, and thus awkward). It will avoid the problem of linking with the standard libmetis.so. I can also avoid including any file IO routines from METIS, and also the gk_regex which is a replacement for the POSIX regex when using Windows. That function can cause build problems on Windows (from past experience). But the regex replacement in GKlib is only needed for file I/O, which in turn is only needed by the METIS programs. I'm not compiling the METIS programs.

I will no longer have a SuiteSparse/SuiteSparse_metis folder. Instead, I will move that folder to be inside CHOLMOD. I won't compile it directly, but I will use the same method I do for the LZ4 and ZSTD compression methods used by GraphBLAS. That package allows malloc to be redefined but only at compile time, so I can't use liblz4.

So I do the #include trick. Say there is an lz4foo function. Then I do:

#define lz4foo GB_lz4foo
...
#include "lz4.c"

Where GB_* is my internal non-user-callable namespace inside GraphBLAS. This means I can have full control over the use of malloc inside lz4, while at the same time not colliding with a liblz4.so that may already exist (but with the "wrong" malloc).

My SuiteSparse_metis folder will still be a nearly identical, full copy of metis-5.1.0. I just won't use all the files and I won't have any new Makefiles or CMakeLists.txt placed in there. That will make it easy for me (or someone else) to pull out my copy of CHOLMOD/SuiteSparse_metis and plug in a new copy.

I'll do this before releasing SuiteSparse v6.0.0, so that libsuitesparse_metis.so will never exist (except ephemerally in my beta releases).

You'll still be able to compile SuiteSparse without METIS, if you like. But perhaps with this technique, you won't have to worry about it. The end user will just see libcholmod.so, whether or not that library includes METIS or not. No need to use -lmetis at all, but you still get METIS if libcholmod.so was compiled with it.

kiwifb commented 2 years ago

I guess I'll make a note that vendored metis is slightly forked and can't be replaced. We already have a few instances of that kind of stuff in scipy (firs example that comes to mind). I will conduct a survey later today to make sure all the needed .cmake files are shipped and that the files I asked to be excluded earlier do not end being needed at runtime. A good night sleep opens your eyes to those kinds of things.

kiwifb commented 2 years ago

I was scared for nothing, those files do not have to be shipped.

kiwifb commented 2 years ago

Still have the same trouble with FindCHOLMOD in beta5. But since you are refactoring metis, I'll wait the next iteration.

DrTimothyAldenDavis commented 2 years ago

libSuiteSparse_metis.so is gone, and the functions from METIS are now all embedded into libcholmod.so.

I forgot to do the pull request for beta5 before tagging that pre-release, so it didn't have my fixes to FindCHOLMOD. This beta6 version should work. Sorry for the confusion.

kiwifb commented 2 years ago

I am getting real stuck in cholmod right now, trying to figure the options and why it is not currently doing what I expect. But for example I am confused by

    if ( NOT DEFINED NCHOLESKY )
        set ( NCHOLESKY false )
    endif ( )
    if ( NOT EXISTS ${CMAKE_SOURCE_DIR}/Cholesky )
        set ( NCHOLESKY true )
    endif ( )

The if statement seems to say "if you do not find the source directory for 'Cholesky', set 'NCHOLESKY' to TRUE - otherwise leave it unchanged". I would have expected to set NCHOLESKY to FALSE if we cannot find it. I am not even sure in what scenario you would not have the folder. You would need to pro=actively remove it before running make or cmake.

DrTimothyAldenDavis commented 2 years ago

I can add some comments.

Sometimes the end user wants delete the parts they don't need (by deleting a folder, or equivalently, use -DNCHOLESKY=1 to not compile the Cholesky module). They might be using CHOLMOD within KLU for example, where they only need the Core and Partition modules to call METIS.

That's why the default is false: that means the Cholesky module should be used by default.

kiwifb commented 2 years ago

I can understand the tinkering to get what you want. My point in that particular bit is that I expected the code to look like

    if ( NOT EXISTS ${CMAKE_SOURCE_DIR}/Cholesky )
        set ( NCHOLESKY false )
    endif ( )

and I shouldn't have included the earlier bit, that was only to help locate it. If you remove the Cholesky folder, EXISTS returns FALSE, then NOT makes it TRUE. NCHOLESKY is currently set to true if the folder is missing. I would expect it to be false if the code has been removed.

DrTimothyAldenDavis commented 2 years ago

NCHOLESKY true means "do not compile with the Cholesky module" (N for negation, like NDEBUG for assert.h). So if the folder Cholesky has been deleted (or renamed), then I set NCHOLESKY to be true, which then disables the build of the Cholesky module. The code gets -DNCHOLESKY=1 and then even if compiled, all the codes in CHOLMOD/Cholesky get turned off, with

#ifndef NCHOLESKY
... all the code here ...
#endif

I've clarified it with comments and simplified the logic a bit to make it more readable: https://github.com/DrTimothyAldenDavis/SuiteSparse/blob/dev2/CHOLMOD/CMakeLists.txt

I'll do some more testing and call it beta8. Currently that change is only on the dev2 branch.

kiwifb commented 2 years ago

Right! Negative options. I have to redo all my packages with options for that flip.

kiwifb commented 2 years ago

I'll ask another probably annoying question: I am looking at klu but there may be more instances elsewhere. We have

find_package ( CHOLMOD 4.0.0 REQUIRED )

then why do we have section like

if ( CHOLMOD_FOUND )

If you marked CHOLMOD as REQUIRED cmake just stops at the configuration stage if cholmod is not found. So, CHOLMOD_FOUND is always true if you can make it this far. The alternative is that you didn't mean to make cholmod an absolute requirement.

DrTimothyAldenDavis commented 2 years ago

Oops. I think that was a mistake -- I think I added that "REQUIRED" just for testing and didn't mean for it to sneak into the release.

kiwifb commented 2 years ago

I am about done. I currently do not a nvidia gpu handy to do the cuda stuff. So, I have to skip GPUQREngine and SuiteSparse_GPURuntime and the gpu bits of SPQR. The only item on my wish list would be a NOPENMP option to turn it off if I wanted to. Generally a switch for optional stuff in various packages would be good (instead of what we call automagic in packaging circles) but I think we can live with the current state.

DrTimothyAldenDavis commented 2 years ago

Just posted beta8. Added NOPENMP would be pretty simple; I already support the building of the packages when it's not found. I can do a beta9.

kiwifb commented 2 years ago

It would be nice to have NOPENMP in. I otherwise have all my ducks in a row for my packages, and I can easily switch between betas now. Have to include the doc for amd, camd and klu now that you have merged that little fix.

DrTimothyAldenDavis commented 2 years ago

Here you go ... NOPENMP is in place: https://github.com/DrTimothyAldenDavis/SuiteSparse/releases/tag/v6.0.0-beta9 .

With a caveat about thread safety in GraphBLAS that is carefully documented.

kiwifb commented 2 years ago

All working. At the moment I am not doing GraphBLAS as part of suitesparse. I had a go at it as a separate thing and it worked. Since it doesn't need any of the other suitesparse element I am taking its packaging with suitesparse as a convenience.

I am happy with the state of things on my side.

DrTimothyAldenDavis commented 2 years ago

Great! Thanks again for all the valuable feedback.

Yes, GraphBLAS is part of SuiteSparse, in principle, but doesn't rely on any other package, and no other package relies on it. In the future, that will likely change. But for now, your observation is correct, and it can be a separate thing.