Closed mmuetzel closed 6 months ago
This looks good to me, so I'm temporarily approving and initiating the CI runs. Before this is merged the fix should be mentioned in the CHANGELOG.md
and doc/shared/RecentChanges.rst
files.
Thank you for the review.
I added notes about that change to the two files you pointed out. Is the wording ok?
Thank you for the review.
I added notes about that change to the two files you pointed out. Is the wording ok?
Yes, the wording is OK. Thanks!
Looks like one of the tests failed for the MSVC runner in CI:
Start 80: test_profiling
80/80 Test #80: test_profiling ....................................***Failed 1.02 sec
Is there any way to get more detailed information from the runner to see what specifically failed?
@mmuetzel That test randomly fails some times if the VM is slow. I reran it, and it passed.
Could this cause problems with older versions of KLU (before 6.0.0) on Windows? I think SuiteSparse_long
was defined as long
before v6 unless _WIN64
was defined.
You are correct that prior to 6.0.0 SuiteSparse defined SuiteSparse_long
as long
unless it was on Windows (see https://github.com/DrTimothyAldenDavis/SuiteSparse/blob/v5.13.0/SuiteSparse_config/SuiteSparse_config.h). I think this could only cause a problem for compatibility with older SuiteSparse versions on platforms (other than Windows) where long
is a 32-bit type.
I think even on most Windows systems it would use long
because it depends on _WIN64
which it looks like is only defined for 64-bit ARM or x64
(see this page).
I interpret "64-bit ARM or x64" to include "x86-64" so most Windows systems would define _WIN64
. Perhaps this interpretation is incorrect?
If we want to ensure that other LLP64 and ILP32 systems can use SuiteSparse < 6 and 64-bit SUNDIALS_INDEX_SIZE together, then we can add a check of the SuiteSparse version in a preprocessor macro. I actually don't think it would be possible for an ILP32 system to run into the issue because they would probably not be using SUNDIALS_INDEX_SIZE=64, and I don't know of any non Windows LLP64 systems.
I interpret "64-bit ARM or x64" to include "x86-64" so most Windows systems would define
_WIN64
. Perhaps this interpretation is incorrect?
I'm not sure, the _WIN32
lists x86
and x64
so it wasn't clear the me what the right interpretation is.
If we want to ensure that other LLP64 and ILP32 systems can use SuiteSparse < 6 and 64-bit SUNDIALS_INDEX_SIZE together, then we can add a check of the SuiteSparse version in a preprocessor macro. I actually don't think it would be possible for an ILP32 system to run into the issue because they would probably not be using SUNDIALS_INDEX_SIZE=64, and I don't know of any non Windows LLP64 systems.
I think setting SUNDIALS_INDEX_SIZE
correctly might be the most common config/build error users run into. If we can, it would be good to give a warning that setup might be problematic.
You are right. Before SuiteSparse version 6.0.0, SuiteSparse_long
was long
apart from Windows 64-bit where it was __int64
(i.e., long long
).
That means it was 64-bit wide on (most relevant) platforms with 64-bit pointers and 32-bit wide on (most relevant) platforms with 32-bit pointers.
In newer versions of SuiteSparse SuiteSparse_long
is int64_t
on any platform.
SuiteSparse_long
is defined in old and in new versions of SuiteSparse. Maybe, KLU_INDEXTYPE
should be defined as SuiteSparse_long
?
However, converting between a pointer to a 32-bit integer array and a 64-bit integer array cannot be done (meaningfully) with a simple cast. I don't know what SUNSparseMatrix_IndexPointers
or SUNSparseMatrix_IndexValues
return. But if it is pointers to integer arrays of different types, it would probably be necessary to cast each element of the array to the different integer type instead of the entire array.
Maybe, those pointer casts should be removed entirely to allow the compiler to correctly detect if the pointers that are returned by these function don't match what sun_klu_refactor
, sun_klu_factor
or the other functions expect.
For easier reference, the code in question is, e.g., here: https://github.com/LLNL/sundials/blob/b685654c6958699821fc68a822b66076fec8cae7/src/sunlinsol/klu/sunlinsol_klu.c#L282-L284 https://github.com/LLNL/sundials/blob/b685654c6958699821fc68a822b66076fec8cae7/src/sunlinsol/klu/sunlinsol_klu.c#L296-L299
@gardner48 said:
I'm not sure, the _WIN32 lists x86 and x64 so it wasn't clear the me what the right interpretation is.
_WIN32
is defined in both cases. _WIN64
is defined only with 64-bit Windows (which is most systems these days).
@mmuetzel said:
Before SuiteSparse version 6.0.0, SuiteSparse_long was long apart from Windows 64-bit where it was __int64 (i.e., long long). That means it was 64-bit wide on (most relevant) platforms with 64-bit pointers and 32-bit wide on (most relevant) platforms with 32-bit pointers.
This is correct.
@gardner48 said:
I think setting SUNDIALS_INDEX_SIZE correctly might be the most common config/build error users run into. If we can, it would be good to give a warning that setup might be problematic.
I think that the combination of a 32-bit system, latest and greatest SUNDIALS, but a SuiteSparse that is 2 versions old (or more) is very unlikely, but for other TPLs we check that the index sizes used by the TPL and SUNDIALS match in CMake. We could do that and it would also catch the case where users have just misconfigured SUNDIALS or SuiteSparse.
@mmuetzel Would you be able to add a check in the file cmake/tpl/SundialsKLU.cmake
like we have for SuperLU_DIST and create a follow on PR to this one?
long int
is 32 bits wide on systems that use an LLP64 data model (e.g., Windows). Use a type for which the C standard guarantees that it is 64-bit instead (i.e.,int64_t
).See the error in CI for PR #432.