ralna / spral

Sparse Parallel Robust Algorithms Library
https://ralna.github.io/spral/
Other
104 stars 27 forks source link

Debug SSMFE C test on Windows #162

Closed jfowkes closed 10 months ago

jfowkes commented 10 months ago

Debugging #156

amontoison commented 10 months ago

ssmfe_c failed on Mac, maybe the associated log will help us to understand the issue on Windows.

amontoison commented 10 months ago
testing ssmfe_core...
testing standard_double...ok
testing generalized_double...ok
testing largest_double...ok
testing standard_double_complex...ok
testing generalized_double_complex...ok
testing largest_double_complex...ok
0 errors
testing ssmfe_expert...
testing standard_double...ok
testing generalized_double...ok
testing standard_shift_double...ok
testing generalized_shift_double...ok
testing buckling_double...ok
testing standard_double_complex...ok
testing generalized_double_complex...ok
testing standard_shift_double_complex...ok
testing generalized_shift_double_complex...ok
testing buckling_double_complex...ok
0 errors
testing ssmfe...
testing standard_double...ok
testing generalized_double...ok
testing standard_shift_double...ok
testing generalized_shift_double...ok
testing buckling_double...7 errors
testing standard_double_complex...ok
testing generalized_double_complex...ok
testing standard_shift_double_complex...ok
testing generalized_shift_double_complex...ok
testing buckling_double_complex...ok
7 errors
=============================
Total number of errors = 7
jfowkes commented 10 months ago

The Mac failure appears transitory and unrelated to the Windows issue.

jfowkes commented 10 months ago

Getting closer to locating the source of the error:

----------------------------------- stdout -----------------------------------
testing ssmfe_core...
testing standard_double...ok
testing generalized_double...ok
testing largest_double...ok
testing standard_double_complex...

So it is the test_core_z double complex test.

amontoison commented 10 months ago

@jfowkes I suggest to use tmate :)

amontoison commented 10 months ago

I didn't find the culprit Jari :(

jfowkes commented 10 months ago

Yeah it's a mystery, looks to me like we're just running out of stack space...

amontoison commented 10 months ago

Do we have someone that can run the tests on a Windows computer?

mjacobse commented 10 months ago

I think you are right about running out of stack space. Compiling this with MSYS2 on Windows gives me exactly the behaviour that you describe here. But adding -Wl,--stack,8388608 to the linker flags for the SSMFE C Interface test executable to increase the stack space to 8MB makes all tests run and pass normally.

Does not seem great to rely on that though, getting rid of the variable-length-arrays instead would seem nice. I am not quite sure if some of the casting to a flat pointer and back to a VLA that is done is even well-defined behavior... But I think moving to dynamic memory would necessarily mean that matrix indexing would have to be done manually (unless we move the code to C++ and use something like mdspan). I tried this here https://github.com/mjacobse/spral/commit/b61dc5a6c2348647fefad0d1c9e93f1eaf35e453 and it is not very pretty... But at least it does not use the stack for the matrices at all and so it also fixes the issue for me on Windows, without any extra linker flag to increase the stack size. I can create a pull request if you like this solution. If so, should the examples in examples/C/ssmfe that basically mirror parts of the test code be changed the same way?

amontoison commented 10 months ago

Thanks @mjacobse for testing on Windows! I like this solution. If Jari also likes the solution, we should do the same modifications with examples in examples/C/ssmfe.

amontoison commented 10 months ago

@jfowkes I updated this PR to just keep the fflush.

mjacobse commented 10 months ago

I just learned that this cast https://github.com/ralna/spral/blob/c5c1d0b460557663c245673f489c8268ae25b058/tests/ssmfe/laplace2d.h#L4-L8 from a pointer to what looks like a VLA is actually not quite that. Instead, this is called "variably-modified types" and completely fine to use even with just plain malloc'd data. So using these should allow us to use the heap but still use the same matrix indexing as before, which should make the proposed change much simpler.

As far as I understand these VMTs were mandatory in C99, made optional in C11 (along with VLAs), but will be required again in C23 (with VLAs staying optional). So I think this should be pretty well supported everywhere. I will work on a solution and create a pull request.

jfowkes commented 10 months ago

@mjacobse your suggestions looks perfect, please do create a pull request :)

jfowkes commented 10 months ago

Resolved in #171