Open jmstone opened 1 year ago
In GitLab by @jfields7 on Nov 1, 2023, 15:23
Are any of the errors invalid memory accesses? If so, I wonder if this is related to the scratch memory issues that crept into Kokkos during the 4.1.0 release.
In GitLab by @jmstone216 on Nov 3, 2023, 09:18
The code runs without crashing; the problem is that the L1 error in the test is too large, so the test fails by the criteria we have set. Currently I have no idea why the error is not completely deterministic, and that is certainly worrying.
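For anyone following along, here is a minimal sketch of the kind of check being described: an L1 error compared against a fixed tolerance, plus the run-to-run spread that shows the non-determinism. The function names, the normalization (mean absolute difference), and the sample numbers are all assumptions for illustration; the actual criteria used by the gr_monopole test are not given in this thread.

```python
def l1_error(numerical, reference):
    """Mean absolute difference between two equally sized sequences.

    Assumption: the real test may normalize its L1 error differently.
    """
    if len(numerical) != len(reference):
        raise ValueError("sequences must have the same length")
    if not numerical:
        raise ValueError("sequences must not be empty")
    return sum(abs(n - r) for n, r in zip(numerical, reference)) / len(numerical)


def passes(numerical, reference, tolerance):
    """True if the L1 error is within the (hypothetical) tolerance."""
    return l1_error(numerical, reference) <= tolerance


def error_spread(errors):
    """Difference between the largest and smallest error over repeated runs.

    A fully deterministic code gives a spread of exactly 0.0.
    """
    return max(errors) - min(errors)


# Made-up data: a reference solution and one numerical result.
reference = [1.0, 2.0]
numerical = [1.5, 2.5]
print(l1_error(numerical, reference))
print(passes(numerical, reference, tolerance=0.1))
```

Recording `error_spread` over a batch of identical runs would make it easier to tell a test that sits near the tolerance from one whose result actually changes between runs.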
In GitLab by @jfields7 on Jun 13, 2024, 17:51
I ran into this issue while preparing !166 and spent some time looking at it. Here's a summary of what I've found so far:
- `master` and `z4c-matter-rebase`, though changes in the latter are mostly independent of the standard AthenaK GRMHD solver.
- `-fsanitize=address` and `-O2g` instead of `-O3` produces different results. Since `-O3` shouldn't enable unsafe mathematical operations, the issue is probably not directly related to optimization flags. This may suggest a memory issue, but it's probably indirect because neither Kokkos's optional bounds checking nor AddressSanitizer catches it.
- `-fPIC` and `-O3` provides results consistent with `-O3` alone.
My best guess at this point is that we're looking for a memory issue somewhere, but it's something very subtle.
In GitLab by @jmstone216 on Aug 16, 2023, 13:54
When merging the bump to Kokkos 4.1.0, it was noticed that the gr_monopole test in CI fails infrequently and in unpredictable ways. Is this a race condition? Something else? This needs to be explored further, but it does not seem to be a fatal issue at the moment.
A possibly related issue is that more CI tests are needed. We should add regression tests for bitwise compatibility, and for all the new physics and features that have been added lately.
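As a rough illustration of what a bitwise-compatibility check could look like, the sketch below hashes the raw IEEE-754 bytes of two result arrays and compares the digests. The helper names are hypothetical and this is not tied to any existing AthenaK test script; it only shows why a tolerance-based comparison and a bitwise comparison can disagree.

```python
import hashlib
import struct


def bitwise_digest(values):
    """SHA-256 digest of the values packed as little-endian doubles."""
    packed = struct.pack(f"<{len(values)}d", *values)
    return hashlib.sha256(packed).hexdigest()


def bitwise_equal(a, b):
    """True only if both sequences have identical bit patterns."""
    return len(a) == len(b) and bitwise_digest(a) == bitwise_digest(b)


# The same three numbers summed in two different orders. Floating-point
# addition is not associative, so the results agree to rounding error
# but not bit for bit.
run_a = [(0.1 + 0.2) + 0.3]
run_b = [0.1 + (0.2 + 0.3)]

print(abs(run_a[0] - run_b[0]) < 1e-12)  # within tolerance
print(bitwise_equal(run_a, run_b))       # not bitwise identical
```

A check like this is much stricter than an L1-error threshold, so it would flag any change in operation order (for example from a reduction whose order varies between runs), not only changes large enough to fail the existing criteria.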