Exawind / nalu-wind

Solver for wind farm simulations targeting exascale computational platforms
https://nalu-wind.readthedocs.io

TestTurbulence* test failures #1273

Closed marchdf closed 4 weeks ago

marchdf commented 1 month ago

Clang 15:

./unittestX --gtest_filter="TestTurbulence*"
   Nalu-Wind Version: v2.0.0-21-g53f49395
   Nalu-Wind GIT Commit SHA: 53f493954891937db39876452af46dbf3ce3811c-DIRTY
   Trilinos Version: 15.1.1

Note: Google Test filter = TestTurbulence*
[==========] Running 3 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 3 tests from TestTurbulenceAlgorithm
[ RUN      ] TestTurbulenceAlgorithm.turbviscsmagorinskyalgorithm
[       OK ] TestTurbulenceAlgorithm.turbviscsmagorinskyalgorithm (13 ms)
[ RUN      ] TestTurbulenceAlgorithm.turbviscwalealgorithm
[       OK ] TestTurbulenceAlgorithm.turbviscwalealgorithm (12 ms)
[ RUN      ] TestTurbulenceAlgorithm.computesstmaxlengthscaleelemalgorithm
[       OK ] TestTurbulenceAlgorithm.computesstmaxlengthscaleelemalgorithm (12 ms)
[----------] 3 tests from TestTurbulenceAlgorithm (37 ms total)

[----------] Global test environment tear-down
[==========] 3 tests from 1 test suite ran. (37 ms total)
[  PASSED  ] 3 tests.

GCC 12.3:

❯ ./unittestX --gtest_filter="TestTurbulence*"
   Nalu-Wind Version: v2.0.0-21-g53f49395
   Nalu-Wind GIT Commit SHA: 53f493954891937db39876452af46dbf3ce3811c-DIRTY
   Trilinos Version: 15.1.1

Note: Google Test filter = TestTurbulence*
[==========] Running 3 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 3 tests from TestTurbulenceAlgorithm
[ RUN      ] TestTurbulenceAlgorithm.turbviscsmagorinskyalgorithm
[       OK ] TestTurbulenceAlgorithm.turbviscsmagorinskyalgorithm (33 ms)
[ RUN      ] TestTurbulenceAlgorithm.turbviscwalealgorithm
[       OK ] TestTurbulenceAlgorithm.turbviscwalealgorithm (27 ms)
[ RUN      ] TestTurbulenceAlgorithm.computesstmaxlengthscaleelemalgorithm
[       OK ] TestTurbulenceAlgorithm.computesstmaxlengthscaleelemalgorithm (27 ms)
[----------] 3 tests from TestTurbulenceAlgorithm (87 ms total)

[----------] Global test environment tear-down
[==========] 3 tests from 1 test suite ran. (89 ms total)
[  PASSED  ] 3 tests.

❯ ./unittestX --gtest_filter="TestTurbulence*"
   Nalu-Wind Version: v2.0.0-21-g53f49395
   Nalu-Wind GIT Commit SHA: 53f493954891937db39876452af46dbf3ce3811c-DIRTY
   Trilinos Version: 15.1.1

Note: Google Test filter = TestTurbulence*
[==========] Running 3 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 3 tests from TestTurbulenceAlgorithm
[ RUN      ] TestTurbulenceAlgorithm.turbviscsmagorinskyalgorithm
unknown file: Failure
C++ exception with description "Unsupported number of reference states" thrown in the test body.
[  FAILED  ] TestTurbulenceAlgorithm.turbviscsmagorinskyalgorithm (3 ms)
[ RUN      ] TestTurbulenceAlgorithm.turbviscwalealgorithm
unknown file: Failure
C++ exception with description "Unsupported number of reference states" thrown in the test body.
[  FAILED  ] TestTurbulenceAlgorithm.turbviscwalealgorithm (1 ms)
[ RUN      ] TestTurbulenceAlgorithm.computesstmaxlengthscaleelemalgorithm
unknown file: Failure
C++ exception with description "Unsupported number of reference states" thrown in the test body.
[  FAILED  ] TestTurbulenceAlgorithm.computesstmaxlengthscaleelemalgorithm (1 ms)
[----------] 3 tests from TestTurbulenceAlgorithm (5 ms total)

[----------] Global test environment tear-down
[==========] 3 tests from 1 test suite ran. (5 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 3 tests, listed below:
[  FAILED  ] TestTurbulenceAlgorithm.turbviscsmagorinskyalgorithm
[  FAILED  ] TestTurbulenceAlgorithm.turbviscwalealgorithm
[  FAILED  ] TestTurbulenceAlgorithm.computesstmaxlengthscaleelemalgorithm

 3 FAILED TESTS

GCC failure feels totally random. Clang never fails.

marchdf commented 1 month ago

GCC + Debug never seems to break of course. Otherwise it would be too easy.

marchdf commented 1 month ago

@psakievich, @tasmith4, @djglaze, @alanw0 I think this has to do with the new way we do the field. Wondering if you had thoughts. Is there something in UnitTestAlgorithm that is wrong? Maybe a missing meta.commit or something like that?

djglaze commented 1 month ago

Interesting. With the random failures that only occur on a release build for one compiler, it sure does sound like a memory corruption bug of some sort. The throw is coming from a FieldRegistry::query() call that is complaining about a request for something other than a two-stated or three-stated Field. The simple_fields redesign of nalu-wind that went in several months ago should have absolutely nothing to do with Field states. The error message should probably be beefed up to include the number of states requested and the Field name. It's hard to guess the cause without more information, and I'm unable to build Nalu-Wind here to experiment with it.
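A more verbose error along the lines suggested above could look like the following sketch. It is not the Nalu-Wind source: `query_states` is a hypothetical stand-in, and the rule that only two- or three-state fields are accepted is an assumption taken from the description in this comment.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the registry lookup; not the Nalu-Wind source.
// Assumption: only 2- or 3-state fields are supported, as described above.
inline int query_states(int numDim, int numStates, const std::string& name)
{
  if (numStates != 2 && numStates != 3) {
    // Report the field name and both requested values so a bad request
    // can be identified directly from the test log.
    throw std::runtime_error(
      "Unsupported number of reference states for field " + name + " with " +
      std::to_string(numStates) + " states and " + std::to_string(numDim) +
      " dims.");
  }
  return numStates;
}
```

With a message like this, a garbage value for the state count shows up in the failure text itself rather than requiring a debugger session.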

What version of Trilinos are you guys using these days? The last time I was involved in the project, it was a version from February 2023. We've got a nice Field data access memory debugging tool in Trilinos versions after May 2023. We found tons of errors in Sierra applications that were things like registering a Field as a scalar but then indexing into it as a vector. It wouldn't surprise me at all if there was a preexisting Field memory error in Nalu-Wind somewhere that is just getting bumped into now.
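The scalar-versus-vector mismatch mentioned above can be illustrated with a small, self-contained sketch. `CheckedField` is a hypothetical class written for this illustration; it does not mirror `stk::mesh::Field` or the Trilinos debugging tool, and only shows the class of error being described.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical field storage: numComponents values per entity, stored
// contiguously. Illustration only; this is not the STK Field API.
class CheckedField
{
public:
  CheckedField(std::size_t numEntities, std::size_t numComponents)
    : numComponents_(numComponents), data_(numEntities * numComponents, 0.0)
  {
  }

  // Bounds-checked access: rejects a component index beyond what was
  // registered for this field.
  double& at(std::size_t entity, std::size_t component)
  {
    if (component >= numComponents_) {
      throw std::out_of_range(
        "component " + std::to_string(component) +
        " requested, but field has " + std::to_string(numComponents_) +
        " component(s) per entity");
    }
    return data_.at(entity * numComponents_ + component);
  }

private:
  std::size_t numComponents_;
  std::vector<double> data_;
};
```

Without the component check, indexing a scalar field as a vector would compute `entity * 1 + component` and silently touch another entity's value, or run past the allocation for the last few entities. That kind of access can corrupt unrelated data without crashing.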

If you can run a new-enough Trilinos, you can turn it on by doing a Clang address-sanitizer build of both Trilinos and Nalu-Wind, adding the -DSTK_ASAN_FIELD_ACCESS flag to both. Afterward, just run the code and if there's a problem, it will error out with something like an "access-after-poison" error, and it will point you to the precise line of code that stepped out of bounds.

marchdf commented 1 month ago

Thanks for the thoughts. Nalu-Wind is at trilinos@15.1.1. Is that new enough for your suggestion?

djglaze commented 1 month ago

Yes, that version of Trilinos is great. It's probably a good idea for the Nalu-Wind developers to run this memory-debugging tool independently of this particular bug. It was put in place to catch memory errors that have gone unnoticed, but might start affecting code behavior when we roll out a new feature that enables variable-capacity Buckets.

This is something that we've been working on for more than a year that has the potential to significantly reduce memory usage (we've seen reductions of 50% for large, complex meshes) and also potentially improve performance by a percent or two. This is in Trilinos version 16.0.0, for when you upgrade next.

marchdf commented 1 month ago

Ok I think I got that set up alright. I am getting

[ RUN      ] TestTurbulenceAlgorithm.turbviscsmagorinskyalgorithm
=================================================================
==1705243==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffe6107c940 at pc 0x7fe1d9cccfd9 bp 0x7ffe6107c400 sp 0x7ffe6107c3f8
READ of size 1 at 0x7ffe6107c940 thread T0
    #0 0x7fe1d9cccfd8 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > YAML::Node::as<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >() const (/mnt/vdb/home/mhenryde/exawind/exawind-manager/stage/spack-stage-nalu-wind-master-fthu6sgmw6jbckd3h5narzf6rporfa22/spack-build-fthu6sg/libnalu.so+0xc86fd8)
    #1 0x7fe1d9fa0030 in sierra::nalu::operator>>(YAML::Node const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, double, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, double> > >&) (/mnt/vdb/home/mhenryde/exawind/exawind-manager/stage/spack-stage-nalu-wind-master-fthu6sgmw6jbckd3h5narzf6rporfa22/spack-build-fthu6sg/libnalu.so+0xf5a030)
    #2 0x7fe1da2437b2 in sierra::nalu::SolutionOptions::load(YAML::Node const&) (/mnt/vdb/home/mhenryde/exawind/exawind-manager/stage/spack-stage-nalu-wind-master-fthu6sgmw6jbckd3h5narzf6rporfa22/spack-build-fthu6sg/libnalu.so+0x11fd7b2)
    #3 0xa3612d in unit_test_utils::NaluTest::create_realm(YAML::Node const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool) (/mnt/vdb/home/mhenryde/exawind/exawind-manager/stage/spack-stage-nalu-wind-master-fthu6sgmw6jbckd3h5narzf6rporfa22/spack-build-fthu6sg/unittestX+0xa3612d)
    #4 0xc32ca9 in TestTurbulenceAlgorithm_turbviscsmagorinskyalgorithm_Test::TestBody() (/mnt/vdb/home/mhenryde/exawind/exawind-manager/stage/spack-stage-nalu-wind-master-fthu6sgmw6jbckd3h5narzf6rporfa22/spack-build-fthu6sg/unittestX+0xc32ca9)
    #5 0x7fe1cb4bf665 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (/mnt/vdb/home/mhenryde/exawind/exawind-manager/spack/opt/spack/linux-rocky8-zen2/gcc-12.3.0/trilinos-15.1.1-zlacxoz6o7ckufv46jghdsbejib7oeyb/lib64/libgtest.so.15+0xc2665)
    #6 0x7fe1cb48d5da in testing::Test::Run() (/mnt/vdb/home/mhenryde/exawind/exawind-manager/spack/opt/spack/linux-rocky8-zen2/gcc-12.3.0/trilinos-15.1.1-zlacxoz6o7ckufv46jghdsbejib7oeyb/lib64/libgtest.so.15+0x905da)
    #7 0x7fe1cb48dec5 in testing::TestInfo::Run() (/mnt/vdb/home/mhenryde/exawind/exawind-manager/spack/opt/spack/linux-rocky8-zen2/gcc-12.3.0/trilinos-15.1.1-zlacxoz6o7ckufv46jghdsbejib7oeyb/lib64/libgtest.so.15+0x90ec5)
    #8 0x7fe1cb48e115 in testing::TestSuite::Run() (/mnt/vdb/home/mhenryde/exawind/exawind-manager/spack/opt/spack/linux-rocky8-zen2/gcc-12.3.0/trilinos-15.1.1-zlacxoz6o7ckufv46jghdsbejib7oeyb/lib64/libgtest.so.15+0x91115)
    #9 0x7fe1cb493467 in testing::internal::UnitTestImpl::RunAllTests() (/mnt/vdb/home/mhenryde/exawind/exawind-manager/spack/opt/spack/linux-rocky8-zen2/gcc-12.3.0/trilinos-15.1.1-zlacxoz6o7ckufv46jghdsbejib7oeyb/lib64/libgtest.so.15+0x96467)
    #10 0x7fe1cb48d83d in testing::UnitTest::Run() (/mnt/vdb/home/mhenryde/exawind/exawind-manager/spack/opt/spack/linux-rocky8-zen2/gcc-12.3.0/trilinos-15.1.1-zlacxoz6o7ckufv46jghdsbejib7oeyb/lib64/libgtest.so.15+0x9083d)
    #11 0x525a69 in main (/mnt/vdb/home/mhenryde/exawind/exawind-manager/stage/spack-stage-nalu-wind-master-fthu6sgmw6jbckd3h5narzf6rporfa22/spack-build-fthu6sg/unittestX+0x525a69)
    #12 0x7fe1c9c857e4 in __libc_start_main (/lib64/libc.so.6+0x3a7e4)
    #13 0x64fdcd in _start (/mnt/vdb/home/mhenryde/exawind/exawind-manager/stage/spack-stage-nalu-wind-master-fthu6sgmw6jbckd3h5narzf6rporfa22/spack-build-fthu6sg/unittestX+0x64fdcd)

I am going to try a debug build again with this but I am not holding my breath since I don't get the failures in debug builds.

marchdf commented 1 month ago

Debug build gives this:

==1778341==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffcc3c6b1b0 at pc 0x000000b061d1 bp 0x7ffcc3c6af60 sp 0x7ffcc3c6af58
READ of size 1 at 0x7ffcc3c6b1b0 thread T0
    #0 0xb061d0 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > YAML::Node::as<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >() const /mnt/vdb/home/mhenryde/exawind/exawind-manager/spack/opt/spack/linux-rocky8-zen2/gcc-12.3.0/yaml-cpp-0.7.0-wci75yixwfevwbp676yojgdnyte2lrw7/include/yaml-cpp/node/impl.h:154
    #1 0x7f097cc4091e in sierra::nalu::operator>>(YAML::Node const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, double, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, double> > >&) /mnt/vdb/home/mhenryde/exawind/exawind-manager/environments/nalu-wind-dev/nalu-wind/src/NaluParsing.C:467
    #2 0x7f097cea5a3e in sierra::nalu::SolutionOptions::load(YAML::Node const&) /mnt/vdb/home/mhenryde/exawind/exawind-manager/environments/nalu-wind-dev/nalu-wind/src/SolutionOptions.C:300
    #3 0x9d1c85 in unit_test_utils::NaluTest::create_realm(YAML::Node const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool) /mnt/vdb/home/mhenryde/exawind/exawind-manager/environments/nalu-wind-dev/nalu-wind/unit_tests/UnitTestRealm.C:178
    #4 0xc795cd in TestAlgorithm::create_realm(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) /mnt/vdb/home/mhenryde/exawind/exawind-manager/environments/nalu-wind-dev/nalu-wind/unit_tests/algorithms/UnitTestAlgorithm.h:44
    #5 0xc6bf7a in TestTurbulenceAlgorithm_turbviscsmagorinskyalgorithm_Test::TestBody() /mnt/vdb/home/mhenryde/exawind/exawind-manager/environments/nalu-wind-dev/nalu-wind/unit_tests/algorithms/UnitTestLESAlgorithms.C:20
    #6 0x7f096d784665 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (/mnt/vdb/home/mhenryde/exawind/exawind-manager/spack/opt/spack/linux-rocky8-zen2/gcc-12.3.0/trilinos-15.1.1-zlacxoz6o7ckufv46jghdsbejib7oeyb/lib64/libgtest.so.15+0xc2665)
    #7 0x7f096d7525da in testing::Test::Run() (/mnt/vdb/home/mhenryde/exawind/exawind-manager/spack/opt/spack/linux-rocky8-zen2/gcc-12.3.0/trilinos-15.1.1-zlacxoz6o7ckufv46jghdsbejib7oeyb/lib64/libgtest.so.15+0x905da)
    #8 0x7f096d752ec5 in testing::TestInfo::Run() (/mnt/vdb/home/mhenryde/exawind/exawind-manager/spack/opt/spack/linux-rocky8-zen2/gcc-12.3.0/trilinos-15.1.1-zlacxoz6o7ckufv46jghdsbejib7oeyb/lib64/libgtest.so.15+0x90ec5)
    #9 0x7f096d753115 in testing::TestSuite::Run() (/mnt/vdb/home/mhenryde/exawind/exawind-manager/spack/opt/spack/linux-rocky8-zen2/gcc-12.3.0/trilinos-15.1.1-zlacxoz6o7ckufv46jghdsbejib7oeyb/lib64/libgtest.so.15+0x91115)
    #10 0x7f096d758467 in testing::internal::UnitTestImpl::RunAllTests() (/mnt/vdb/home/mhenryde/exawind/exawind-manager/spack/opt/spack/linux-rocky8-zen2/gcc-12.3.0/trilinos-15.1.1-zlacxoz6o7ckufv46jghdsbejib7oeyb/lib64/libgtest.so.15+0x96467)
    #11 0x7f096d75283d in testing::UnitTest::Run() (/mnt/vdb/home/mhenryde/exawind/exawind-manager/spack/opt/spack/linux-rocky8-zen2/gcc-12.3.0/trilinos-15.1.1-zlacxoz6o7ckufv46jghdsbejib7oeyb/lib64/libgtest.so.15+0x9083d)
    #12 0x6326c5 in RUN_ALL_TESTS() (/mnt/vdb/home/mhenryde/exawind/exawind-manager/stage/spack-stage-nalu-wind-master-2gfgvvihu2vkhggflhp3znw7zmwe6djg/spack-build-2gfgvvi/unittestX+0x6326c5)
    #13 0x63205f in main /mnt/vdb/home/mhenryde/exawind/exawind-manager/environments/nalu-wind-dev/nalu-wind/unit_tests.C:60
    #14 0x7f096bf4a7e4 in __libc_start_main (/lib64/libc.so.6+0x3a7e4)
    #15 0x631c9d in _start (/mnt/vdb/home/mhenryde/exawind/exawind-manager/stage/spack-stage-nalu-wind-master-2gfgvvihu2vkhggflhp3znw7zmwe6djg/spack-build-2gfgvvi/unittestX+0x631c9d)

I feel like this may be tangentially related to my issue...

marchdf commented 1 month ago

Turns out we expect yaml to fail like this. So this isn't my issue but it is preventing me from seeing my issue. I will find a way around it.

marchdf commented 1 month ago

Ok finally got it "working" but it's not showing anything. I augmented the error to be more verbose. This is what I am getting:

[==========] Running 3 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 3 tests from TestTurbulenceAlgorithm
[ RUN      ] TestTurbulenceAlgorithm.turbviscsmagorinskyalgorithm
unknown file: Failure
C++ exception with description "Unsupported number of reference states for field density with 18 states and 3 dims." thrown in the test body.
[  FAILED  ] TestTurbulenceAlgorithm.turbviscsmagorinskyalgorithm (3 ms)
[ RUN      ] TestTurbulenceAlgorithm.turbviscwalealgorithm
unknown file: Failure
C++ exception with description "Unsupported number of reference states for field density with 18 states and 3 dims." thrown in the test body.
[  FAILED  ] TestTurbulenceAlgorithm.turbviscwalealgorithm (1 ms)
[ RUN      ] TestTurbulenceAlgorithm.computesstmaxlengthscaleelemalgorithm
unknown file: Failure
C++ exception with description "Unsupported number of reference states for field density with 50 states and 3 dims." thrown in the test body.
[  FAILED  ] TestTurbulenceAlgorithm.computesstmaxlengthscaleelemalgorithm (1 ms)
[----------] 3 tests from TestTurbulenceAlgorithm (5 ms total)

Why would density have so many states?! It's like it took the total number of field components and made density have that many states?! Mystery continues.

This is what I added to the code:

throw std::runtime_error("Unsupported number of reference states for field " + name + " with " + std::to_string(numStates) + " states and " + std::to_string(numDim) + " dims.");

djglaze commented 1 month ago

Wow, 18 and 50 states? I think I can state confidently that that is unintentional. :-) This is almost certainly fallout from a memory error somewhere. Either this is the result of using uninitialized memory where a bad value gets recycled, or there is some sort of buffer overrun and the real data is getting overwritten with garbage. Address sanitizer should have picked up access outside an allocation. I don't believe it can pick up uninitialized memory usage, though. You have to use the "memory sanitizer" feature, which requires that every dependent library (including the STL) is also compiled with it, to avoid an insane number of false positives. It's a bit onerous to use.

Running a normal Debug or RelWithDebInfo build through the Valgrind tool should pick up uninitialized memory usage, as well as buffer overruns. It's insanely slow, but this is just a unit test so it should be just fine. I'd recommend that as a next attempt.

Did this failure start recently? Can the code be bisected to point to a commit where the failure started? I suppose this might not be conclusive if the cause is somewhere else, and the "problematic" commit just happened to place important information in the memory location that is getting overwritten.

Attacking the code with a debugger might also be useful. If you have access to a decent debugger, you might be able to set a "watchpoint" on the problematic memory address early in the simulation, so that you can break when it gets overwritten. That might help track down where the problem is.

marchdf commented 1 month ago

Thanks for the tips. The very annoying thing is that typically using those tools makes the error go away. It never happens with clang (either debug or not, though I should try a newer clang, I've tested with clang 15), and it doesn't happen with gcc + debug. I've only seen this in a non-debug gcc build. I can try RelWithDebInfo.

I am not sure when this started. I typically use clang and I hadn't seen it there. We only started seeing this with the new dashboard setup.

I will give some of this stuff a try.

marchdf commented 1 month ago

Ok RelWithDebInfo + valgrind:

 1 FAILED TEST
==609638==
==609638== HEAP SUMMARY:
==609638==     in use at exit: 852,051 bytes in 12,063 blocks
==609638==   total heap usage: 162,003 allocs, 149,940 frees, 1,788,869,951 bytes allocated
==609638==
==609638== LEAK SUMMARY:
==609638==    definitely lost: 0 bytes in 0 blocks
==609638==    indirectly lost: 0 bytes in 0 blocks
==609638==      possibly lost: 0 bytes in 0 blocks
==609638==    still reachable: 852,051 bytes in 12,063 blocks
==609638==         suppressed: 0 bytes in 0 blocks
==609638== Rerun with --leak-check=full to see details of leaked memory
==609638==
==609638== Use --track-origins=yes to see where uninitialised values come from
==609638== For lists of detected and suppressed errors, rerun with: -s
==609638== ERROR SUMMARY: 82 errors from 6 contexts (suppressed: 0 from 0)

with 4 of these kinds:

==609638== Conditional jump or move depends on uninitialised value(s)
==609638==    at 0x525E50: sierra::nalu::FieldRegistry::query(int, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (FieldRegistry.h:49)
==609638==    by 0x4EB1118: sierra::nalu::FieldManager::register_field(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<stk::mesh::Part*, std::allocator<stk::mesh::Part*> > const&, int, int, void const*) const (FieldManager.C:42)
==609638==    by 0x629DD6: TestTurbulenceAlgorithm::declare_fields() (UnitTestAlgorithm.C:85)
==609638==    by 0x62890A: TestAlgorithm::fill_mesh(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (UnitTestAlgorithm.C:22)
==609638==    by 0x628AF4: TestTurbulenceAlgorithm::fill_mesh_and_init_fields(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (UnitTestAlgorithm.C:100)
==609638==    by 0x62A9CE: TestTurbulenceAlgorithm_turbviscsmagorinskyalgorithm_Test::TestBody() (UnitTestLESAlgorithms.C:22)
==609638==    by 0xB3DA806: HandleSehExceptionsInMethodIfSupported<testing::Test, void> (gtest-all.cc:3914)
==609638==    by 0xB3DA806: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (gtest-all.cc:3950)
==609638==    by 0xB3CB695: Run (gtest-all.cc:3989)
==609638==    by 0xB3CB695: testing::Test::Run() (gtest-all.cc:3979)
==609638==    by 0xB3CB9A4: Run (gtest-all.cc:4165)
==609638==    by 0xB3CB9A4: testing::TestInfo::Run() (gtest-all.cc:4138)
==609638==    by 0xB3CBA93: Run (gtest-all.cc:4297)
==609638==    by 0xB3CBA93: testing::TestSuite::Run() (gtest-all.cc:4276)
==609638==    by 0xB3D0482: testing::internal::UnitTestImpl::RunAllTests() (gtest-all.cc:6819)
==609638==    by 0xB3DAD16: HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (gtest-all.cc:3914)
==609638==    by 0xB3DAD16: bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (gtest-all.cc:3950)
==609638==

and then 2 of these:

==609638== Use of uninitialised value of size 8
==609638==    at 0x52540F: __to_chars_10_impl<unsigned int> (charconv.h:96)
==609638==    by 0x52540F: std::__cxx11::to_string(int) (basic_string.h:4029)
==609638==    by 0x5262FE: sierra::nalu::FieldRegistry::query(int, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (FieldRegistry.h:59)
==609638==    by 0x4EB1118: sierra::nalu::FieldManager::register_field(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<stk::mesh::Part*, std::allocator<stk::mesh::Part*> > const&, int, int, void const*) const (FieldManager.C:42)
==609638==    by 0x629DD6: TestTurbulenceAlgorithm::declare_fields() (UnitTestAlgorithm.C:85)
==609638==    by 0x62890A: TestAlgorithm::fill_mesh(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (UnitTestAlgorithm.C:22)
==609638==    by 0x628AF4: TestTurbulenceAlgorithm::fill_mesh_and_init_fields(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (UnitTestAlgorithm.C:100)
==609638==    by 0x62A9CE: TestTurbulenceAlgorithm_turbviscsmagorinskyalgorithm_Test::TestBody() (UnitTestLESAlgorithms.C:22)
==609638==    by 0xB3DA806: HandleSehExceptionsInMethodIfSupported<testing::Test, void> (gtest-all.cc:3914)
==609638==    by 0xB3DA806: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (gtest-all.cc:3950)
==609638==    by 0xB3CB695: Run (gtest-all.cc:3989)
==609638==    by 0xB3CB695: testing::Test::Run() (gtest-all.cc:3979)
==609638==    by 0xB3CB9A4: Run (gtest-all.cc:4165)
==609638==    by 0xB3CB9A4: testing::TestInfo::Run() (gtest-all.cc:4138)
==609638==    by 0xB3CBA93: Run (gtest-all.cc:4297)
==609638==    by 0xB3CBA93: testing::TestSuite::Run() (gtest-all.cc:4276)
==609638==    by 0xB3D0482: testing::internal::UnitTestImpl::RunAllTests() (gtest-all.cc:6819)
==609638==

marchdf commented 1 month ago

Basically, the above indicates that numStates is not initialized in static FieldDefTypes query(int numDim, int numStates, std::string name). Why would that be?

marchdf commented 1 month ago

My thinking now is that

sierra::nalu::FieldPointerTypes new_field =
      realm_->fieldManager_->register_field(name, universal, numStates);

has the wrong call signature in UnitTestAlgorithm.C

marchdf commented 1 month ago

Scratch that. Here's what's going on, I think:

So what's up with that? Should the call be auto definition = FieldRegistry::query(numDimensions, numStates, name);? Or maybe the constructor of FieldManager isn't taking in the right args? Why is numStates not initialized?

djglaze commented 1 month ago

I think your diagnosis and questions are good. Unfortunately, I'm not sure I can answer them. It does seem odd that FieldManager::register_field uses the member numStates_ for that query instead of the passed-in numStates, but I can't claim to understand the logic of what it's doing there. Either way, numStates_ looks like it's initialized in the constructor of FieldManager, so I don't know why that's lighting up as uninitialized.
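The difference between querying with a stored member and querying with the passed-in argument can be sketched as follows. All names here (`QueryResult`, `query`, `SketchFieldManager`, and both `register_with_*` methods) are hypothetical and are not copied from Nalu-Wind's FieldManager or FieldRegistry. The sketch only shows why the two choices can disagree.

```cpp
#include <string>

// Hypothetical stand-ins; names and signatures are illustrative only.
struct QueryResult
{
  int numDim;
  int numStates;
  std::string name;
};

// Echoes back what it was asked for, so the caller's choice is observable.
inline QueryResult query(int numDim, int numStates, const std::string& name)
{
  return QueryResult{numDim, numStates, name};
}

class SketchFieldManager
{
public:
  SketchFieldManager(int numDim, int numStates)
    : numDim_(numDim), numStates_(numStates)
  {
  }

  // Variant A: ignores the argument and queries with the stored member.
  QueryResult
  register_with_member(const std::string& name, int /*numStates*/) const
  {
    return query(numDim_, numStates_, name);
  }

  // Variant B: forwards the caller's value to the query.
  QueryResult
  register_with_argument(const std::string& name, int numStates) const
  {
    return query(numDim_, numStates, name);
  }

private:
  const int numDim_;
  const int numStates_;
};
```

In variant A the result depends entirely on the manager object holding a valid member value. If that object were somehow constructed or accessed incorrectly, only variant A would pass garbage to the query. Whether that is what happens in the failing tests is not established in this thread.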

Perhaps @psakievich has some insight here. I think this is mostly his code.