parthenon-hpc-lab / parthenon

Parthenon AMR infrastructure
https://parthenon-hpc-lab.github.io/parthenon/

test failures with UndefinedBehaviorSanitizer enabled #1107

Closed BenWibking closed 2 weeks ago

BenWibking commented 3 weeks ago

When running with -DENABLE_ASAN=ON (https://github.com/parthenon-hpc-lab/parthenon/pull/1106), the following tests fail:

The following tests FAILED:
     30 - NaN payload tagging (Subprocess aborted)
     60 - regression_test:restart (Failed)
     61 - regression_mpi_test:restart (Failed)
     62 - regression_test:calculate_pi (Failed)
     63 - regression_mpi_test:calculate_pi (Failed)
     64 - regression_test:advection_convergence (Failed)
     65 - regression_mpi_test:advection_convergence (Failed)
     66 - regression_test:output_hdf5 (Failed)
     67 - regression_mpi_test:output_hdf5 (Failed)
     68 - regression_test:advection_outflow (Failed)
     69 - regression_mpi_test:advection_outflow (Failed)
     72 - regression_test:poisson (Failed)
     73 - regression_mpi_test:poisson (Failed)
     74 - regression_test:poisson_gmg (Failed)
     75 - regression_mpi_test:poisson_gmg (Failed)
     76 - regression_test:sparse_advection (Failed)
     77 - regression_mpi_test:sparse_advection (Failed)

Full output log: asan_ctest_output.txt

BenWibking commented 3 weeks ago

All of the errors appear to be undefined behavior in C++:

/Users/benwibking/parthenon/tst/unit/test_nan_tags.cpp:55:21: runtime error: division by zero
/Users/benwibking/parthenon/src/mesh/forest/logical_location.cpp:121:24: runtime error: left shift of negative value -1
/Users/benwibking/parthenon/example/poisson/poisson_package.cpp:299:52: runtime error: division by zero
/Users/benwibking/parthenon/src/solvers/mg_solver.hpp:249:44: runtime error: division by zero

BenWibking commented 3 weeks ago

Division by zero errors:
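The flagged lines themselves are not reproduced here. As a rough illustration of how this class of report is usually silenced, here is a minimal sketch that guards the denominator and, where a NaN is wanted on purpose, asks the standard library for one instead of computing it with a division. SafeDivide and QuietNaN are hypothetical helper names, not Parthenon functions, and it is an assumption that the flagged divisions are floating-point.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (not part of Parthenon): return `fallback` instead of
// dividing when the denominator is exactly zero.
inline double SafeDivide(double numer, double denom, double fallback) {
  return denom == 0.0 ? fallback : numer / denom;
}

// Hypothetical helper: obtain a quiet NaN without evaluating a division by
// zero, which some sanitizer configurations flag at runtime.
inline double QuietNaN() {
  static_assert(std::numeric_limits<double>::has_quiet_NaN,
                "double must support quiet NaN");
  const double nan_value = std::numeric_limits<double>::quiet_NaN();
  assert(std::isnan(nan_value));
  return nan_value;
}
```

Whether a fallback value or an explicit error is appropriate depends on what the calling code expects when the denominator vanishes.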

BenWibking commented 3 weeks ago

"Left shift of negative value -1" error:

lroberts36 commented 3 weeks ago

"Left shift of negative value -1" error:

* https://github.com/parthenon-hpc-lab/parthenon/blob/ff02625bd00dc81e2702889041098f1f05522f2b/src/mesh/forest/logical_location.cpp#L121

Should have a PR up for this one in a minute.
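For context: before C++20, left-shifting a negative signed integer is undefined behavior, which is what the sanitizer reports. One common workaround, sketched below, is to shift the unsigned bit pattern and convert back. This is only an illustration and not necessarily what the PR does; ShiftLeftSigned is a hypothetical name, and the 64-bit width is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the actual Parthenon fix): left-shift a possibly
// negative value without invoking signed-shift undefined behavior.
inline std::int64_t ShiftLeftSigned(std::int64_t value, int bits) {
  // Shifting by a negative amount or by >= the bit width is still undefined.
  assert(bits >= 0 && bits < 64);
  // Unsigned shifts are well defined and wrap modulo 2^64.
  const std::uint64_t shifted = static_cast<std::uint64_t>(value) << bits;
  // Converting back is implementation-defined before C++20 and modular from
  // C++20 on; on two's-complement targets it yields the expected value.
  return static_cast<std::int64_t>(shifted);
}
```

On a two's-complement platform, ShiftLeftSigned(-1, 3) gives -8, the same result the signed shift produces in practice, but without the sanitizer report.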

lroberts36 commented 3 weeks ago

Interestingly, when I ran with -DENABLE_ASAN=On on a Darwin Skylake-Gold node it also found a shared_ptr cycle. Should have a fix for that soon as well.
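For readers unfamiliar with the problem: two objects that own each other through std::shared_ptr never reach a reference count of zero, so neither is freed. The usual remedy is to make one direction a non-owning std::weak_ptr. The sketch below shows the pattern with hypothetical Parent and Child types; it does not describe the actual Parthenon classes involved.

```cpp
#include <cassert>
#include <memory>

struct Child;

// Hypothetical types for illustration only.
struct Parent {
  std::shared_ptr<Child> child;  // owning reference
};

struct Child {
  std::weak_ptr<Parent> parent;  // non-owning back reference breaks the cycle
};

// Link a parent and child, then report how many owners the parent has.
inline long ParentUseCountAfterLinking() {
  auto parent = std::make_shared<Parent>();
  parent->child = std::make_shared<Child>();
  parent->child->parent = parent;  // weak_ptr does not bump the use count
  assert(!parent->child->parent.expired());
  return parent.use_count();
}
```

Because the back reference is weak, the parent has a single owner, and both objects are destroyed when that owner goes out of scope. With a std::shared_ptr back reference the count would be 2 and both objects would leak.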