E3SM-Project / scream

Fork of E3SM used to develop exascale global atmosphere model written in C++
https://e3sm-project.github.io/scream/
Other
80 stars 55 forks source link

DEBUG build required for tests to pass #525

Closed pbosler closed 3 years ago

pbosler commented 4 years ago

2 programs, which account for 17 tests (the shoc test below is run with OMP_NUM_THREADS=1...16) fail with CMAKE_BUILD_TYPE=RELEASE and CMAKE_BUILD_TYPE=RelWithDebInfo, but they pass with CMAKE_BUILD_TYPE=DEBUG. Same results with both gcc6.5 and gcc7.3.

Program 1: ekat/tests/utils/debug_tools.cpp, error from LastTest.log

/ascldap/users/pabosle/scream/externals/ekat/tests/utils/debug_tools_tests.cpp:150: FAILED:
  REQUIRE( run_fpe_tests () == num_expected_fpes )
with expansion:
  0 == 1

*) testing user-disabled fpes...
 tests mask: 57
   has FE_DIVBYZERO: 0
   has FE_INVALID:   1
   has FE_OVERFLOW:  1
  - Invalid arg threw.
  - Overflow threw.
-------------------------------------------------------------------------------
fpes
  user-requested-fpes
-------------------------------------------------------------------------------
/ascldap/users/pabosle/scream/externals/ekat/tests/utils/debug_tools_tests.cpp:159
...............................................................................

/ascldap/users/pabosle/scream/externals/ekat/tests/utils/debug_tools_tests.cpp:168: FAILED:
  REQUIRE( run_fpe_tests () == num_expected_fpes )
with expansion:
  2 == 3

Program 2: scream/src/physics/tests/shoc_varorcovar.cpp, error from LastTest.log

-------------------------------------------------------------------------------
shoc_varorcovar_property
-------------------------------------------------------------------------------
/ascldap/users/pabosle/scream/components/scream/src/physics/shoc/tests/shoc_varorcovar_tests.cpp:281
...............................................................................

/ascldap/users/pabosle/scream/components/scream/src/physics/shoc/tests/shoc_varorcovar_tests.cpp:205: FAILED:
  REQUIRE( SDS.varorcovar[offset] < 0.0 )
with expansion:
  100.0 < 0.0

===============================================================================
test cases:   48 |   47 passed | 1 failed
assertions: 2981 | 2980 passed | 1 failed

KokkosP: Kernel timing written to /home/pabosle/scream/components/scream/build_master/src/physics/shoc/tests/s1024454-24861.dat 
<end of output>
Test time =   0.22 sec
----------------------------------------------------------
Test Failed.
"shoc_tests_ut_np1_omp1" end time: Aug 25 16:38 MDT
"shoc_tests_ut_np1_omp1" time elapsed: 00:00:00
----------------------------------------------------------
bartgol commented 4 years ago

We currently do not support unit tests in release builds. The debug_tools test might fail cause some optimization cause the fpe to not be thrown. The other one is probably failing because of some non bfb optimizations? I'm not sure.

Are the other unit tests all passing? That's quite remarkable. If so, we may wonder whether we want to have a release build in our PR or Nightly testing. @jgfouca ?

Edit: by "not support" I mean that we do not run any test in either our nightly or PR jenkins jobs. But it is something we should probably consider (even if such build had fewer tests).

pbosler commented 4 years ago

Yes, these are the only 17 tests that fail in a release build (openmp).

bartgol commented 4 years ago

@pbosler did you try also a build with only Serial as enabled Kokkos backend?

pbosler commented 4 years ago

With serial, only the ekat debug_tools test fails. The shoc_varorcovar.cpp tests pass.

edit: with CMAKE_BUILD_TYPE=RELEASE

PeterCaldwell commented 3 years ago

@bartgol - this looks like an accute issue which must have gotten fixed. Can we close?

bartgol commented 3 years ago

I don't know. I think tests might still fail in non-debug builds. Perhaps we should add some logic to disable tests in non-debug builds? @jgfouca what do you think?

jgfouca commented 3 years ago

@bartgol , yes, I think we should probably expect ctest to work for a release build. I will look into it. This change will be necessary if we ever add optimized builds to test-all-scream.