eagles-project / mam4xx

A C++ implementation of MAM4
https://eagles-project.github.io/mam4xx/

conversions_unit_tests fail in single-precision builds on PNNL CI machine(s) #93

Closed (jeff-cohere closed this issue 1 year ago)

jeff-cohere commented 1 year ago

Briefly:

The PNNL CI pipeline set up by @CameronRutherford in PR #64 seems to be working properly, but we have a failing test when building with single precision. We decided to merge the PR and log this test failure so we could benefit from having a GPU machine in our automated test setup.

Here's the error (with CI kibble removed):

...
/people/svceagles/gitlab/3430/haero_Debug_single/src/tests/conversions_unit_tests.cpp:23: FAILED:
due to unexpected exception with message:
/people/svceagles/gitlab/3430/haero_Debug_single/src/tests/atmosphere_utils.cpp:50: FAIL: FloatingPoint<Real>::equiv( psum, p0, std::numeric_limits<float>::epsilon())
...

Evidently in this environment, psum and p0 are not equivalent within machine precision in a single-precision build. From @jaelynlitz:

This is what the atmosphere_utils test is putting out in single precision GPU Debug mode:

psum = 99999.992188
p0 = 100000.000000

And the tolerance needed for the test to pass:

tol = 65550 * std::numeric_limits<float>::epsilon();
epsilon = 0.000000
tol = 0.007814
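For context on those numbers: near 100000, adjacent single-precision values are 2^-7 = 0.0078125 apart, so the reported psum and p0 appear to differ by exactly one unit in the last place. The sketch below illustrates why an absolute tolerance of epsilon cannot pass at that magnitude while a relative one can. The helpers equiv_abs, equiv_rel, and ulp_above are hypothetical names made up for this illustration. They are not the FloatingPoint<Real>::equiv implementation, whose comparison is only assumed here to be absolute, based on the tolerance quoted above.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (not part of the project): absolute comparison,
// assumed to resemble what the failing check effectively does.
inline bool equiv_abs(float a, float b, float tol) {
  return std::fabs(a - b) < tol;
}

// Hypothetical helper: relative comparison, scaled by the larger magnitude
// so the tolerance grows with the size of the values being compared.
inline bool equiv_rel(float a, float b, float rel_tol) {
  assert(rel_tol >= 0.0f);
  const float scale = std::fmax(std::fabs(a), std::fabs(b));
  return std::fabs(a - b) <= rel_tol * scale;
}

// Gap between x and the next representable float above it (one ULP).
inline float ulp_above(float x) {
  return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}
```

With p0 = 100000.0f and psum taken as the next float below p0, equiv_abs with a tolerance of epsilon fails, equiv_abs with 65550 * epsilon passes, and equiv_rel with epsilon passes. Whether a relative tolerance is appropriate for this particular test is a judgment call for the maintainers.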

See #64 (in particular, click on the red X for the failed pipeline at the bottom of the page) for more details.

pressel commented 1 year ago

Thanks, @jeff-cohere!

jeff-cohere commented 1 year ago

I've implemented a temporary workaround (#97) that allows the CI setup to finish successfully while we address this issue.