GEOS-ESM / NDSL

NOAA NASA Domain Specific Language middleware layer

Enhance the "metric" calculation for Translate test at f32 #56

Closed FlorianDeconinck closed 1 day ago

FlorianDeconinck commented 1 month ago

Physics parameterizations in GEOS all run at 32-bit precision. The original Translate structure, and its metric calculation to judge whether an error is small enough for the test to pass, were designed for 64-bit float code.

The metric looks at the order of magnitude of the error, normalized to the value. E.g.:

The actual calculation is in ndsl/testing/comparison.py

This method allows a semblance of normalization across different amplitudes but suffers for very small values. While this issue was less prominent at 64-bit, it becomes a real problem at 32-bit. Moreover, the physics deals with very small variations in amplitude, where a single fused multiply-add could change results from one architecture to the next.
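To make the small-value problem concrete, here is a minimal sketch. It assumes a symmetric relative difference of the form 2|c - r| / (|c| + |r|); that form and the name `relative_metric` are assumptions for illustration only, the real calculation is the one in ndsl/testing/comparison.py.

```python
import numpy as np


def relative_metric(computed, reference):
    """Hypothetical symmetric relative difference: 2|c - r| / (|c| + |r|)."""
    computed = np.asarray(computed, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    denom = np.abs(computed) + np.abs(reference)
    diff = 2.0 * np.abs(computed - reference)
    # Report 0 where both values are exactly 0 instead of dividing by zero.
    return np.divide(diff, denom, out=np.zeros_like(denom), where=denom != 0)


# The same absolute difference (1e-13) applied at two very different amplitudes.
reference = np.array([1.0, 1e-12])
computed = reference + 1e-13
metric = relative_metric(computed, reference)
print(metric)  # about 1e-13 for the large value, about 0.095 for the small one
```

With a normalized metric of this kind, an absolute difference that is negligible next to 1.0 reads as a near 10% error next to 1e-12, which is the regime the physics fields often sit in.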

We need to come up with a series of checks to distinguish true error from f32 noise.
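One possible shape for such a series of checks is sketched below, purely as an illustration. The function name, the three modes (absolute, relative, ULP distance), the default thresholds, and the choice to accept a value when any mode accepts it are all assumptions, not the NDSL implementation.

```python
import numpy as np


def passes_any_check(computed, reference, abs_eps=1e-10, rel_fraction=1e-6, max_ulp=2.0):
    """Hypothetical per-element check: accept if any of three modes accepts."""
    computed = np.asarray(computed, dtype=np.float32)
    reference = np.asarray(reference, dtype=np.float32)
    # Take the difference in float64 so the comparison itself adds no f32 rounding.
    abs_err = np.abs(computed.astype(np.float64) - reference.astype(np.float64))

    # Mode 1: absolute error, useful for values that are effectively zero.
    abs_ok = abs_err <= abs_eps
    # Mode 2: error as a fraction of the reference amplitude.
    rel_ok = abs_err <= rel_fraction * np.abs(reference)
    # Mode 3: error measured in units of the float32 spacing at the reference.
    ulp = np.spacing(np.abs(reference)).astype(np.float64)
    ulp_ok = abs_err <= max_ulp * ulp

    return abs_ok | rel_ok | ulp_ok


reference = np.array([1.0, 1e-20, 1.0], dtype=np.float32)
computed = np.array(
    [np.nextafter(np.float32(1.0), np.float32(2.0)), 1.1e-20, 1.01],
    dtype=np.float32,
)
result = passes_any_check(computed, reference)
print(result)
```

Whether the modes should be combined with "any", "all", or some mix is a design choice; the sketch only shows that each mode covers a regime where the others are uninformative.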

Parent: https://github.com/GEOS-ESM/SMT-Nebulae/issues/41


FlorianDeconinck commented 1 month ago

Relative error at f32 should never be more than 1e-8/1e-9, so we could already switch our threshold depending on the precision.

A better scientific noise/signal check would be to perturb the inputs on a validating CPU. We would then get a real distribution of the potential errors, and we could check the GPU results against that distribution.
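A minimal sketch of that perturbation idea, under stated assumptions: `toy_kernel` stands in for a physics routine, inputs are nudged by at most one float32 ULP, and the per-element maximum deviation over the ensemble is used as the noise envelope. All names and parameters here are hypothetical.

```python
import numpy as np


def toy_kernel(x):
    """Hypothetical stand-in for a physics routine, evaluated in float32."""
    return np.exp(x) - np.float32(1.0)


def perturbation_envelope(kernel, inputs, n_samples=32, ulps=1, seed=0):
    """Per-element max deviation of kernel output under tiny input perturbations."""
    rng = np.random.default_rng(seed)
    inputs = np.asarray(inputs, dtype=np.float32)
    baseline = kernel(inputs).astype(np.float64)
    deviations = []
    for _ in range(n_samples):
        # Move each input by -ulps..+ulps units in the last place, chosen at random.
        step = rng.integers(-ulps, ulps + 1, size=inputs.shape).astype(np.float32)
        perturbed = inputs + step * np.spacing(inputs)
        deviations.append(np.abs(kernel(perturbed).astype(np.float64) - baseline))
    return np.max(deviations, axis=0)


inputs = np.linspace(1e-6, 1.0, 8, dtype=np.float32)
baseline = toy_kernel(inputs)
envelope = perturbation_envelope(toy_kernel, inputs)

# A candidate result (for example from another architecture) would be accepted
# where its deviation from the baseline stays inside the envelope.
candidate = baseline.copy()
within_noise = np.abs(candidate.astype(np.float64) - baseline) <= envelope
print(envelope, within_noise)
```

A real check would likely compare against the full distribution (for instance a high percentile) rather than the raw maximum, and would need enough samples for the envelope to be stable.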

FlorianDeconinck commented 3 weeks ago

Branch: https://github.com/NOAA-GFDL/NDSL/tree/feature/translate_test_f32

FlorianDeconinck commented 2 weeks ago

PR bringing a multi-modal metric: https://github.com/NOAA-GFDL/NDSL/pull/67

FlorianDeconinck commented 1 day ago

Merged and available via --multimodal_metric when running pytest-based translate tests.