E3SM-Project / e3sm_diags

E3SM Diagnostics package
https://e3sm-project.github.io/e3sm_diags
BSD 3-Clause "New" or "Revised" License
38 stars 31 forks source link

[Bug]: Silent bug in `adjust_prs_val_units()` conditional where `if prs_val0:` will be `False` if `prs_val0` is 0 #797

Open tomvothecoder opened 5 months ago

tomvothecoder commented 5 months ago

What happened?

Overview

There is a conditional in adjust_prs_val_units() that checks if prs_val0: to perform unit adjustment on the prs axis. This is a silent bug in this conditional where unit adjustment is not being performed when prs_val0 == 0 because 0 == False in boolean form.

Related code: https://github.com/E3SM-Project/e3sm_diags/blob/43f5f74057929c4bedddfab58be63f838ee36abe/e3sm_diags/derivations/acme.py#L433-L446

Example Case

There are several derived variables that define prs_low0=0. For example, notice how CLDLOW_TAU_1.3_9.4_MISR has a prs_low0=0 below. For the dataset I was working with, this value is never used due to the silent bug. No prs axis adjustments occur and the prs low value from the dataset (-0.001) is used which results in incorrect results.

https://github.com/E3SM-Project/e3sm_diags/blob/43f5f74057929c4bedddfab58be63f838ee36abe/e3sm_diags/derivations/acme.py#L1378-L1393

What did you expect to happen? Are there are possible answers you came across?

The conditional should be updated to if prs_val0 is not None: since prs_val0 is optional. This logic is implemented correctly on the dev branch, cdat-migration-fy24.

def adjust_prs_val_units(
    prs: "FileAxis", prs_val: float, prs_val0: Optional[float]
) -> float:
    """Adjust the prs_val units based on the prs.id"""
    # COSP v2 cosp_pr in units Pa instead of hPa as in v1
    # COSP v2 cosp_htmisr in units m instead of km as in v1
    adjust_ids = {"cosp_prs": 100, "cosp_htmisr": 1000}

    if prs_val0 is not None:
        prs_val = prs_val0
        if prs.id in adjust_ids.keys() and max(prs.getData()) > 1000:
            prs_val = prs_val * adjust_ids[prs.id]

    return prs_val

Minimal Complete Verifiable Example (MVCE)

No response

Relevant log output

No response

Anything else we need to know?

How I found it

In PR #794, I found the following variables had large diffs. I stepped through the code on main and the dev branch and found that main doesn't perform prs unit adjustment while the dev branch does.

Comparing:
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/792-lat-lon/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU1.3_9.4_MISR-ANN-global_ref.nc
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU1.3_9.4_MISR-ANN-global_ref.nc
    * var_key: CLDLOW_TAU1.3_9.4_MISR

Not equal to tolerance rtol=1e-05, atol=0

Mismatched elements: 39457 / 64800 (60.9%)
Max absolute difference: 22.411116
Max relative difference: 0.6832267
 x: array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],...
 y: array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],...
Comparing:
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/792-lat-lon/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU1.3_9.4_MISR-ANN-global_test.nc
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU1.3_9.4_MISR-ANN-global_test.nc
    * var_key: CLDLOW_TAU1.3_9.4_MISR

Not equal to tolerance rtol=1e-05, atol=0

Mismatched elements: 9 / 64800 (0.0139%)
Max absolute difference: 0.0970192
Max relative difference: 0.01244658
 x: array([[ 0.056777,  0.056777,  0.056777, ...,  1.274017,  1.274017,
         1.274017],
       [ 0.207892,  0.207774,  0.207536, ...,  1.675944,  1.676576,...
 y: array([[ 0.056777,  0.056777,  0.056777, ...,  1.274017,  1.274017,
         1.274017],
       [ 0.207892,  0.207774,  0.207536, ...,  1.675944,  1.676576,...
Comparing:
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/792-lat-lon/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU1.3_9.4_MISR-JJA-global_ref.nc
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU1.3_9.4_MISR-JJA-global_ref.nc
    * var_key: CLDLOW_TAU1.3_9.4_MISR

Not equal to tolerance rtol=1e-05, atol=0

Mismatched elements: 34699 / 64800 (53.5%)
Max absolute difference: 45.429226
Max relative difference: 0.9708206
 x: array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],...
 y: array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],...

------------------------------------------------------------------------------------------------------------------------------------------------

Comparing:
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/792-lat-lon/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU1.3_MISR-ANN-global_ref.nc
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU1.3_MISR-ANN-global_ref.nc
    * var_key: CLDLOW_TAU1.3_MISR

Not equal to tolerance rtol=1e-05, atol=0

Mismatched elements: 39499 / 64800 (61%)
Max absolute difference: 37.673122
Max relative difference: 0.62295455
 x: array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],...
 y: array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],...
Comparing:
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/792-lat-lon/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU1.3_MISR-ANN-global_test.nc
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU1.3_MISR-ANN-global_test.nc
    * var_key: CLDLOW_TAU1.3_MISR

Not equal to tolerance rtol=1e-05, atol=0

Mismatched elements: 9 / 64800 (0.0139%)
Max absolute difference: 0.0970192
Max relative difference: 0.00541773
 x: array([[5.677656e-02, 5.677656e-02, 5.677656e-02, ..., 1.274017e+00,
        1.274017e+00, 1.274017e+00],
       [2.078919e-01, 2.077735e-01, 2.075364e-01, ..., 1.675944e+00,...
 y: array([[5.677656e-02, 5.677656e-02, 5.677656e-02, ..., 1.274017e+00,
        1.274017e+00, 1.274017e+00],
       [2.078919e-01, 2.077735e-01, 2.075364e-01, ..., 1.675944e+00,...
Comparing:
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/792-lat-lon/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU1.3_MISR-JJA-global_ref.nc
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU1.3_MISR-JJA-global_ref.nc
    * var_key: CLDLOW_TAU1.3_MISR

Not equal to tolerance rtol=1e-05, atol=0

Mismatched elements: 35149 / 64800 (54.2%)
Max absolute difference: 67.89603
Max relative difference: 0.9691263
 x: array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],...
 y: array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],...

------------------------------------------------------------------------------------------------------------------------------------------------

Comparing:
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/792-lat-lon/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU9.4_MISR-ANN-global_ref.nc
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU9.4_MISR-ANN-global_ref.nc
    * var_key: CLDLOW_TAU9.4_MISR

Not equal to tolerance rtol=1e-05, atol=0

Mismatched elements: 39323 / 64800 (60.7%)
Max absolute difference: 31.085188
Max relative difference: 0.96666664
 x: array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],...
 y: array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],...

Comparing:
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/792-lat-lon/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU9.4_MISR-JJA-global_ref.nc
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/lat_lon/Cloud MISR/MISRCOSP-CLDLOW_TAU9.4_MISR-JJA-global_ref.nc
    * var_key: CLDLOW_TAU9.4_MISR

Not equal to tolerance rtol=1e-05, atol=0

Mismatched elements: 33486 / 64800 (51.7%)
Max absolute difference: 63.126827
Max relative difference: 1.
 x: array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],...
 y: array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],...

Affects the following derived variables:

https://github.com/E3SM-Project/e3sm_diags/blob/43f5f74057929c4bedddfab58be63f838ee36abe/e3sm_diags/derivations/acme.py#L1362-L1409

Environment

main and all versions of e3sm_diags that has cosp_bin_sum()

tomvothecoder commented 5 months ago

These are the kinds of issues that can be easy to miss, which is why linters and type checkers (e.g., flake8 and mypy) are valuable and great to use as preventative tools.