NOAA-EMC / JEDI-T2O

JEDI Transition to Operations activities.
GNU Lesser General Public License v2.1
5 stars 4 forks source link

Aircraft UFO evaluation #92

Open ADCollard opened 1 year ago

ADCollard commented 1 year ago

This issue is splitting from https://github.com/NOAA-EMC/JEDI-T2O/issues/60 so the sonde and aircraft validation does not get confused.

The starting point for this is the GMAO YAML, but there are many issues with these it appears.

The test YAMLs used here are in this branch.

ADCollard commented 1 year ago

Initial validation does not look good. HofX is OK except for temperature where the bias correction is not being applied properly it seems. But the observation errors are a complete mess.

hofxdiff_vs_pressure_diag_aircraft_2021080100_Aircraft_airTemperature hofxdiff_vs_pressure_diag_aircraft_2021080100_Aircraft_specificHumidity hofxdiff_vs_pressure_diag_aircraft_2021080100_Aircraft_windEastward hofxdiff_vs_pressure_diag_aircraft_2021080100_Aircraft_windNorthward errordiff_vs_pressure_diag_aircraft_2021080100_Aircraft_airTemperature errordiff_vs_pressure_diag_aircraft_2021080100_Aircraft_specificHumidity errordiff_vs_pressure_diag_aircraft_2021080100_Aircraft_windEastward errordiff_vs_pressure_diag_aircraft_2021080100_Aircraft_windNorthward

BrettHoover-NOAA commented 1 year ago

The issue with reassigning the value of hofx/airTemperature to a bias-corrected value is extremely picky. If I do the Variable Assignment to a new variable HofXBC/airTemperature, the value of hofx/airTemperature remains not-bias-corrected but the new variable is bias-corrected. If I try to bias-correct hofx/airTemperature directly and write the result of the LinearCombination function to the variable, no result (remains not-bias-corrected). If I do the bias-correction to HofXBC/airTemperature and then reassign hofx/airTemperature to the bias-corrected values using another Variable Asssignment filter:

# Use ObsFunction/LinearCombination to combine ObsValue/airTemperature with the
# BiasCorrectionPredictor/<predictor> variables to produce a bias-corrected
# HofX
- filter: Variable Assignment
  assignments:
  - name: HofXBC/airTemperature
    type: float
    function:
      name: ObsFunction/LinearCombination
      options:
        variables: [HofX/airTemperature, BiasCorrectionTerm/constantPredictor, BiasCorrectionTerm/ascentPredictor, BiasCorrectionTerm/ascentSquaredPredictor]
        coefs: [1.0, 1.0, 1.0, 1.0]

- filter: Variable Assignment
  assignments:
  - name: hofx/airTemperature
    type: float
    source variable: HofXBC/airTemperature

The diag-file produces a bias-corrected HofXBC/airTemperature but hofx/airTemperature remains not-bias-corrected.

BrettHoover-NOAA commented 10 months ago

I was able to get UFO aircraft airTemperature acceptance to match GSI using the following method:

  1. The ObsError/airTemperature value is first directly borrowed from GSI using ObsFunction/LinearOperator in a Variable Assignment filter:
    # Make a copy of GsiAdjustObsError called GsiAdjustObsError2, for storing inflated error from ObsErrorFactorPressureCheck
    - filter: Variable Assignment
    assignments:
    - name: GsiAdjustObsError2/airTemperature
    type: float  # type must be specified if the variable doesn't already exist
    source variable: GsiAdjustObsError/airTemperature
    # Reassign obs error to copy of GsiAdjustObsError
    - filter: Perform Action
    filter variables:
    - name: airTemperature
    action:
    name: assign error
    error function: 
      name: ObsFunction/LinearCombination
      options:
        variables: [GsiAdjustObsError2/airTemperature]
        coefs: [1.0]

    There are two steps in the YAML here, one to assign a new variable GsiAdjustObsError2/airTemperature and then use it to assign ObsError/airTemperature. Ultimately this is probably not explicitly necessary and we could probably assign GsiAdjustObsError/airTemperature directly, but for full disclosure this is the current version of my YAML code.

It is also worth recognizing that this is a shortcut that will have to eventually be replaced with YAML filters that can properly assign the "adjust" error values without relying on GSI data.

  1. The ObsError/airTemperature is inflated based on multiple criteria: ObsFunction/ObsErrorFactorPressureCheck is used to inflate errors based on the observation being close to the top or bottom level of the model, and an error inflation of factor 1.2 is applied for any observation with all zero-values for bias-correction coefficients (in offline bias correction, this will apply equally to observations with all zero-values in the corresponding GSI bias correction coefficient file, and observations that do not appear in the GSI bias correction coefficient file that are automatically assigned zero values for all coefficients by ObsFunction/DrawValueFromFile):
    
    # Inflate ob-error based on pressure check (penalize observations too close to model top or bottom)
    - filter: Perform Action
    filter variables:
    - name: airTemperature
    action:
    name: inflate error
    inflation variable:
      name: ObsFunction/ObsErrorFactorPressureCheck
      options:
        variable: airTemperature
        inflation factor: 8.0  # 4.0 for wind observations, 8.0 for temperature/humidity observations
        geovar_sfc_geomz: surface_altitude 
        test_qcflag: PreUseFlag

inflate error for tails with uninitialized bias coefficients (i.e. all coefficients equal to zero)

NOTE: Since the ObsFunction/DrawValueFromFile system used in offline bias correction assigns a zero

to all predictor coefficients for a tail-number not in the coefficients file, this filter will

correctly apply a 1.2 factor to the observation error for both unknown tail-numbers as well as

tail-numbers that are "new" (registered in coefficients file but have yet to be assigned

coefficients other than the initial zero-values)

  1. Guard-rails are applied to ObsError/airTemperature to conform to the provided minimum and maximum allowable values for the gross-error check, and the gross-error check coefficient is applied, using ObsFunction/ObsErrorBoundConventional. These values are saved to a new variable ObsErrorBound/airTemperature using the Variable Assignment filter. These values are then adjusted for any observation with a PreQC/airTemperature value of 3 to use a penalized gross-error check coefficient that is multiplied by 0.7:
    # Apply guard-rails to airTemperature ObsErrors and write to a separate variable
    - filter: Variable Assignment
    assignments:
    - name: ObsErrorBound/airTemperature
    type: float
    function:
      name: ObsFunction/ObsErrorBoundConventional
      options:
        obsvar: airTemperature
        obserr_bound_min: 1.3
        obserr_bound_max: 5.6
        obserr_bound_factor: 7.0  # cgross
    # Penalize obserr_bound_factor by factor of 0.7 if PreQC/airTemperature is 3
    - filter: Variable Assignment
    where:
    - variable: PreQC/airTemperature
    is_in: [3]
    assignments:
    - name: ObsErrorBound/airTemperature
    type: float
    function:
      name: ObsFunction/ObsErrorBoundConventional
      options:
        obsvar: airTemperature
        obserr_bound_min: 1.3
        obserr_bound_max: 5.6
        obserr_bound_factor: 4.9  # cgross
  2. The gross-error check is performed using the Bounds Check filter and using ObsErrorBound/airTemperature for an absolute threshold value, and HofXBc is passed as test_hofx to utilize the offline bias corrected values of HofX in the comparison:
    # Gross check as a Background Check, utilizing threshold applied to ObsErrorBound/airTemperature
    - filter: Background Check
    filter variables:
    - name: airTemperature
    function absolute threshold:
    - name: ObsErrorBound/airTemperature
    test_hofx: HofXBc
    action:
    name: reject

    A 100% acceptance between UFO and GSI is reached.

BrettHoover-NOAA commented 10 months ago

To get UFO "final" errors to match GSI "final" errors, two additional error inflation filters are applied after the gross-error check is completed: (1) the ObsFunction/ObsErrorFactorDuplicateCheck is applied to inflate errors on so-called "duplicate" observations, and (2) an across-the-board factor of 1.5 is applied to match the inflation that takes place in GSI/setupt.f90 Line 1038: if(nvqc .and. ibeta(ikx) >0 ) ratio_errors=0.8_r_kind*ratio_errors This factor of 0.8 is applied to the inverse-error, equating to a factor of 1.25 to the error.

# Duplicate factor
- filter: Perform Action
  filter variables:
  - name: airTemperature
  action:
    name: inflate error
    inflation variable:
      name: ObsFunction/ObsErrorFactorDuplicateCheck
      options:
        use_air_pressure: true
        variable: airTemperature
# Additional error inflation of 1.25, as per GSI/setupt.f90 Line 1038:
#
# if(nvqc .and. ibeta(ikx) >0  ) ratio_errors=0.8_r_kind*ratio_errors
#
# This 0.8 factor is applied to the inverse-error, thus inflation if 1.0/0.8=1.25
- filter: Perform Action
  filter variables:
  - name: airTemperature
  action:
    name: inflate error
    inflation factor: 1.25

Applying these filters, the errors match for all aircraft airTemperature types: Screenshot 2024-01-30 at 3 20 38 PM

BrettHoover-NOAA commented 10 months ago

I believe that a change needs to be made to ObsFunction/ObsErrorFactorConventional in order to work with aircraft data.

When attempting to define aircraft airTemperature observation errors from scratch and apply this check after assigning errors from the table, there are numerous observations that end up with errors on the order of E+18 that are assigned more realistic values when computing the final errors by using the GSI's "adjust" errors as a starting point:

Screenshot 2024-01-31 at 3 59 02 PM

Here I am plotting the final errors using the GSI "adjust" errors as a starting point on the x-axis and the final errors using the conventional error inflation on the "initial" (table) errors on the y-axis, broken out by aircraft airTemperature observation type.

I included some write statements in ObsFunction/ObsErrorFactorConventional and discovered that for all of these outlier observations, the top- and bottom-pressure values for the layer, defined by pdiffu and 'pdiffdin the function, are both0`.

I noticed in the GSI/qcmod.f90 errormod_aircraft function that when a single-level case is discovered, the function skips this observation and does not apply inflation, L 873-874:

    errout=one
    if(levs == 1)return

If I apply the same logic to ObsFunction/ObsErrorFactorConventional and include a line that reverts the value of error_factor to a value of 1 in the case where both pdiffu and pdiffd are 0:

        // When there are multiple observations inside the same model interval, the error_factor
        // will be bigger than 1 based on the spacing of the these observations
        pdifftotal = std::max(pdiffd+pdiffu, 5.0f * tiny_float);
        error_factor = sqrt(2.0f*vmag/pdifftotal);

        // Output
        oops::Log::info() << "BTH: conv error inflation " << rSort[thisPoint] <<
                     " " << pdiffd << " " << pdiffu << " " << pdifftotal << " " <<  
                     vmag << " " << error_factor << std::endl;
        // BTH: Modify to return error_factor = 1 if pdiffd==pdiffu==0
        if (pdiffd == 0.0 && pdiffu == 0.0) {
          error_factor = 1.0;
        }
        // BTH: End
        obserr[ivar][rSort[thisPoint]] = error_factor;
      }
    }

The resulting comparison between the final errors using GSI "adjust" errors as a crutch versus computing them from scratch looks much better:

Screenshot 2024-01-31 at 4 06 48 PM

It's possible that this is not an issue specific to aircraft observations, since the errormod function (not the errormod_aircraft version) in GSI/qcmod.f90 also has this if (levs == 1) return mechanic.

BrettHoover-NOAA commented 10 months ago

Further testing of ObsFunction/ObsErrorFactorPressureCheck: If I further constrain error_factor by removing inflation on any observation with pdifftotal=pdiffu+pdiffd less than 0.1, the fit is slightly better:

Screenshot 2024-02-01 at 9 50 20 AM

This might be a round-about way of addressing outstanding issues between GSI and UFO conventional ob error inflation, but there appears to be a dial here in how pdiffu and pdiffd are related to each other that is controlling fit.