geoschem / integrated_methane_inversion

Integrated Methane Inversion workflow repository.
https://imi.readthedocs.org
MIT License

TROPOMI operator cannot process retrievals that cross over into a time outside of the current inversion period #211

Closed megan-he closed 3 months ago

megan-he commented 5 months ago

Name: Megan He Institution: Harvard University

For a given inversion period, if a TROPOMI retrieval file crosses over into a time outside the current period of interest, applying the TROPOMI operator in jacobian.py fails because this date does not exist in the GEOS-Chem cache src/inversion/data_geoschem. For example, in a monthly Kalman filter inversion for period 20190301 - 20190331, this retrieval file S5P_BLND_L2__CH4____20190331T223508_20190401T001638_07584_03_020400_20230614T025222.nc crosses into the early hours of April.

We should filter out observations that are on the end day of the inversion period (e.g. 20190401) rather than skipping the whole file.
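A minimal sketch of per-observation filtering, as opposed to skipping the whole file. The function name `in_period_mask` and the sample times are hypothetical and are not taken from the IMI code; the sketch only assumes that observation times can be represented as NumPy `datetime64` values.

```python
import numpy as np

def in_period_mask(obs_times, start, end):
    """Return a boolean mask selecting observations with start <= time < end."""
    obs_times = np.asarray(obs_times, dtype="datetime64[s]")
    return (obs_times >= np.datetime64(start)) & (obs_times < np.datetime64(end))

# Hypothetical observation times from a file that straddles the period boundary
obs_times = [
    "2019-03-31T23:46:00",
    "2019-03-31T23:59:30",
    "2019-04-01T00:05:00",  # falls on the end day, so it is dropped
]
mask = in_period_mask(obs_times, "2019-03-01", "2019-04-01")
print(mask)  # [ True  True False]
```

Only the observation on the end day is masked out; the rest of the file remains usable.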

megan-he commented 3 months ago

Can we use floor rounding instead of rounding to the nearest hour in TROPOMI_operator.py? For example, the TROPOMI file S5P_BLND_L2__CH4____20190331T223508_20190401T001638_07584_03_020400_20230614T025222.nc has multiple observations at 2019-03-31T23:46 that get rounded to 2019-04-01T00:00, which is the source of the bug above.

https://github.com/geoschem/integrated_methane_inversion/blob/600057807ae827781239418e50c05a5b27645aae/src/inversion_scripts/operators/TROPOMI_operator.py#L262
https://github.com/geoschem/integrated_methane_inversion/blob/600057807ae827781239418e50c05a5b27645aae/src/inversion_scripts/operators/TROPOMI_operator.py#L284
https://github.com/geoschem/integrated_methane_inversion/blob/600057807ae827781239418e50c05a5b27645aae/src/inversion_scripts/operators/TROPOMI_operator.py#L764
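The difference between the two rounding modes can be seen in a short sketch. It assumes the observation time is a pandas `Timestamp` and that the hourly key uses the `%Y%m%d_%H` format; the variable names are illustrative only.

```python
import pandas as pd

# An observation 14 minutes before the end of the inversion period
obs_time = pd.Timestamp("2019-03-31T23:46:00")

# Rounding to the nearest hour pushes it into the next day (and month)
rounded = obs_time.round("60min").strftime("%Y%m%d_%H")

# Flooring keeps it in the hour in which it was actually observed
floored = obs_time.floor("60min").strftime("%Y%m%d_%H")

print(rounded)  # 20190401_00
print(floored)  # 20190331_23
```

Note that switching to floor everywhere changes the hour assigned to every observation in the second half of an hour (for example, 12:46 maps to hour 12 instead of hour 13), not only those at the period boundary.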

@laestrada @djvaron @nicholasbalasus

nicholasbalasus commented 3 months ago

Hi Megan, this makes sense to me. To make sure I understand...

In jacobian.py, the TROPOMI operator is called (for example, the average TROPOMI operator): https://github.com/geoschem/integrated_methane_inversion/blob/600057807ae827781239418e50c05a5b27645aae/src/inversion_scripts/jacobian.py#L36-L47

The observations that make it through are defined by the filter functions like this: https://github.com/geoschem/integrated_methane_inversion/blob/600057807ae827781239418e50c05a5b27645aae/src/inversion_scripts/operators/TROPOMI_operator.py#L71

which includes observations fully bounded by the GC start and end dates: https://github.com/geoschem/integrated_methane_inversion/blob/600057807ae827781239418e50c05a5b27645aae/src/inversion_scripts/utils.py#L318-L329

However, an observation within the GC start and end dates can be rounded up to hour 00 of the end date, which is problematic because GC does not write output for that hour. But if we use floor, this is avoided. Does this sound right?
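The failure mode can be reproduced with a small sketch. The set `gc_hours` is a hypothetical stand-in for the hours that GEOS-Chem writes output for, and the date check is a simplified assumption rather than the actual filter in utils.py.

```python
import pandas as pd

gc_start = pd.Timestamp("2019-03-01")
gc_end = pd.Timestamp("2019-04-01")

# Hypothetical stand-in for available GEOS-Chem output: every hour from the
# period start up to, but not including, hour 00 of the end date.
gc_hours = set(
    pd.date_range(gc_start, gc_end - pd.Timedelta(hours=1), freq="60min")
    .strftime("%Y%m%d_%H")
)

obs_time = pd.Timestamp("2019-03-31T23:46:00")

# The observation is inside the period, so a date-based filter lets it through
passes_filter = gc_start <= obs_time < gc_end

# ...but its rounded hourly key has no matching GEOS-Chem output
rounded_key = obs_time.round("60min").strftime("%Y%m%d_%H")
floored_key = obs_time.floor("60min").strftime("%Y%m%d_%H")

print(passes_filter)            # True
print(rounded_key in gc_hours)  # False
print(floored_key in gc_hours)  # True
```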

megan-he commented 3 months ago

Hi Nick, yes that's right. I ran the operator after changing the instances of .round() to .floor() and it was able to process that particular TROPOMI file.

laestrada commented 3 months ago

Hi @megan-he,

Thanks for bringing this up. I think we should just add some logic that checks that the rounded observation time doesn't fall outside the current time period of interest. Maybe add a function like:

def get_strdate(current_time, date_threshold):
    # current_time is expected to support .round()/.floor(), e.g. a pandas Timestamp
    # round observation time to nearest hour
    strdate = current_time.round("60min").strftime("%Y%m%d_%H")
    # Unless it equals the date threshold (hour 00 after the inversion period)
    if strdate == date_threshold:
        strdate = current_time.floor("60min").strftime("%Y%m%d_%H")

    return strdate

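A quick check of the suggested behavior, with the function restated so that the snippet runs on its own. It assumes `current_time` is a pandas `Timestamp` and that `date_threshold` is a string in the same `%Y%m%d_%H` format; the example values are illustrative.

```python
import pandas as pd

def get_strdate(current_time, date_threshold):
    """Hourly key for an observation, kept inside the inversion period."""
    # Round the observation time to the nearest hour
    strdate = current_time.round("60min").strftime("%Y%m%d_%H")
    # Fall back to floor only when rounding lands on the threshold hour
    if strdate == date_threshold:
        strdate = current_time.floor("60min").strftime("%Y%m%d_%H")
    return strdate

# Assumed threshold: hour 00 of the day after the inversion period
threshold = "20190401_00"

late = get_strdate(pd.Timestamp("2019-03-31T23:46:00"), threshold)
midday = get_strdate(pd.Timestamp("2019-03-15T12:46:00"), threshold)

print(late)    # 20190331_23
print(midday)  # 20190315_13
```

Compared with replacing every `.round()` with `.floor()`, this keeps nearest-hour matching for all other observations and only changes the ones that would otherwise land on the threshold hour.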
megan-he commented 3 months ago

Hi Lucas, that makes sense!