metno / pyaerocom

Python tools for climate and air quality model evaluation
https://pyaerocom.readthedocs.io/
GNU General Public License v3.0
26 stars 15 forks

use_obs_clim is broken and has always been: Turns all obsdata into nan #1125

Open Ovewh opened 7 months ago

Ovewh commented 7 months ago

Describe the bug With `obs_use_climatology=True` (see the config below), all observation data are turned into NaN.

To Reproduce My config.py file:

output_dir = "/lustre/storeB/users/oveh/DURF/aeroval/data"
coldata_dir = "/lustre/storeB/users/oveh/DURF/aeroval/coldala"

exp_pi = "Ove Haugvaldstad"
experiment_id="test simulations DURF"
proj_id = "AeroCom"

ALTITUDE_FILTER = {
    'altitude': [0, 1000]
} 

""" Ground-based AERONET observations """

OBS_GROUNDBASED = {

    'AeronetSDAV3L2': dict(obs_id='AeronetSDAV3Lev2.daily',
                           # obs_vars=['od550aer', 'ang4487aer'],
                           obs_vars=['od550gt1aer','ang4487aer'],
                           obs_vert_type='Column',
                           obs_filters={**ALTITUDE_FILTER,
                                         **dict(station_name='DRAGON*', negate='station_name')},
                           min_num_obs={'monthly': {'daily': 7}},
                           obs_use_climatology=True,
                           obs_outlier_ranges={'od550aer'    : [0.01, 10],
                                                'od550lt1aer' : [0.01, 10],
                                                'od550gt1aer' : [0.01, 10]},

                           ),

}

MODELS = {
    "NorESM2.1F-LM histSST" : dict(
        model_id="NorESM2-LM-histSST_DURF",
        model_data_dir="/lustre/storeB/project/aerocom/aerocom-users-database/DURF/histSST/NorESM2-LM-histSST_DURF",
        model_use_vars={'od550gt1aer':'od550dust'},
        model_ts_type_read = 'monthly',
    ),

}

CFG = dict(
    # Output directories
    json_basedir=output_dir,
    coldata_basedir=coldata_dir,
    # Run options
    reanalyse_existing=True,  # if True, existing colocated data files will be deleted
    raise_exceptions=True,  # if True, the analysis will stop whenever an error occurs
    clear_existing_json=False,  # if True, deletes previous output before running
    # Map Options
    add_model_maps=False,  # Adds a plot of the whole map. Very slow!!!
    only_model_maps=False,  # Adds only plot above, without any other evaluation
    filter_name="ALL-noMOUNTAINS",  # Regional filter for analysis
    map_zoom="World",  # Zoom level. For EMEP, Europe is typically used
    ts_type="monthly",  # Colocation frequency (no statistics in higher resolution can be computed)
    freqs=["monthly", "yearly"],  # Frequencies that are evaluated
    main_freq="monthly",  # Frequency that is displayed when opening webpage
    periods=[
        "1995-2000"
    ],  # List of years or periods of years that are evaluated. E.g. "2005" or "2001-2020"
    obs_remove_outliers=False,
    model_remove_outliers=False,
    colocate_time=False,
    zeros_to_nan=False,
    weighted_stats=True,
    annual_stats_constrained=True,

    # Experiment Metadata
    exp_pi=exp_pi,
    proj_id=proj_id,
    exp_id=experiment_id,
    exp_name="DURF test evaluation",
    exp_descr=("Evaluation test DURF 10 year test simulations"),
    public=True,
)

# CFG['obs_cfg'] = {**OBS_SAT, **OBS_GROUNDBASED}

CFG['obs_cfg'] = {**OBS_GROUNDBASED}

CFG["model_cfg"] = MODELS

if __name__ == "__main__":

    from pyaerocom.aeroval import EvalSetup, ExperimentProcessor
    from pyaerocom import const

    print(
        const.CACHEDIR
    )  # Prints where to find the caching folder. Not needed but this folder should be emptied now and then, so I like to see where it is

    stp = EvalSetup(**CFG)  # Makes a setup object from the dict, that PyAeroval can use
    ana = ExperimentProcessor(stp)  # Makes an experiment object
    res = ana.run()  # Runs the experiment

Expected behavior pyaerocom should read the obs data for the specified period, calculate the climatology at the specified frequency (i.e. either monthly or yearly), and assign it the same time axis as the model data.
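To make the expected behavior concrete, here is a minimal sketch in plain Python. It is not pyaerocom's implementation; the function names `monthly_climatology` and `climatology_on_axis` and the sample data are hypothetical. It averages the observations per calendar month, skipping NaN values, and then places those monthly means on the model's time axis.

```python
import math
from collections import defaultdict
from datetime import date


def monthly_climatology(times, values):
    """Return {month: mean of all non-NaN values observed in that month}."""
    buckets = defaultdict(list)
    for t, v in zip(times, values):
        if not math.isnan(v):
            buckets[t.month].append(v)
    return {month: sum(vals) / len(vals) for month, vals in buckets.items()}


def climatology_on_axis(clim, model_times):
    """Map the climatology onto the model time axis (NaN where no obs exist)."""
    return [clim.get(t.month, math.nan) for t in model_times]


# Hypothetical observations from two different years
obs_times = [date(2010, 1, 15), date(2011, 1, 15), date(2010, 2, 15), date(2011, 2, 15)]
obs_values = [0.25, 0.75, 1.0, math.nan]

# Hypothetical model time axis from a different period
model_times = [date(1995, 1, 15), date(1995, 2, 15), date(1995, 3, 15)]

clim = monthly_climatology(obs_times, obs_values)
obs_on_model_axis = climatology_on_axis(clim, model_times)
```

In this sketch, a month only ends up as NaN on the model axis when no valid observation exists for it in the climatology period. A result that is NaN everywhere would therefore point to a problem in the period selection or time matching rather than in the averaging itself.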

Issues to start fixing:

Ovewh commented 7 months ago

Actually, this was never properly implemented; it goes back to issue #51.

Just from reading the old discussion: I do not think that we should have a "fixed" period for the climatology, but rather a default one, especially since 2005 - 2015 is almost 10 years ago and we now have new and different observations that did not exist back then.
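A default-with-override period could look roughly like the sketch below. The names `DEFAULT_CLIM_PERIOD` and `resolve_clim_period` are hypothetical and not part of pyaerocom; the default value is only the 2005 - 2015 period mentioned above.

```python
# Hypothetical default, taken from the period mentioned in this thread
DEFAULT_CLIM_PERIOD = (2005, 2015)


def resolve_clim_period(user_period=None, default=DEFAULT_CLIM_PERIOD):
    """Return (start_year, stop_year), preferring a user-supplied period."""
    period = default if user_period is None else tuple(user_period)
    start, stop = period
    if start > stop:
        raise ValueError(f"invalid climatology period: {start}-{stop}")
    return start, stop
```

Calling `resolve_clim_period()` falls back to the default, while `resolve_clim_period([2010, 2020])` uses the period given in the user's config.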