bopen / c3s-eqc-toolbox-template

CADS Toolbox template application
Apache License 2.0

[CMIP6] add maps to sea ice diagnostics notebook #119

Closed tdcwilliams closed 9 months ago

tdcwilliams commented 10 months ago

Notebook description

Add maps to visualize the results and the differences between CMIP6 and observations. @malmans2, this continues the discussion in #116.

Notebook link or upload

cmip6_sea_ice_diagnostics_maps_clean.ipynb.zip

Anything else we need to know?

The problem does not occur in this notebook (a cleaned-up copy of a messier one), but in the other version the maps are zoomed out so far that I can't see any detail in the SIC.

Environment

wp4 environment

tdcwilliams commented 10 months ago

Hi @malmans2 I added some code that does what I need here: cmip6_notebooks.zip

It might be possible to optimise or improve it, but it works.

Also note that I removed ERA5 from the notebooks: its sea ice concentration was a model input, so including it wasn't very useful after all.

Maybe you could try to work on #103 starting from the evaluation notebook I've attached, and you might see ways to improve the code while adapting it for that issue?

Thanks a lot for all the help last week! Tim

malmans2 commented 9 months ago

Hi @tdcwilliams. Happy new year!

Sorry about the delay on this issue. I made a template to produce sea ice concentration maps.

Note that I decided to compute monthly averages first (groupby("time.month").mean()) and then the overall mean (.mean("month")), so that missing data do not bias the overall mean. Let me know if you think this is the right way to handle it. Here is the relevant function:

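import cf_xarray  # noqa: F401  (registers the .cf accessor used below)


def compute_interpolated_sic(ds, period, region=None, **regrid_kwargs):
    # NOTE: `period` is not used in the body as posted here.
    # Get sic, keeping longitude/latitude as coordinates
    sic = ds.cf["sea_ice_area_fraction"]
    sic = sic.assign_coords({coord: ds[coord] for coord in ("longitude", "latitude")})

    # Drop time steps where the whole field is missing or zero
    mask = (sic.notnull() & (sic != 0)).any(tuple(set(sic.dims) - {"time"}))
    sic = sic.where(mask.compute(), drop=True)

    # Monthly means first, so gaps do not under-weight some months
    sic = sic.groupby("time.month").mean(keep_attrs=True)

    # Period mean (average of the monthly means)
    sic = sic.mean("month", keep_attrs=True)

    # Regrid (interpolate_to_satellite_grid is not shown here)
    if regrid_kwargs:
        sic = interpolate_to_satellite_grid(sic, region=region, **regrid_kwargs)

    # Convert fractions to percent
    if sic.attrs.get("units", "") == "(0 - 1)":
        sic *= 100
    sic.attrs["units"] = "%"

    # Return as a dataset
    return sic.to_dataset(name="sic")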
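
To illustrate the missing-value step in the function above, here is a minimal sketch on synthetic data. The array, its dimension names (y, x) and the values are made up for the example, and it skips the .compute() call because nothing here is lazily loaded. It only shows which time steps the mask would drop.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic SIC: 4 monthly time steps on a 2x2 grid (hypothetical values).
time = pd.date_range("2000-01-01", periods=4, freq="MS")
values = np.array(
    [
        [[0.9, 0.8], [0.7, 0.0]],  # valid field (one open-water cell)
        [[np.nan, np.nan], [np.nan, np.nan]],  # entirely missing
        [[0.0, 0.0], [0.0, 0.0]],  # entirely zero, treated as missing
        [[0.5, np.nan], [0.4, 0.3]],  # partially missing, still kept
    ]
)
sic = xr.DataArray(values, coords={"time": time}, dims=("time", "y", "x"))

# True for time steps with at least one non-null, non-zero value.
spatial_dims = tuple(set(sic.dims) - {"time"})
mask = (sic.notnull() & (sic != 0)).any(spatial_dims)

# Drop the time steps where the mask is False.
kept = sic.where(mask, drop=True)
print(kept["time"].dt.month.values)
```

Only whole time steps are removed: zeros and NaNs inside a field that has some valid ice stay as they are.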

If this template is OK, I think we can easily edit the notebook to address #103 (we just need to use thickness instead of concentration). Let me know!

tdcwilliams commented 9 months ago

Hi @malmans2, happy new year to you too!

I don't totally understand why taking two means works, but it makes sense to not let missing data go into the means.

Comments for plotting:

malmans2 commented 9 months ago

> I don't totally understand why taking two means works, but it makes sense to not let missing data go into the means.

For example, if a whole month is missing, and in reality that month has a lot of ice, I think the overall result would be biased low without the first step. groupby("time.month").mean() returns 12 means (one per calendar month). When we average those, the bias from sporadic missing months should be lower.
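
The effect can be sketched with a toy series. The numbers below are invented (an idealised seasonal cycle that repeats exactly every year, with one high-ice month removed), so this is an illustration of the reasoning rather than a result from the CMIP6 data.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two years of synthetic monthly SIC (%) with a seasonal cycle peaking in March.
time = pd.date_range("2000-01-01", periods=24, freq="MS")
months = time.month.values
cycle = 50 + 40 * np.cos(2 * np.pi * (months - 3) / 12)
full = xr.DataArray(cycle, coords={"time": time}, dims="time")

# Remove one high-ice month (March 2001) to mimic a gap in the record.
gappy = full.where(full["time"] != np.datetime64("2001-03-01"), drop=True)

# Reference: mean of the complete record.
true_mean = float(full.mean("time"))
# Single mean over whatever time steps are available.
naive_mean = float(gappy.mean("time"))
# Monthly means first, then the mean of those 12 values.
two_step_mean = float(gappy.groupby("time.month").mean().mean("month"))

print(f"complete record:            {true_mean:.2f}")
print(f"single mean with gap:       {naive_mean:.2f}")
print(f"monthly means, then mean:   {two_step_mean:.2f}")
```

In this idealised case the single mean comes out low (about 48.3 instead of 50), while the two-step mean recovers the complete-record value because March 2000 still represents March. With real data, where years differ, the two-step mean should reduce the bias rather than remove it.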

I'll work on the improvements tomorrow.

malmans2 commented 9 months ago

(Please don't run pip install on the VM. We have a workflow to update the environments, so we always keep track of the installed dependencies and can reproduce old notebooks even if libraries introduce breaking changes.)

malmans2 commented 9 months ago

cmocean is now installed

tdcwilliams commented 9 months ago

great, thanks @malmans2

malmans2 commented 9 months ago

Hi @tdcwilliams,

I think I've applied all your suggestions.

However, I'd prefer to keep the timeseries and maps notebooks separate. Jupyter notebooks are great for sharing code and results, but they are not that easy to maintain. Keeping different functionalities in separate notebooks makes it easier for us to maintain the templates.

The notebooks in this repo are just templates, though. If you want both features in your notebook, all you have to do is copy and paste the template cells into your notebook. We did something very similar for the satellite_esacci notebooks in wp5: satellite_esacci_gmpe_timeseries.ipynb, satellite_esacci_gmpe_trends.ipynb, satellite_esacci_gmpe_variability.ipynb

What do you think?

tdcwilliams commented 9 months ago

Hi @malmans2, the notebook looks good. If separating functionalities makes it easier for you to maintain the templates, that's fine. Thanks a lot.

malmans2 commented 9 months ago

Great. Closing this. I should be able to work on #103 this week.