xCDAT / xcdat

An extension of xarray for climate data analysis on structured grids.
https://xcdat.readthedocs.io/en/latest/
Apache License 2.0

augmenting axis fall-back table (#584) #602

Closed durack1 closed 7 months ago

durack1 commented 7 months ago

Description

Adding some additional CMIP6 ocean/thetao vertical coordinate names

@tomvothecoder I hope this is helpful

codecov[bot] commented 7 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (fbf1db6) 100.00% compared to head (9be64c3) 100.00%.

Additional details and impacted files

```diff
@@            Coverage Diff            @@
##              main      #602   +/-   ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files           15        15
  Lines         1602      1602
=========================================
  Hits          1602      1602
```


tomvothecoder commented 7 months ago

Thanks for this @durack1! It looks good to me! I'll merge this PR.

pochedls commented 7 months ago

One comment on this (already-merged) PR: Are these axis ids that occur in CMIP data or are they generally accepted names? There was a lot of discussion about this functionality when it was added because lat and latitude are widely accepted. Is the same true of deptht?

This is important to consider because we don't want to cater too closely to peculiarities of specific models (for example, latitude can be weird in some cases, something like LAT_0X_DEGREES). We can't always anticipate what problems that might cause in the future: either an expectation to cater to weird metadata, or actual problems because something we've defined universally conflicts with metadata in a dataset we haven't seen.

durack1 commented 7 months ago

Good question @pochedls. deptht is an unusual one, as is rho. I've just been expanding my scan across CMIP5 (previously just looked at CMIP6) so will know when that completes.

I just took a peek at the CF Conventions (here), and notably the definitions of vertical coordinates for the ocean are very slim; most are atmospheric examples. So other than the large archives and their examples, at this moment I can't think of another guidepost to point to. Not sure UDUNITS does a better job than CF on coordinates.

Expanding a little. I have personally created a depth (units: decibar) based observational database in the past, which happily reported CF compliance back when 1.6 was the standard. There are numerous oceanographic observational coordinates, variants of density with different reference levels that would be identified as sigmaX which could also be very recognizable to folks familiar. So in brief, it is the wild west.

pochedls commented 7 months ago

Thanks @durack1 (cc @tomvothecoder). We might want to more narrowly define vertical coordinates if most of these datasets define the axis attribute. If axis = Z, xcdat / cf-xarray can figure this out without hard-coding in definitions.
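The lookup order suggested here (trust an explicit axis attribute first, and only then consult a short hard-coded name list) can be sketched in plain Python. This is a minimal illustration, not xcdat's or cf-xarray's actual implementation: `find_vertical_coord` and `FALLBACK_Z_NAMES` are hypothetical names, and the fall-back tuple is an assumed example rather than the real table.

```python
from typing import Mapping, Optional

# Hypothetical fall-back names for illustration; NOT xcdat's actual table.
FALLBACK_Z_NAMES = ("lev", "plev", "depth")


def find_vertical_coord(
    coord_attrs: Mapping[str, Mapping[str, str]]
) -> Optional[str]:
    """Return the name of the vertical coordinate, or None if none is found.

    `coord_attrs` maps each coordinate name to its attribute dictionary.
    """
    # First pass: trust explicit metadata (axis = "Z").
    for name, attrs in coord_attrs.items():
        if str(attrs.get("axis", "")).upper() == "Z":
            return name
    # Second pass: fall back to a short list of hard-coded names.
    for name in coord_attrs:
        if name.lower() in FALLBACK_Z_NAMES:
            return name
    return None


# deptht is not in the fall-back list, but its axis attribute identifies it.
coords = {
    "time": {"axis": "T"},
    "deptht": {"axis": "Z", "units": "m"},
    "lat": {"axis": "Y"},
}
found = find_vertical_coord(coords)
```

With an xarray dataset `ds`, the input mapping could be built as `{name: ds[name].attrs for name in ds.coords}`.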

tomvothecoder commented 7 months ago

@pochedls Sure, we can discuss further. I was thinking of tagging you, and I probably merged prematurely. We can always roll back this change if we want to narrow the scope of hard-coded definitions to focus on generally accepted axis names.

durack1 commented 7 months ago

@pochedls you raise a good query. I've just taken a peek through the collective CMIP5/6 archives: 116 models with 3D temperature/thetao at monthly temporal resolution. The bulk of models use a lev vertical coordinate, although what this points to (m, pressure, density, ...) varies markedly across the archive. Thankfully, for almost all of these 116 models axis = 'Z' seems to work seamlessly, but if we were to roll back to valid CMIP3 data, this may not hold as well. A quick glance showed that a number of the groups did follow the axis convention, though it would surprise me if they all did.

Notes on very few case examples:

deptht: css03_esgf_publish ~/CMIP6/CMIP/IPSL/IPSL-CM5A2-INCA/historical/r1i1p1f1/Omon/thetao/gn/v20200729/thetao_Omon_IPSL-CM5A2-INCA_historical_r1i1p1f1_gn_185001-201412.nc

rho: css03_esgf_publish ~/CMIP6/CMIP/NCC/NorESM2-MM/historical/r1i1p1f1/Omon/thetao/gn/v20191108/thetao_Omon_NorESM2-MM_historical_r1i1p1f1_gn_185001-185912.nc

olevel: css03_esgf_publish ~/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r1i1p1f1/Omon/thetao/gn/v20180803/thetao_Omon_IPSL-CM6A-LR_historical_r1i1p1f1_gn_185001-194912.nc

It would be useful to also consider model-native vertical coordinates when considering this more thoroughly, as I am certain these would vary far more than the curated CMIPx output. An example close to home: E3SM-2-x (and presumably all versions) uses two different coordinates, nVertLevels and nVertLevelsP1, neither of which has an axis attribute.
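For native output like this, one option that avoids growing a hard-coded table is for the user to tag the vertical coordinates explicitly before analysis. The sketch below assumes coordinate attributes are held in plain dictionaries; `tag_vertical_axis` is a hypothetical helper, not part of xcdat.

```python
from copy import deepcopy
from typing import Dict, Iterable


def tag_vertical_axis(
    coord_attrs: Dict[str, Dict[str, str]], names: Iterable[str]
) -> Dict[str, Dict[str, str]]:
    """Return a copy of `coord_attrs` with axis = "Z" set on `names`."""
    tagged = deepcopy(coord_attrs)  # leave the caller's metadata untouched
    for name in names:
        if name not in tagged:
            raise KeyError(f"unknown coordinate: {name!r}")
        tagged[name]["axis"] = "Z"
    return tagged


# Coordinate names taken from the E3SM example above; attributes are empty
# here because neither coordinate carries an axis attribute.
native = {"nVertLevels": {}, "nVertLevelsP1": {}}
tagged = tag_vertical_axis(native, ["nVertLevels", "nVertLevelsP1"])
```

With an xarray dataset `ds`, the equivalent step would be assigning `ds[name].attrs["axis"] = "Z"` for each coordinate name.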

pochedls commented 7 months ago

@durack1 – are there any standard names referenced anywhere for vertical axes? It might be good to open an issue to further consider this.

durack1 commented 7 months ago

@pochedls worth considering. There are a raft of different generalized vertical coordinate schemes in current-generation models, but these schemes and their names in output files are mostly separated. lev is the standard axis name for most CMIP6 output, depth comes a very distant second, with everything else being most often single cases. lev and rho were defined for use in CMIP5, in addition to the vertical slice coordinates olayer100m, depth100m, depth0m (see @taylor13's standard_output.pdf); most of the guidance in this doc was followed relatively faithfully. This also includes all the atmospheric output definitions, mostly lev or plev or the plevX (X = some integer) too. Most of these conventions rolled forward in CMIP6. The use of axis labels Z, Y, X, T was also advocated in this document, and this seems to have been mostly followed.
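As a rough illustration, the names listed in this comment could be checked with a small matcher. This is only a sketch: the set below is transcribed from the comment, not from standard_output.pdf itself, the `plevX` pattern is assumed to mean `plev` followed by optional digits, and `is_documented_vertical_name` is a hypothetical helper.

```python
import re

# Names mentioned in the comment above (assumed, not checked against the doc).
CMIP_VERTICAL_NAMES = {"lev", "rho", "depth", "olayer100m", "depth100m", "depth0m"}

# "plev" or "plevX" where X is some integer, e.g. plev19.
PLEV_PATTERN = re.compile(r"plev\d*")


def is_documented_vertical_name(name: str) -> bool:
    """Return True if `name` matches one of the names listed above."""
    return name in CMIP_VERTICAL_NAMES or PLEV_PATTERN.fullmatch(name) is not None


# Single-case names such as deptht fall outside the list.
results = {n: is_documented_vertical_name(n) for n in ("lev", "plev19", "deptht")}
```

A check like this only covers names; datasets that set axis = Z would still be identified from their metadata.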

And just because, here are the CMIP3 guidance notes (here) which share considerable overlap with the CMIP5 guidance, including the axis labels - nice!

pochedls commented 7 months ago

I now have a thread here: https://github.com/xCDAT/xcdat/issues/603