SciTools / iris

A powerful, format-agnostic, and community-driven Python package for analysing and visualising Earth science data
https://scitools-iris.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

Derived coordinates are not loaded anymore? #3961

Closed valeriupredoi closed 3 years ago

valeriupredoi commented 3 years ago

📰 Custom Issue

Hey guys, one more from me today. I am using this file and loading it simply:

import iris

cs = iris.load("/home/valeriu/ESMValCore/tests/integration/cmor/_fixes/test_data/common_cl_ap.nc")
for c in cs:
    print(c)

With iris=3 I am no longer seeing the derived coordinates.

bjlittle commented 3 years ago

@valeriupredoi Ohh very interesting, I'll take a peek for you...

bjlittle commented 3 years ago

@valeriupredoi Okay, so this took a wee bit of digging, but the mystery is thankfully solved :detective:

Firstly, I can recreate your issue in iris 3.0.0, and I can also confirm that iris 2.4.0 does indeed realise derived coordinates for your example dataset :+1:

This change in behaviour between iris 2.4.0 and 3.0.0 directly relates to PR #3795, where the units of an AuxCoord or DimCoord that are not explicitly specified now default to unknown rather than 1 (dimensionless).

We thought long and hard about this change in behaviour, as the treatment of default units was inconsistent across iris for different objects, and the CF Metadata standard did not add any clarity (no surprises there, indeed it was inconsistent), but in the end we unified the behaviour for consistency's sake. That is, we erred on the side that it is correct and safer to assign units of unknown when they are not specified, whereas it is potentially incorrect and unsafe to assume default dimensionless units instead.

This change in behaviour is mentioned in the iris 3.0.0 release notes in the Incompatible Changes section, fifth bullet point.

Where this is stinging you is for the missing units for the sigma variable associated with the atmosphere_hybrid_sigma_pressure_coordinate i.e., given formula_terms = "ap: ap_bnds b: b_bnds ps: ps", sigma in this case is the NetCDF b variable, and this is defined within the example file simply as double b(lev) ;

As the b variable has no units associated with it, iris 3.0.0 assigns the default units of unknown (which they are). However, the auxiliary factory that creates the derived coordinate enforces the contract that the sigma variable must be dimensionless, which in this case it's not, as it's unknown. Note that enforcing that sigma is dimensionless has not changed from 2.4.0 to 3.0.0.
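The interaction between the new default and the unchanged contract can be illustrated with a toy model. This is only a sketch in plain Python: `default_units` and `make_hybrid_pressure_factory` are hypothetical stand-ins, not iris functions, and units are modelled as simple strings.

```python
import warnings

UNKNOWN = "unknown"      # default in iris 3.0.0 when a variable has no units
DIMENSIONLESS = "1"      # default assumed by iris 2.4.0


def default_units(attributes, iris_major):
    """Return the units a coordinate would get from its netCDF attributes."""
    if "units" in attributes:
        return attributes["units"]
    return UNKNOWN if iris_major >= 3 else DIMENSIONLESS


def make_hybrid_pressure_factory(sigma_units):
    """Toy stand-in for the factory contract: sigma must be dimensionless."""
    if sigma_units != DIMENSIONLESS:
        warnings.warn("Invalid units: sigma must be dimensionless.")
        return None
    return "HybridPressureFactory"


b_attrs = {}  # `double b(lev) ;` carries no attributes in the example file
for major in (2, 3):
    units = default_units(b_attrs, major)
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # keep the demo output quiet
        print(major, units, make_hybrid_pressure_factory(units))
```

The check itself is identical for both versions; only the default fed into it differs, which is why the factory is created under 2.4.0 and skipped under 3.0.0.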

This failure for sigma units results in iris raising a warning and not creating the associated HybridPressureFactory auxiliary factory i.e., during loading you see the following warning:

UserWarning: Invalid units: sigma must be dimensionless.
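Because the only symptom is a warning, it can be handy to collect the warnings raised during a load and check them programmatically. The helper below is a minimal sketch: `load_and_collect_warnings` and `unit_problems` are hypothetical names, and the loader is passed in as a callable so that the helper itself does not depend on iris.

```python
import warnings


def load_and_collect_warnings(loader, path):
    """Call ``loader(path)`` and return (result, list of warning messages)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not suppress repeated warnings
        result = loader(path)
    return result, [str(w.message) for w in caught]


def unit_problems(messages):
    """Keep only the messages that complain about invalid units."""
    return [m for m in messages if m.startswith("Invalid units")]
```

With iris installed, this would presumably be called as `load_and_collect_warnings(iris.load, path)`, and a non-empty `unit_problems(...)` result would indicate that an auxiliary factory was skipped.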

Also, note the example given in the CF Metadata Conventions, where their sigma variable has explicit units = "1".

So in order to fix this, my advice is to add the correct units to your sigma variable in the file. In addition to this, you could also add the missing bounds attribute for the sigma (b) and delta (ap) variables e.g.,

netcdf common_cl_ap {
dimensions:
    time = 1 ;
    lev = 2 ;
    lat = 3 ;
    lon = 4 ;
    bnds = 2 ;
variables:
    double time(time) ;
        time:standard_name = "time" ;
        time:units = "days since 6543-2-1" ;
    double lev(lev) ;
        lev:bounds = "lev_bnds" ;
        lev:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
        lev:units = "1" ;
        lev:formula_terms = "ap: ap b: b ps: ps" ;
    double lev_bnds(lev, bnds) ;
        lev_bnds:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
        lev_bnds:units = "1" ;
        lev_bnds:formula_terms = "ap: ap_bnds b: b_bnds ps: ps" ;
    double lat(lat) ;
        lat:standard_name = "latitude" ;
        lat:units = "degrees_north" ;
    double lon(lon) ;
        lon:standard_name = "longitude" ;
        lon:units = "degrees_east" ;
    double b(lev) ;
        b:units = "1" ;        // <== add "units" for sigma
        b:bounds = "b_bnds" ;  // <== add "bounds" for sigma
    double b_bnds(lev, bnds) ;
    double ps(time, lat, lon) ;
        ps:standard_name = "surface_air_pressure" ;
        ps:units = "Pa" ;
        ps:additional_attribute = "xyz" ;
    float cl(time, lev, lat, lon) ;
        cl:standard_name = "cloud_area_fraction_in_atmosphere_layer" ;
        cl:units = "%" ;
    double ap(lev) ;
        ap:units = "Pa" ;
        ap:bounds = "ap_bnds" ;  // <== add "bounds" for delta
    double ap_bnds(lev, bnds) ;
}
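The attribute edits marked in the listing above could be applied to the file with netCDF4-python. The following is a hedged sketch, not a tested recipe for this exact file: the function names are hypothetical, the variable names `b`, `b_bnds`, `ap` and `ap_bnds` are taken from the example file, and netCDF4 is imported lazily so that the planning step can be checked without it.

```python
def missing_attribute_fixes(variables):
    """Return {variable: {attribute: value}} for the edits marked above.

    ``variables`` maps each netCDF variable name to its attribute dict.
    A ``bounds`` attribute is only proposed if the bounds variable exists.
    """
    wanted = {
        "b": {"units": "1", "bounds": "b_bnds"},  # sigma
        "ap": {"bounds": "ap_bnds"},              # delta
    }
    fixes = {}
    for name, attrs in wanted.items():
        if name not in variables:
            continue
        missing = {
            key: value
            for key, value in attrs.items()
            if key not in variables[name]
            and (key != "bounds" or value in variables)
        }
        if missing:
            fixes[name] = missing
    return fixes


def apply_fixes(path):
    """Write the missing attributes into the file in place (needs netCDF4)."""
    from netCDF4 import Dataset  # lazy import: only needed when writing

    with Dataset(path, "r+") as ds:
        current = {
            name: {attr: var.getncattr(attr) for attr in var.ncattrs()}
            for name, var in ds.variables.items()
        }
        for name, attrs in missing_attribute_fixes(current).items():
            for attr, value in attrs.items():
                ds.variables[name].setncattr(attr, value)
```

Existing attributes are never overwritten, so running `apply_fixes` on a file that already carries the correct metadata should be a no-op.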

Hope this helps :smile:

bjlittle commented 3 years ago

The alternative is that we relax the contract in the auxiliary factories to accept coordinates that are dimensionless or unknown. However, that could cause other undesirable side effects or complications e.g., the computation of the derived coordinate would break due to the use of sigma with unknown units (I think), or at least you'd end up with a derived coordinate that also has units of unknown.

However, iris could override the unknown units to be dimensionless. This may affect load/save round-tripping... that would need to be investigated. Regardless, this would be an iris 3.0.1 release.
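As a rough illustration of the second option, the sketch below promotes unknown units to dimensionless before applying the check. It is a toy model and makes no claim about how iris implements this: `Coord` is a hypothetical minimal class, units are plain strings, and returning a copy is simply one way to leave the loaded coordinate untouched.

```python
from dataclasses import dataclass


@dataclass
class Coord:
    """Hypothetical minimal coordinate: just a name and a units string."""
    name: str
    units: str


def promote_if_unknown(coord):
    """Return a copy of ``coord`` with unknown units replaced by '1'."""
    if coord.units == "unknown":
        return Coord(coord.name, "1")
    return coord


def check_sigma(coord):
    """Relaxed contract: unknown is promoted, anything else must be '1'."""
    coord = promote_if_unknown(coord)
    if coord.units != "1":
        raise ValueError("Invalid units: sigma must be dimensionless.")
    return coord
```

Units that are explicitly wrong (for example `Pa` on sigma) are still rejected; only the "not specified" case is treated leniently.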

bjlittle commented 3 years ago

@valeriupredoi I've done a quick hack to see if, in principle, this promotion of units of unknown to dimensionless (1) within auxiliary factories works...

Python 3.7.8 | packaged by conda-forge | (default, Nov 17 2020, 23:45:15) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.12.0 -- An enhanced Interactive Python. Type '?' for help.
PyDev console: using IPython 7.12.0
Python 3.7.8 | packaged by conda-forge | (default, Nov 17 2020, 23:45:15) 
[GCC 7.5.0] on linux
>>> import iris
>>> iris.__version__
'3.0.0'
>>> fname = "/downloads/common_cl_ap.nc"
>>> cubes = iris.load(fname)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/cf.py:1207: UserWarning: Ignoring formula terms variable 'ps' referenced by data variable 'ap_bnds' via variable 'lev': Dimensions ('time', 'lat', 'lon') do not span ('lev', 'bnds')
  warnings.warn(msg)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/cf.py:1207: UserWarning: Ignoring formula terms variable 'ps' referenced by data variable 'b_bnds' via variable 'lev': Dimensions ('time', 'lat', 'lon') do not span ('lev', 'bnds')
  warnings.warn(msg)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/netcdf.py:685: UserWarning: Unable to find coordinate for variable 'ps'
  "{!r}".format(name)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/netcdf.py:685: UserWarning: Unable to find coordinate for variable 'ps'
  "{!r}".format(name)
>>> print(cubes)
0: ap_bnds / (unknown)                 (atmosphere_hybrid_sigma_pressure_coordinate: 2; -- : 2)
1: b_bnds / (unknown)                  (atmosphere_hybrid_sigma_pressure_coordinate: 2; -- : 2)
2: cloud_area_fraction_in_atmosphere_layer / (%) (time: 1; atmosphere_hybrid_sigma_pressure_coordinate: 2; latitude: 3; longitude: 4)
3: surface_air_pressure / (Pa)         (time: 1; latitude: 3; longitude: 4)
>>> print(cubes[2])
cloud_area_fraction_in_atmosphere_layer    (time: 1; atmosphere_hybrid_sigma_pressure_coordinate: 2; latitude: 3; longitude: 4)
     Dimension coordinates:
          time                                            x                                               -            -             -
          atmosphere_hybrid_sigma_pressure_coordinate     -                                               x            -             -
          latitude                                        -                                               -            x             -
          longitude                                       -                                               -            -             x
     Auxiliary coordinates:
          surface_air_pressure                            x                                               -            x             x
          ap                                              -                                               x            -             -
          b                                               -                                               x            -             -
     Derived coordinates:
          air_pressure                                    x                                               x            x             x

seems possible

bjlittle commented 3 years ago

@bouweandela and @valeriupredoi How much of a deal breaker is this for you guys? I'm assuming an iris 3.0.1 would be preferable?

valeriupredoi commented 3 years ago

@bjlittle cheers for the super quick and detailed answers, mate! We very much appreciate your quick response times to the issues we raise :beer: About this, I am afraid it generates a chicken-and-egg problem - the chicken being faulty model data that is already on the ESGF and the egg being our need to fix it via our CMOR fixes before we put it through the analysis. The file I linked above is a replication of such a faulty model data file that we use in a test of a fix: if we can't load the derived coordinates it means we can't fix them in ESMValCore and that means we will not be able to run ESMValTool on it, well, at least not without turning off all the CMOR checks, which, in turn, will generate a whole lot of other headaches. So yeah, I think this is something we kind of really need :grin:

bjlittle commented 3 years ago

@valeriupredoi Okay, no worries dude. Leave it with me and I'll get back to you asap... just need to speak to the team about this :+1:

valeriupredoi commented 3 years ago

> @valeriupredoi Okay, no worries dude. Leave it with me and I'll get back to you asap... just need to speak to the team about this +1

you're legend @bjlittle cheers muchly! :beer:

bjlittle commented 3 years ago

@valeriupredoi Okay, I'm dropping everything today to service this - I can totally appreciate the situation you're in, so I'm going to push an iris 3.0.1 release today.

Hopefully, I can turn this around quickly for you, as I know you're working to a tight deadline.

I'll ping you here @valeriupredoi with my progress on this to keep you in the loop :+1:

valeriupredoi commented 3 years ago

@bjlittle that's fantastic, sorry to have done this to you :grin: We are planning on freezing ESMValCore on February 1st, as per our release schedule, but the freeze lasts a week and in extraordinary circumstances we can still push to main during that week - so - this would be good to have in so we can push iris=3 in the release, but that means it's not a race to the sea kind of thing. @jvegasbsc is the release manager this time around - heads up, Javi :+1: Cheers muchly for your efforts, Bill and team! :beer:

bouweandela commented 3 years ago

> @bouweandela and @valeriupredoi How much of a deal breaker is this for you guys? I'm assuming an iris 3.0.1 would be preferable?

@schlunma Could you comment on this?

schlunma commented 3 years ago

From a quick look at the issue I'm pretty sure that we can easily work around this in our fixes by setting the units to '1' like @bjlittle suggested (we actually do that in one example). I can have a detailed look at this later in the afternoon today.

By the way I really like the new behavior of iris (using unknown as units if no unit is specified), since I had troubles with the old behavior before :+1:

valeriupredoi commented 3 years ago

> From a quick look at the issue I'm pretty sure that we can easily work around this in our fixes by setting the units to '1' like @bjlittle suggested (we actually do that in one example). I can have a detailed look at this later in the afternoon today.

yes, but the coordinate is not loaded at load point!

schlunma commented 3 years ago

The particular file here is so messed up that we have to fix it with netCDF4 anyway, so we can do it with that.

Nevertheless, even though the derived coordinate is not there we should be able to add it after loading with iris.cube.Cube.add_aux_factory.
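A sketch of that post-load workaround is shown below. It assumes that the formula-term coordinates are still attached to the cube as auxiliary coordinates after loading, and that their netCDF variable names are `ap`, `b` and `ps` as in the example file; the two function names are hypothetical. `HybridPressureFactory`, `Cube.coord(var_name=...)` and `Cube.add_aux_factory` are existing iris APIs, and iris is imported lazily inside the function.

```python
def promote_unknown_units(coord):
    """Set units to '1' if, and only if, they are currently unknown."""
    if str(coord.units) == "unknown":
        coord.units = "1"
    return coord


def add_hybrid_pressure(cube, delta="ap", sigma="b", surface="ps"):
    """Attach a HybridPressureFactory built from existing aux coords.

    The keyword defaults are the variable names from the example file
    and would need adjusting for other datasets.
    """
    from iris.aux_factory import HybridPressureFactory  # requires iris

    factory = HybridPressureFactory(
        delta=cube.coord(var_name=delta),
        sigma=promote_unknown_units(cube.coord(var_name=sigma)),
        surface_air_pressure=cube.coord(var_name=surface),
    )
    cube.add_aux_factory(factory)
    return cube
```

After `add_hybrid_pressure(cube)` the cube would be expected to expose the derived `air_pressure` coordinate again, provided the three coordinates are present and their units satisfy the factory's checks.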

valeriupredoi commented 3 years ago

how about the case where the file is a model file off the ESGF and needs fixes - this very case that we test? Don't assume any level of clean data, you have seen what messed up files can still be out there on ESGF

bjlittle commented 3 years ago

@schlunma seems like a reasonable work around, but it's really causing you guys quite a lot of extra work to jump through hoops.

I've got a fix that I'm testing just now that would gracefully promote any formula terms variable that must be dimensionless to be dimensionless if it has units of unknown - it's doing this within the aux factory itself.

Also, this behavioural change applies to 6 aux factories - so for me, it makes more sense to fix this in iris rather than force this upon all users.

I'm going to roll with this fix regardless, as my expectation is that this will affect many users.

Question: do you have any other ESMValTool netCDF files that also show this failure that I can test my fix with?

bjlittle commented 3 years ago

If this fix is good, and there are no other side-effects, then I believe that I can push an iris 3.0.1 release today and get it on conda-forge for you guys.

I have @trexfeathers on hand as a review buddy to expedite the review and release.

valeriupredoi commented 3 years ago

@bjlittle hang on, I'm rerunning the tests to try to find more files, if @schlunma doesn't beat me to it :)

schlunma commented 3 years ago

@bjlittle Thanks, that sounds like a great solution!

bjlittle commented 3 years ago

@schlunma Cool, let's roll with that then.

If you guys find any extra "real world messed up" files that hit the same problem, then share them with me so that I can check this fix is good before I tag a release - just to give us extra confidence.

bjlittle commented 3 years ago

cirrus-ci is certainly happy with the fix i.e., no side-effects that our testing coverage can pick up... that's encouraging (or we have a hole in our test coverage :laughing:)

I'm just going to check our unit tests and ensure there is test coverage for at least this change... then we should be good to release.

valeriupredoi commented 3 years ago

massive :clap: @bjlittle - unfortunately neither @schlunma nor I could find any more of these messed-up files, but there is a good probability that more are out there, given that these are like ants - once you see one, there are bound to be more

bjlittle commented 3 years ago

No worries, that's not a problem. Thanks for checking, much appreciated :+1:

This has turned out to be such a simple change, I'm pretty confident that similar files with no units attributes on NetCDF variables that should have explicit units = 1 (dimensionless) will work (pride before a fall :face_with_head_bandage:)

schlunma commented 3 years ago

Ha! Found one: /CMIP6/CMIP/BCC/BCC-ESM1/historical/r1i1p1f1/Amon/cl/gn/v20181217/cl_Amon_BCC-ESM1_historical_r1i1p1f1_gn_185001-201412.nc

netcdf cl_Amon_BCC-ESM1_historical_r1i1p1f1_gn_185001-201412 {
dimensions:
        time = UNLIMITED ; // (1980 currently)
        lev = 26 ;
        lat = 64 ;
        lon = 128 ;
        bnds = 2 ;
variables:
        double time(time) ;
                time:bounds = "time_bnds" ;
                time:units = "days since 1850-01-01" ;
                time:calendar = "365_day" ;
                time:axis = "T" ;
                time:long_name = "time" ;
                time:standard_name = "time" ;
        double time_bnds(time, bnds) ;
        double lev(lev) ;
                lev:bounds = "lev_bnds" ;
                lev:units = "1" ;
                lev:axis = "Z" ;
                lev:positive = "down" ;
                lev:long_name = "hybrid sigma pressure coordinate" ;
                lev:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
                lev:formula = "p = a*p0 + b*ps" ;
                lev:formula_terms = "p0: p0 a: a b: b ps: ps" ;
        double lev_bnds(lev, bnds) ;
                lev_bnds:formula = "p = a*p0 + b*ps" ;
                lev_bnds:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
                lev_bnds:units = "1" ;
                lev_bnds:formula_terms = "p0: p0 a: a_bnds b: b_bnds ps: ps" ;
        double p0 ;
                p0:long_name = "vertical coordinate formula term: reference pressure" ;
                p0:units = "Pa" ;
        double a(lev) ;
                a:long_name = "vertical coordinate formula term: a(k)" ;
        double b(lev) ;
                b:long_name = "vertical coordinate formula term: b(k)" ;
        float ps(time, lat, lon) ;
                ps:long_name = "Surface Air Pressure" ;
                ps:units = "Pa" ;
        double a_bnds(lev, bnds) ;
                a_bnds:long_name = "vertical coordinate formula term: a(k+1/2)" ;
        double b_bnds(lev, bnds) ;
                b_bnds:long_name = "vertical coordinate formula term: b(k+1/2)" ;
        double lat(lat) ;
                lat:bounds = "lat_bnds" ;
                lat:units = "degrees_north" ;
                lat:axis = "Y" ;
                lat:long_name = "latitude" ;
                lat:standard_name = "latitude" ;
        double lat_bnds(lat, bnds) ;
        double lon(lon) ;
                lon:bounds = "lon_bnds" ;
                lon:units = "degrees_east" ;
                lon:axis = "X" ;
                lon:long_name = "Longitude" ;
                lon:standard_name = "longitude" ;
        double lon_bnds(lon, bnds) ;
        float cl(time, lev, lat, lon) ;
                cl:standard_name = "cloud_area_fraction_in_atmosphere_layer" ;
                cl:long_name = "Cloud Area Fraction" ;
                cl:comment = "Percentage cloud cover, including both large-scale and convective cloud." ;
                cl:units = "%" ;
                cl:original_name = "CLOUD" ;
                cl:cell_methods = "area: time: mean (interval: 20 minutes)" ;
                cl:cell_measures = "area: areacella" ;
                cl:missing_value = 1.e+20f ;
                cl:_FillValue = 1.e+20f ;
                cl:history = "2018-12-17T02:20:59Z altered by CMOR: Inverted axis: lev." ;

bjlittle commented 3 years ago

@schlunma Awesome find :eyes: ... I'll test with that now... hang tight...

bjlittle commented 3 years ago

@schlunma Dah! Where does that file exist? Can you provide a link so that I can download it...

I thought that it might live in https://github.com/ESMValGroup/ESMValCore/tree/master/tests/integration/cmor/_fixes/test_data

valeriupredoi commented 3 years ago

told you they're like ants :laughing: That's on the ESGF node (at DKRZ I assume, since Manu works on that one), let me try grab it off CEDA-JASMIN and put it somewhere where you can download it

schlunma commented 3 years ago

Thanks V.!

I found another one with hybrid height: MOHC/UKESM1-0-LL/historical/r1i1p1f2/Amon/cl/gn/v20190406/cl_Amon_UKESM1-0-LL_historical_r1i1p1f2_gn_185001-189912.nc.

schlunma commented 3 years ago

You can check cl of all CMIP6 models, I guess there's more of them. I have a meeting now, can help you guys later! :+1:

valeriupredoi commented 3 years ago

I've put two of those files on JASMIN at /home/users/valeriu/badData which is open to all JASMIN users, unfortunately the local server that I use to get into JASMIN is read-only today (as it happens, when you need something done), and I can't scp them anywhere w/o first going through it - any chance any of yous can grab them straight from my JASMIN location?

bjlittle commented 3 years ago

@valeriupredoi Nuts. Sorry, I don't have access to JASMIN.

Hmmm is it possible for you to create a new personal GitHub repo and upload the files into there? Then I can get them, and afterwards simply nuke that GitHub repo when we're finished...

bouweandela commented 3 years ago

@bjlittle You can just download the files directly from ESGF:

wget http://esgf3.dkrz.de/thredds/fileServer/cmip6/CMIP/BCC/BCC-ESM1/historical/r1i1p1f1/Amon/cl/gn/v20181217/cl_Amon_BCC-ESM1_historical_r1i1p1f1_gn_185001-201412.nc
wget http://esgf3.dkrz.de/thredds/fileServer/cmip6/CMIP/MOHC/UKESM1-0-LL/historical/r1i1p1f2/Amon/cl/gn/v20190406/cl_Amon_UKESM1-0-LL_historical_r1i1p1f2_gn_185001-189912.nc

valeriupredoi commented 3 years ago

yes, that's what I just did, but I had to chop them (in iris2.4) otherwise they'd be 2G >> 100M max upload limit :grin: Here's the first, am adding the second in a jiffy https://github.com/valeriupredoi/badData/blob/main/cl_Amon_BCC-ESM1_chopped.nc

valeriupredoi commented 3 years ago

and here's the second file (complete with gitHub barking at me it's more than 50M hehe) https://github.com/valeriupredoi/badData/blob/main/cl_Amon_UKESM1-0-LL_chopped.nc - chopping is on time (I kept only the first 10 time points, all else should be there)

valeriupredoi commented 3 years ago

or @bouweandela 's solution which is elegant, I must confess I have never downloaded from ESGF :grin:

bjlittle commented 3 years ago

Thanks guys, downloading now...

bjlittle commented 3 years ago

Blimey it's a sloooooow connection... :sleeping:

valeriupredoi commented 3 years ago

can grab me chopped files, they're much smaller :+1:

bouweandela commented 3 years ago

These are opendap urls, so you can also load the files directly without downloading, maybe that's faster?

import iris
cubes = iris.load('http://esgf3.dkrz.de/thredds/fileServer/cmip6/CMIP/BCC/BCC-ESM1/historical/r1i1p1f1/Amon/cl/gn/v20181217/cl_Amon_BCC-ESM1_historical_r1i1p1f1_gn_185001-201412.nc')

bjlittle commented 3 years ago

I've got one file, the other is almost there... so glad we did this as the file cl_Amon_BCC-ESM1_historical_r1i1p1f1_gn_185001-201412 contains the alternative formula_terms = "p0: p0 a: a b: b ps: ps" for an atmosphere_hybrid_sigma_pressure_coordinate, where ap (delta) is formed by performing p0 * a.

In the example file a has no units and p0 has units of Pa (which does and must match ps), however the p0 * a results in a delta with units of unknown and causes the same failure, but in a different way.

I've located the fix in iris/fileformats/netcdf.py before it then creates the aux factory, as it's too late for the patch in the aux factory to fix this particular nuance.
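To make the failure mode concrete, the toy sketch below tracks units through delta = p0 * a and then evaluates p = delta + b * ps at a single grid point. Everything here is illustrative: the helper names are hypothetical, units are plain strings rather than real unit objects, and the numeric values of `a`, `b` and `ps` are made up (only p0 = 100000.0 Pa comes from the example file).

```python
def multiply_units(left, right):
    """Toy unit product: anything times unknown stays unknown."""
    if "unknown" in (left, right):
        return "unknown"
    if left == "1":
        return right
    if right == "1":
        return left
    return f"{left} {right}"


def delta_from_reference(a, a_units, p0, p0_units):
    """Form delta = p0 * a, promoting unknown units on ``a`` to '1' first."""
    if a_units == "unknown":
        a_units = "1"  # promotion has to happen before the multiplication
    return [p0 * value for value in a], multiply_units(p0_units, a_units)


def air_pressure(delta, sigma, ps):
    """Evaluate p = delta + sigma * ps for each level at one grid point."""
    return [d + s * ps for d, s in zip(delta, sigma)]


# Without promotion the product is already unknown before any factory sees it.
print(multiply_units("Pa", "unknown"))

delta, units = delta_from_reference([0.5, 0.25], "unknown", 100000.0, "Pa")
print(delta, units)
print(air_pressure(delta, [0.0, 0.5], 101000.0))
```

This is why a patch inside the factory alone would be too late for this file: by the time delta reaches the factory its units are no longer a simple "unknown that should be 1" but the result of a product that has already lost the Pa.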

Re-testing now...

bjlittle commented 3 years ago

Tis a beautiful thing...

>>> import iris
>>> fname = "/downloads/cl_Amon_BCC-ESM1_historical_r1i1p1f1_gn_185001-201412.nc"
>>> cubes = iris.load(fname)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/cf.py:862: UserWarning: Missing CF-netCDF measure variable 'areacella', referenced by netCDF variable 'cl'
  message % (variable_name, nc_var_name)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/cf.py:1207: UserWarning: Ignoring formula terms variable 'ps' referenced by data variable 'b_bnds' via variable 'lev': Dimensions ('time', 'lat', 'lon') do not span ('lev', 'bnds')
  warnings.warn(msg)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/cf.py:1207: UserWarning: Ignoring formula terms variable 'ps' referenced by data variable 'a_bnds' via variable 'lev': Dimensions ('time', 'lat', 'lon') do not span ('lev', 'bnds')
  warnings.warn(msg)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/netcdf.py:685: UserWarning: Unable to find coordinate for variable 'ps'
  "{!r}".format(name)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/netcdf.py:685: UserWarning: Unable to find coordinate for variable 'ps'
  "{!r}".format(name)

>>> cubes
[<iris 'Cube' of vertical coordinate formula term: b(k+1/2) / (unknown) (atmosphere_hybrid_sigma_pressure_coordinate: 26; -- : 2)>,
<iris 'Cube' of vertical coordinate formula term: a(k+1/2) / (unknown) (atmosphere_hybrid_sigma_pressure_coordinate: 26; -- : 2)>,
<iris 'Cube' of Surface Air Pressure / (Pa) (time: 1980; latitude: 64; longitude: 128)>,
<iris 'Cube' of cloud_area_fraction_in_atmosphere_layer / (%) (time: 1980; atmosphere_hybrid_sigma_pressure_coordinate: 26; latitude: 64; longitude: 128)>]

>>> print(cubes[3])
cloud_area_fraction_in_atmosphere_layer             (time: 1980; atmosphere_hybrid_sigma_pressure_coordinate: 26; latitude: 64; longitude: 128)
     Dimension coordinates:
          time                                                     x                                                  -             -              -
          atmosphere_hybrid_sigma_pressure_coordinate              -                                                  x             -              -
          latitude                                                 -                                                  -             x              -
          longitude                                                -                                                  -             -              x
     Auxiliary coordinates:
          Surface Air Pressure                                     x                                                  -             x              x
          vertical coordinate formula term: a(k)                   -                                                  x             -              -
          vertical coordinate formula term: b(k)                   -                                                  x             -              -
          vertical pressure                                        -                                                  x             -              -
     Derived coordinates:
          air_pressure                                             x                                                  x             x              x
     Scalar coordinates:
          vertical coordinate formula term: reference pressure: 100000.0 Pa
     Attributes:
          Conventions: CF-1.7 CMIP-6.2
          activity_id: CMIP
          branch_method: Standard
          branch_time_in_child: 0.0
          branch_time_in_parent: 2110.0
          cmor_version: 3.3.2
          comment: Percentage cloud cover, including both large-scale and convective clou...
          contact: Dr. Tongwen Wu(twwu@cma.gov.cn)
          creation_date: 2018-12-17T02:20:59Z
          data_specs_version: 01.00.27
          description: DECK: historical
          experiment: all-forcing simulation of the recent past
          experiment_id: historical
          external_variables: areacella
          forcing_index: 1
          frequency: mon
          further_info_url: https://furtherinfo.es-doc.org/CMIP6.BCC.BCC-ESM1.historical.none.r1i1...
          grid: T42
          grid_label: gn
          history: 2018-12-17T02:20:59Z altered by CMOR: Inverted axis: lev.
          initialization_index: 1
          institution: Beijing Climate Center, Beijing 100081, China
          institution_id: BCC
          license: CMIP6 model data produced by BCC is licensed under a Creative Commons Attribution...
          mip_era: CMIP6
          nominal_resolution: 250 km
          original_name: CLOUD
          parent_activity_id: CMIP
          parent_experiment_id: piControl
          parent_mip_era: CMIP6
          parent_source_id: BCC-ESM1
          parent_time_units: days since 1850-01-01
          parent_variant_label: r1i1p1f1
          physics_index: 1
          product: model-output
          realization_index: 1
          realm: atmos
          references: Model described by Tongwen Wu et al. (JGR 2013; JMR 2014; submmitted to...
          run_variant: forcing: greenhouse gases,aerosol emission,solar constant,volcano mass,land...
          source: BCC-ESM 1 (2017):   aerosol: none  atmos: BCC_AGCM3_LR (T42; 128 x 64 longitude/latitude;...
          source_id: BCC-ESM1
          source_type: AER AOGCM CHEM
          sub_experiment: none
          sub_experiment_id: none
          table_id: Amon
          table_info: Creation Date:(30 July 2018) MD5:e53ff52009d0b97d9d867dc12b6096c7
          title: BCC-ESM1 output prepared for CMIP6
          tracking_id: hdl:21.14100/6f5b873f-9609-4be8-99d6-bfde65468e67
          variable_id: cl
          variant_label: r1i1p1f1
     Cell methods:
          mean: area (20 minutes), time (20 minutes)

bjlittle commented 3 years ago

Just checking the second file cl_Amon_UKESM1-0-LL_historical_r1i1p1f2_gn_185001-189912.nc...

bjlittle commented 3 years ago

Sweet, this looks good too...

>>> fname2 = "/downloads/cl_Amon_UKESM1-0-LL_historical_r1i1p1f2_gn_185001-189912.nc"
>>> cubes = iris.load(fname)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/cf.py:862: UserWarning: Missing CF-netCDF measure variable 'areacella', referenced by netCDF variable 'cl'
  message % (variable_name, nc_var_name)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/cf.py:1207: UserWarning: Ignoring formula terms variable 'ps' referenced by data variable 'b_bnds' via variable 'lev': Dimensions ('time', 'lat', 'lon') do not span ('lev', 'bnds')
  warnings.warn(msg)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/cf.py:1207: UserWarning: Ignoring formula terms variable 'ps' referenced by data variable 'a_bnds' via variable 'lev': Dimensions ('time', 'lat', 'lon') do not span ('lev', 'bnds')
  warnings.warn(msg)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/netcdf.py:685: UserWarning: Unable to find coordinate for variable 'ps'
  "{!r}".format(name)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/netcdf.py:685: UserWarning: Unable to find coordinate for variable 'ps'
  "{!r}".format(name)

>>> print(cubes)
0: vertical coordinate formula term: b(k+1/2) / (unknown) (atmosphere_hybrid_sigma_pressure_coordinate: 26; -- : 2)
1: vertical coordinate formula term: a(k+1/2) / (unknown) (atmosphere_hybrid_sigma_pressure_coordinate: 26; -- : 2)
2: Surface Air Pressure / (Pa)         (time: 1980; latitude: 64; longitude: 128)
3: cloud_area_fraction_in_atmosphere_layer / (%) (time: 1980; atmosphere_hybrid_sigma_pressure_coordinate: 26; latitude: 64; longitude: 128)

>>> print(cubes[3])
cloud_area_fraction_in_atmosphere_layer             (time: 1980; atmosphere_hybrid_sigma_pressure_coordinate: 26; latitude: 64; longitude: 128)
     Dimension coordinates:
          time                                                     x                                                  -             -              -
          atmosphere_hybrid_sigma_pressure_coordinate              -                                                  x             -              -
          latitude                                                 -                                                  -             x              -
          longitude                                                -                                                  -             -              x
     Auxiliary coordinates:
          Surface Air Pressure                                     x                                                  -             x              x
          vertical coordinate formula term: a(k)                   -                                                  x             -              -
          vertical coordinate formula term: b(k)                   -                                                  x             -              -
          vertical pressure                                        -                                                  x             -              -
     Derived coordinates:
          air_pressure                                             x                                                  x             x              x
     Scalar coordinates:
          vertical coordinate formula term: reference pressure: 100000.0 Pa
     Attributes:
          Conventions: CF-1.7 CMIP-6.2
          activity_id: CMIP
          branch_method: Standard
          branch_time_in_child: 0.0
          branch_time_in_parent: 2110.0
          cmor_version: 3.3.2
          comment: Percentage cloud cover, including both large-scale and convective clou...
          contact: Dr. Tongwen Wu(twwu@cma.gov.cn)
          creation_date: 2018-12-17T02:20:59Z
          data_specs_version: 01.00.27
          description: DECK: historical
          experiment: all-forcing simulation of the recent past
          experiment_id: historical
          external_variables: areacella
          forcing_index: 1
          frequency: mon
          further_info_url: https://furtherinfo.es-doc.org/CMIP6.BCC.BCC-ESM1.historical.none.r1i1...
          grid: T42
          grid_label: gn
          history: 2018-12-17T02:20:59Z altered by CMOR: Inverted axis: lev.
          initialization_index: 1
          institution: Beijing Climate Center, Beijing 100081, China
          institution_id: BCC
          license: CMIP6 model data produced by BCC is licensed under a Creative Commons Attribution...
          mip_era: CMIP6
          nominal_resolution: 250 km
          original_name: CLOUD
          parent_activity_id: CMIP
          parent_experiment_id: piControl
          parent_mip_era: CMIP6
          parent_source_id: BCC-ESM1
          parent_time_units: days since 1850-01-01
          parent_variant_label: r1i1p1f1
          physics_index: 1
          product: model-output
          realization_index: 1
          realm: atmos
          references: Model described by Tongwen Wu et al. (JGR 2013; JMR 2014; submmitted to...
          run_variant: forcing: greenhouse gases,aerosol emission,solar constant,volcano mass,land...
          source: BCC-ESM 1 (2017):   aerosol: none  atmos: BCC_AGCM3_LR (T42; 128 x 64 longitude/latitude;...
          source_id: BCC-ESM1
          source_type: AER AOGCM CHEM
          sub_experiment: none
          sub_experiment_id: none
          table_id: Amon
          table_info: Creation Date:(30 July 2018) MD5:e53ff52009d0b97d9d867dc12b6096c7
          title: BCC-ESM1 output prepared for CMIP6
          tracking_id: hdl:21.14100/6f5b873f-9609-4be8-99d6-bfde65468e67
          variable_id: cl
          variant_label: r1i1p1f1
     Cell methods:
          mean: area (20 minutes), time (20 minutes)
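
For anyone following along: the `air_pressure` derived coordinate in the printout above is not stored in the file. It is computed from the formula terms. A minimal sketch of the CF `atmosphere_hybrid_sigma_pressure_coordinate` formula in its `ap`/`b`/`ps` form, `p(k) = ap(k) + b(k) * ps`, for a single column. The function name and sample values here are made up for illustration and are not taken from the file. CF also defines an equivalent `a`/`p0` form, `p(k) = a(k) * p0 + b(k) * ps`.

```python
def hybrid_sigma_pressure(ap, b, ps):
    """Return pressure on each model level for one column.

    ap : per-level pressure term in Pa (hypothetical sample values below)
    b  : per-level dimensionless sigma term
    ps : surface air pressure in Pa for this column
    """
    if len(ap) != len(b):
        raise ValueError("ap and b must have the same number of levels")
    # CF formula: p(k) = ap(k) + b(k) * ps
    return [ap_k + b_k * ps for ap_k, b_k in zip(ap, b)]


# Illustrative values only; not read from the BCC-ESM1 file.
ap = [0.0, 5000.0, 20000.0]
b = [1.0, 0.5, 0.0]
ps = 100000.0

pressure = hybrid_sigma_pressure(ap, b, ps)
print(pressure)
```

Because `b` multiplies a pressure, it has to be dimensionless for the result to be in Pa, which is presumably why a `b` variable with no `units` attribute matters for building this coordinate.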

valeriupredoi commented 3 years ago

     Derived coordinates:
          air_pressure                                             x

yay! welcome to the absolute nightmare on ESGF street @bjlittle :rofl:

bjlittle commented 3 years ago

Okay, I think we're good guys :+1:

I defo need to add some unit test coverage for these changes, but that shouldn't take too long.

I've already updated the docs for the v3.0.1 patch release, bumped the version et al.

Once I nail the tests, I'll try to expedite this through review and tag a release; the conda-forge bot should then automagically create a PR on the feedstock. So iris 3.0.1 should be up on conda-forge in a couple of hours from now :crossed_fingers:

If any gotchas happen between now and then, I'll let you know here asap.

schlunma commented 3 years ago

The UKESM model should have an altitude derived coordinate, that's weird!

EDIT: You need to use fname2 instead of fname in the beginning :smile:

valeriupredoi commented 3 years ago

that's terrific! Thanks very much @bjlittle - a very nice example of cross-package collaboration when you have cool devs on both sides, remind me to buy you a :beer: when all this mess is over (the pandemic, not the ESGF data issues, those won't be over soon) :grin:

bjlittle commented 3 years ago

LOL what a muppet... hang tight...

bjlittle commented 3 years ago

Great spot @schlunma, thanks... here we go:

>>> fname2 = "/downloads/cl_Amon_UKESM1-0-LL_historical_r1i1p1f2_gn_185001-189912.nc"
>>> cubes = iris.load(fname2)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/cf.py:862: UserWarning: Missing CF-netCDF measure variable 'areacella', referenced by netCDF variable 'cl'
  message % (variable_name, nc_var_name)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/cf.py:1207: UserWarning: Ignoring formula terms variable 'orog' referenced by data variable 'b_bnds' via variable 'lev': Dimensions ('lat', 'lon') do not span ('lev', 'bnds')
  warnings.warn(msg)
/net/home/h05/itwl/projects/git/iris/lib/iris/fileformats/netcdf.py:685: UserWarning: Unable to find coordinate for variable 'orog'
  "{!r}".format(name)

>>> print(cubes)
0: vertical coordinate formula term: b(k+1/2) / (unknown) (atmosphere_hybrid_height_coordinate: 85; -- : 2)
1: Surface Altitude / (m)              (latitude: 144; longitude: 192)
2: cloud_area_fraction_in_atmosphere_layer / (%) (time: 600; atmosphere_hybrid_height_coordinate: 85; latitude: 144; longitude: 192)

>>> print(cubes[2])
cloud_area_fraction_in_atmosphere_layer / (%) (time: 600; atmosphere_hybrid_height_coordinate: 85; latitude: 144; longitude: 192)
     Dimension coordinates:
          time                                     x                                         -             -               -
          atmosphere_hybrid_height_coordinate      -                                         x             -               -
          latitude                                 -                                         -             x               -
          longitude                                -                                         -             -               x
     Auxiliary coordinates:
          vertical coordinate formula term: b(k)   -                                         x             -               -
          Surface Altitude                         -                                         -             x               x
     Derived coordinates:
          altitude                                 -                                         x             x               x
     Attributes:
          Conventions: CF-1.7 CMIP-6.2
          activity_id: CMIP
          branch_method: standard
          branch_time_in_child: 0.0
          branch_time_in_parent: 144000.0
          cmor_version: 3.4.0
          comment: Percentage cloud cover, including both large-scale and convective clou...
          creation_date: 2019-04-05T15:57:16Z
          cv_version: 6.2.20.1
          data_specs_version: 01.00.29
          experiment: all-forcing simulation of the recent past
          experiment_id: historical
          external_variables: areacella
          forcing_index: 2
          frequency: mon
          further_info_url: https://furtherinfo.es-doc.org/CMIP6.MOHC.UKESM1-0-LL.historical.none....
          grid: Native N96 grid; 192 x 144 longitude/latitude
          grid_label: gn
          history: 2019-04-05T15:57:16Z altered by CMOR: Converted units from '1' to '%'....
          initialization_index: 1
          institution: Met Office Hadley Centre, Fitzroy Road, Exeter, Devon, EX1 3PB, UK
          institution_id: MOHC
          license: CMIP6 model data produced by the Met Office Hadley Centre is licensed under...
          mip_era: CMIP6
          mo_runid: u-bc179
          nominal_resolution: 250 km
          original_name: mo: (stash: m01s02i261, lbproc: 128)
          original_units: 1
          parent_activity_id: CMIP
          parent_experiment_id: piControl
          parent_mip_era: CMIP6
          parent_source_id: UKESM1-0-LL
          parent_time_units: days since 1850-01-01-00-00-00
          parent_variant_label: r1i1p1f2
          physics_index: 1
          product: model-output
          realization_index: 1
          realm: atmos
          source: UKESM1.0-LL (2018): 
aerosol: UKCA-GLOMAP-mode
atmos: MetUM-HadGEM3-GA7.1...
          source_id: UKESM1-0-LL
          source_type: AOGCM AER BGC CHEM
          sub_experiment: none
          sub_experiment_id: none
          table_id: Amon
          table_info: Creation Date:(13 December 2018) MD5:2b12b5db6db112aa8b8b0d6c1645b121
          title: UKESM1-0-LL output prepared for CMIP6
          tracking_id: hdl:21.14100/7507e315-b264-43ff-a7c6-479946c0a033
          variable_id: cl
          variant_label: r1i1p1f2
     Cell methods:
          mean: area, time
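
For anyone following along: the `altitude` derived coordinate above spans the level, latitude and longitude dimensions because the CF `atmosphere_hybrid_height_coordinate` formula, `z(k, j, i) = a(k) + b(k) * orog(j, i)`, combines a per-level term with a 2D surface altitude field. A minimal sketch in plain Python, where the function name and sample values are assumptions for illustration and are not taken from the UKESM file:

```python
def hybrid_height_altitude(a, b, orog):
    """Return altitude[k][j][i] = a[k] + b[k] * orog[j][i].

    a    : per-level height term in metres
    b    : per-level dimensionless term
    orog : 2D surface altitude in metres, indexed [lat][lon]
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same number of levels")
    return [
        [[a_k + b_k * height for height in row] for row in orog]
        for a_k, b_k in zip(a, b)
    ]


# Illustrative values only; not read from the UKESM1-0-LL file.
a = [20.0, 40000.0]
b = [1.0, 0.0]
orog = [[0.0, 500.0], [1000.0, 2000.0]]

altitude = hybrid_height_altitude(a, b, orog)
print(altitude)
```

With `b = 1` the level follows the terrain, and with `b = 0` it is flat at height `a`, regardless of `orog`.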

valeriupredoi commented 3 years ago

do you still need my chopped files on GitHub @bjlittle - or can I remove them? GitHub might banish me for uploading junk :grin: