PCMDI / cmip6-cmor-tables

JSON Tables for CMOR3 to create CMIP6 dataset
BSD 3-Clause "New" or "Revised" License

Question: valid longitude ranges #133

Closed matthew-mizielinski closed 6 years ago

matthew-mizielinski commented 6 years ago

The CMIP6_grids file defines longitude as having a valid range of 0 to 360 degrees. When processing data that has a longitude range from -180 to 180, CMOR provides a warning but does not raise an error, and there has been some discussion under PCMDI/CMOR#237 around the use of valid_min and valid_max in this case.

What I would like to confirm is (a) whether the 0 to 360 range for longitude is a firm requirement, and (b) whether a discontinuity in longitudes (i.e. a jump from 359.5 to 0.5) is acceptable in the middle of the longitude axis.

Other than the definition in the grid file noted above I can't find any document confirming this requirement, hence the question here.

We found this issue in the tripolar data produced by NEMO within HadGEM3.

matthew-mizielinski commented 6 years ago

Further note: I've had a look through the source for the CMOR v3.2.7 PrePARE tool and it only checks metadata by design, i.e. not coordinate data.

It is fairly trivial to write some code to validate the coordinate data against the information in the CMIP6_grids.json or CMIP6_coordinate.json files -- is this something that should be validated prior to ESGF publication?
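As an illustration of the kind of check described above, here is a minimal sketch in plain Python. The function name `out_of_range` and the sample values are hypothetical; the limits 0 and 360 are the ones quoted in this thread for longitude. Reading the limits from CMIP6_grids.json or CMIP6_coordinate.json is left out on purpose, since the layout of those files is not shown here.

```python
def out_of_range(values, valid_min, valid_max):
    """Return (index, value) pairs for values outside [valid_min, valid_max]."""
    return [
        (i, v)
        for i, v in enumerate(values)
        if not (valid_min <= v <= valid_max)
    ]


# Hypothetical row of longitudes from a -180 to 180 grid.
row = [-179.5, -0.5, 0.5, 179.5]

# Limits as quoted in this issue for longitude (0 to 360 degrees).
bad = out_of_range(row, valid_min=0.0, valid_max=360.0)
print(bad)  # [(0, -179.5), (1, -0.5)]
```

For two-dimensional coordinate variables the same function could be applied row by row, or to a flattened copy of the array.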

dnadeau4 commented 6 years ago

PrePARE means "PRE-Publication Attribute Reviewers". It has been created to make sure you can find your data on the ESGF website. It does not read data, only metadata.

CMOR, on the other hand, will flip your array from -180,180 to 0,360, check valid_min and valid_max, verify that your axes are monotonic, make the file CF-1 compliant, etc.
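The flip from -180,180 to 0,360 can be pictured with the following sketch for a one-dimensional longitude axis. This is not CMOR's implementation, only an illustration of the idea; the function name `wrap_to_0_360` and the sample data are hypothetical, and the approach does not carry over directly to two-dimensional (curvilinear) coordinates.

```python
def wrap_to_0_360(lons, data):
    """Map longitudes into [0, 360) and reorder lons and data together
    so that the longitude axis increases monotonically."""
    wrapped = [lon % 360.0 for lon in lons]
    # Indices that sort the wrapped longitudes into increasing order.
    order = sorted(range(len(wrapped)), key=lambda i: wrapped[i])
    return [wrapped[i] for i in order], [data[i] for i in order]


# Hypothetical axis running from -180 to 180, with one value per longitude.
lons = [-135.0, -45.0, 45.0, 135.0]
data = ["a", "b", "c", "d"]

new_lons, new_data = wrap_to_0_360(lons, data)
print(new_lons)  # [45.0, 135.0, 225.0, 315.0]
print(new_data)  # ['c', 'd', 'a', 'b']
```

The data values are moved with their longitudes, so each value still belongs to the same physical location after the reordering.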

Some people prefer to create CMIP6 netCDF output themselves, which is fine, but CMOR is quite powerful at this time. You can put your netCDF file through CMOR if you want, we have many examples in the test suites. It is quite simple really....

Finally, PrePARE asserts that your data (even if not CMIP6 compliant) can be found by the ESGF search engine. The ESGF search engine is similar to the search you would find in a library or bookstore; it knows nothing about the story.

matthew-mizielinski commented 6 years ago

@dnadeau4,

We are using CMOR v3.2.7 (@ehogan is leading the development of our conversion tool "mip_convert"), apologies if I did not make that clear above.

You may be correct that CMOR will deal with the longitude wrapping for data on a simple lat-lon grid, but as mentioned above the data is on a tripolar grid (common in the ocean modelling community). This means that the latitude and longitude coordinates have two dimensions, with cell corners described by the vertices_latitude and vertices_longitude variables in the output netCDF files. I have example files that I can provide which were successfully produced by CMOR (with a warning, as noted above) that have negative longitudes, hence the question.

An example of the header information, with part of the first longitude row showing the issue, from one of our test files can be found at https://gist.github.com/matthew-mizielinski/1cd69d67527efb42ea0e6646a34824c0

Could you please re-open this issue, as I need confirmation as to whether this needs to be dealt with.

dnadeau4 commented 6 years ago

@taylor13 Can you look at this.

taylor13 commented 6 years ago

@matthew-mizielinski Thanks for raising these issues; my impression is that CMOR3 has not been exercised heavily in producing data on non-latxlon grids. Thanks also for posting the ncdump.

The main things you'll be interested in here are items 2 and 4 below, but I've made some additional comments.

  1. for latxlon grids, the longitudes (taken from the CMIP5 requirements document, which carries over to CMIP6):

    must be ordered with longitude increasing from west to east, starting with the first 
    grid point greater than or equal to 0 degrees east.  All coordinate locations must be 
    unique (e.g., don't include both 0 and 360 degrees east).
  2. for other grids (as in your ocean grid), there is no hard requirement at this time. I would recommend that your longitudes fall in the range -180 to 360 (and limit the range to a single cycle) and when possible increase monotonically (with either i and/or j), but this is not a requirement of CF or of CMIP6. In particular a jump from 359.5 to 0.5 is acceptable, but if you can avoid it by assigning your longitudes in the range -180 to 180, then I think I'd do that. I invite other views on this, so please comment freely.

  3. As you say, PrePARE could be enhanced to check some of the coord. info., but this is not currently our highest priority.

  4. I noticed the following in your sample ncdump: a) vertices_longitude and vertices_latitude have standard_name = "vertices_longitude" and "vertices_latitude". This was a bug in CMOR3 that has just now been fixed; these variables do not need standard_names and the ones given were incorrect anyway. If you get the latest CMOR3 version, you'll correct this problem. b) Please check the following 2 global attributes: parent_time_units = "days since 1850-01-01" ; branch_time_in_parent = 0. ; These are supposed to make it easy to extract data from the piControl-spinup, which I'm guessing has different units and branch time from what is indicated here. I'm not sure whether you'll publish the spinup portion of the run, but if you do, will these units accurately reflect what's in those files?
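The recommendation in item 2 (longitudes within -180 to 360, limited to a single cycle, increasing where possible) could be checked with a sketch like the one below. The function name `check_curvilinear_longitudes`, the dictionary keys, and the sample arrays are hypothetical, and reading "a single cycle" as a total span of less than 360 degrees is an interpretation, not something stated in the thread. Since item 2 is a recommendation rather than a requirement, the result is a report, not a pass/fail verdict.

```python
def check_curvilinear_longitudes(lon2d):
    """Summarise a 2-D longitude array (list of rows) against item 2."""
    flat = [v for row in lon2d for v in row]
    lo, hi = min(flat), max(flat)
    # Count neighbouring pairs along each row where longitude decreases,
    # e.g. a jump from 359.5 to 0.5.
    decreases = sum(
        1 for row in lon2d for a, b in zip(row, row[1:]) if b < a
    )
    return {
        "within_range": lo >= -180.0 and hi <= 360.0,
        "single_cycle": hi - lo < 360.0,  # assumed reading of "single cycle"
        "decreases_along_row": decreases,
    }


# Hypothetical rows containing the 359.5 -> 0.5 jump discussed above.
with_jump = [[358.5, 359.5, 0.5, 1.5],
             [358.5, 359.5, 0.5, 1.5]]
# The same locations expressed in the -180 to 180 convention.
without_jump = [[-1.5, -0.5, 0.5, 1.5],
                [-1.5, -0.5, 0.5, 1.5]]

print(check_curvilinear_longitudes(with_jump)["decreases_along_row"])     # 2
print(check_curvilinear_longitudes(without_jump)["decreases_along_row"])  # 0
```

On a real tripolar grid, longitude may legitimately decrease in places (for example near the northern poles of the grid), so a non-zero count is only a prompt to look at the data.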

matthew-mizielinski commented 6 years ago

Thanks @taylor13,

Regarding point 2, I'm (very) happy with what you've said here and it makes processing of NEMO data (used by most European models) more straightforward, and will mirror how scientists currently work with data from these models making analysis easier.

However, I note that you've said "...there is no hard requirement at this time", which implies a small risk that this could change at a later time. Is there a procedure for agreeing this and setting down this requirement in one of the standards documents (e.g. the CMIP6 Output Grid Guidance)?

Regarding point 4a, these files are the same testing set that @piotrflorek and I used when raising PCMDI/CMOR#273 , so we'll pick up this fix when we next update CMOR.

Good catch on point 4b, I'll add this to my list of things to nail down before we start production (rather than test) processing the piControl simulation, as I hadn't previously worried much about the piControl-spinup run.

taylor13 commented 6 years ago

I didn't mean to imply that we might consider imposing a different requirement for CMIP6 data in the future. I should have said "I can find no requirements or recommendations in any CMIP5 or CMIP6 documents as to what range of values is acceptable." I will vet my recommendation (outlined in 2 above) with the WIP and, if there are no objections, include it in an output requirements document I'm preparing (which will be most easily met by using CMOR3).

matthew-mizielinski commented 6 years ago

Marvellous, I think this covers everything I was concerned about here.