cf-convention / cf-conventions


Require PROJ.4 compatible horizontal and vertical projection declarations. #187

Closed dblodgett-usgs closed 4 years ago

dblodgett-usgs commented 9 years ago

For any software to accurately interoperate with a geospatial dataset it must be given, or make an assumption about, the datum and projection used for the geospatial content. It is unacceptable to omit this information regardless of the scale or intended use of the data. Specification of the reference datum (horizontal and vertical) and projection (as applicable to the dimensionality of the data) should be a requirement akin to inclusion of units for coordinate variables. If the requirement for a dataset to include such metadata is considered too onerous for data producers who are unfamiliar with the datum their data uses, the CF community should adopt a default lat/lon/elevation datum and encourage software producers to standardize on that datum to foster consistency across the community. What default to use should be determined in consultation with the National Geodetic Survey (http://www.ngs.noaa.gov/GEOID/) and their counterparts internationally.

Proj.4 (https://trac.osgeo.org/proj/) has been the de facto implementation of coordinate transformations, more or less, since the beginning of digital geospatial data. The ability to integrate CF-described geospatially referenced data with tools that implement the Proj.4 projection libraries is important.

Conversion of geospatial data into CF-described files requires CF support for the prevailing set of projections (http://www.remotesensing.org/geotiff/proj_list/) and reference datums.

Use of identifiers from the EPSG naming authority and conventions consistent with OGC-WKT should be supported. The issue that forces this assertion is the need for 'shift grids' to convert to/from non-parametric datums. This is of particular importance for vertical datums (https://trac.osgeo.org/proj/wiki/VerticalDatums) but is also important for the common NADCON (http://www.ngs.noaa.gov/cgi-bin/nadcon.prl) conversion to/from the NAD27 datum.

In practice, codes defined by the EPSG naming authority, encoded either alone or as part of a WKT datum/projection declaration, are necessary for integration of data with web services and for conversion to and from other formats. Geospatial applications that desire to interoperate with CF should not be forced to construct utilities like this one: https://github.com/USGS-CIDA/geo-data-portal/blob/master/gdp-core-processing/src/main/java/gov/usgs/cida/gdp/coreprocessing/analysis/grid/CRSUtility.java. This leads to the conclusion that Proj.4 strings, EPSG codes, or WKT projections should be allowed for specification of projections.
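[Editor's note: as a rough sketch of the interoperability being asked for, the snippet below builds one CRS from each of the three identifier styles mentioned above and converts it to CF-style grid-mapping attributes. It uses the pyproj wrapper around PROJ, which is an assumption of this illustration, not something the issue itself references.]

```python
# Minimal sketch: the same horizontal CRS expressed as an EPSG code,
# a PROJ/Proj.4 string, and OGC WKT, using the pyproj library.
from pyproj import CRS

crs_epsg = CRS.from_epsg(4326)                                    # EPSG identifier
crs_proj = CRS.from_proj4("+proj=longlat +datum=WGS84 +no_defs")  # Proj.4 string
crs_wkt = CRS.from_wkt(crs_epsg.to_wkt())                         # OGC WKT round trip

# pyproj can also translate a CRS into CF grid-mapping style attributes,
# which is roughly the mapping between worlds this issue asks CF to guarantee.
print(crs_epsg.to_cf())   # e.g. {'grid_mapping_name': 'latitude_longitude', ...}
```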

rsignell-usgs commented 9 years ago

WKT is allowed in the yet-to-be-released CF 1.7 by specifying an attribute called crs_wkt on the grid_mapping variable. See CF ticket #69. This includes the ability to specify the vertical coordinate system via VERT_CS. See example 5.11 in the ticket.
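[Editor's note: a sketch of what the crs_wkt mechanism Rich describes can look like when writing a file with the netCDF4-python library. The file layout and the abbreviated WKT string are illustrative only, not taken from the ticket.]

```python
# Illustrative sketch: a grid_mapping variable carrying both the
# single-property CF attributes and a crs_wkt string (CF 1.7 style).
import netCDF4

nc = netCDF4.Dataset("example.nc", "w")
nc.createDimension("lat", 180)
nc.createDimension("lon", 360)

crs = nc.createVariable("crs", "i4")          # scalar container variable
crs.grid_mapping_name = "latitude_longitude"
crs.semi_major_axis = 6378137.0               # WGS84 ellipsoid parameters
crs.inverse_flattening = 298.257223563
crs.longitude_of_prime_meridian = 0.0
# Hypothetical, abbreviated WKT; a real file would carry the full string,
# including a VERT_CS block if a vertical datum is being declared.
crs.crs_wkt = 'GEOGCS["WGS 84",DATUM["WGS_1984",...]]'

temp = nc.createVariable("temperature", "f4", ("lat", "lon"))
temp.grid_mapping = "crs"                     # ties the data variable to the CRS
nc.close()
```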

dblodgett-usgs commented 9 years ago

That's a great start. I was aware that some progress was being made on that front, but I have a hard time keeping track of the trac email traffic.

That said, if a WKT declaration is going to remain optional, a default WKT string needs to be adopted.

JonathanGregory commented 9 years ago

This issue has often been discussed on the CF email list. We don't have a default because CF doesn't only apply to real-world data. In fact most of the data written according to CF is model data, and there is no applicable default from real-world geodesy. Climate models assume unrealistically shaped worlds.

Another difference is that the aim of CF is to provide self-explanatory, geophysically based metadata. This means CF has a different approach from EPSG and OGC, whose metadata is not self-contained or self-explanatory (i.e. you have to look up what it means somewhere else), and appears (to me, anyway) rather unclear in geophysical terms. Rich and I have recently been discussing this wrt the vertical datum in particular. This is a good example, because the various vertical datums are different types of geophysical surface (geopotential, geoid, ellipsoid, mean sea level etc.). These are all distinct for CF. We are working on clarifying the correspondence.

As Rich says, it has been agreed to allow CRS WKT to be stored in CF-netCDF files. At the same time, it was agreed that the information should also be provided in other CF grid_mapping attributes. Ticket 69 http://cf-trac.llnl.gov/trac/ticket/69 says

"The crs_wkt attribute is intended to act as a supplement to other single-property CF grid mapping attributes (as described in Appendix F); it is not intended to replace those attributes. If data producers omit the single-property grid mapping attributes in favour of the compound crs_wkt attribute, software which cannot interpret crs_wkt will be unable to use the grid_mapping information. Therefore the CRS should be described as thoroughly as possible with the single-property attributes as well as by crs_wkt."

In ticket 80 http://cf-trac.llnl.gov/trac/ticket/80, which has also been accepted, some new grid_mapping attributes have been added to support the description of horizontal coordinate systems. In ticket 118 http://cf-trac.llnl.gov/trac/ticket/118 we are working on adding another attribute for the vertical datum.

Best wishes

Jonathan

dblodgett-usgs commented 9 years ago

Thanks for the comments, Jonathan.

A couple responses.

We don't have a default because CF doesn't only apply to real-world data. In fact most of the data written according to CF is model data, and there is no applicable default from real-world geodesy. Climate models assume unrealistically shaped worlds.

This is precisely the reason that there SHOULD be a default or a requirement to declare the information. Climate modelers who are assuming unrealistically shaped worlds need to declare the shape they are using, or a software developer is forced to make an assumption that is very likely something 'realistic' and incorrect. By allowing the omission of this important metadata, implementers have a 'NULL pointer' situation where they have to hack in an assumption of some sort. CF rules should not allow a 'NULL pointer' situation to occur.

This means CF has a different approach from EPSG and OGC, whose metadata is not self-contained or self-explanatory (i.e. you have to look up what it means somewhere else), and appears (to me, anyway) rather unclear in geophysical terms.

I'm not sure CF has the option to be different here. Horizontal datums like NAD27 or vertical datums like NAVD88 require extensive tables of observationally derived data to fully specify them. Additionally, by declaring a projection 'name' you've forced an implementer to go look up the formulation of the projection math rather than declaring it in metadata, which would be impractical anyway.

I am in full support of inclusion of WKT projection declarations and am happy that they are being adopted. I want CF 2.0 to go one step further and require such information be included so that geographic software can (with no need for assumptions) read CF data. This means requiring the use of a PROJ.4 compatible projection declaration or adopting a CF default (clearly the less ideal solution).

graybeal commented 9 years ago

I'm going to be heretical here and offer the following:

  • Proposing a default in a newer version of CF does not break any data written with existing versions of CF.
  • If we are concerned about 'breaking' existing software by assuming a default, CF2 is the least painful place to break it.
  • I think without fixing this, CF itself is broken in the context of modern geospatial data standards.

In short, I find David's arguments compelling.

BobSimons commented 9 years ago

The related problem with creating a contentious default now (one that is correct for roughly half the existing datasets and wrong for the other half) is that it would apply to all existing datasets. So about half of the current datasets would suddenly be using the default incorrectly. Given that most metadata is stored in files which take effort to change (and modifying the files brings up other issues), these datasets would likely remain wrong for a long time or forever.


BobSimons commented 9 years ago

I withdraw my earlier comment. I see: if a dataset says it follows a pre-default version of CF, then the default doesn't apply. If a dataset says it follows the with-default version of CF, then the default applies.


JonathanGregory commented 9 years ago

CF tries not to impose unnecessary requirements on writers of data. I do not think it is reasonable or realistic to expect producers of model data to describe the figure of the Earth when it is not relevant, simply to avoid an inappropriate real-world default being applied. If the information is not supplied, the user of the data can assume whatever is convenient. It's not up to CF to make that decision. I am sure there are many real-world observational datasets as well where the precise frame of reference is not needed, because the data isn't that accurately located anyway.

Regarding vertical datums, I agree that some of them can only be identified by name, if it's a geoid or a geopotential surface for instance. Still, a name is more self-explanatory than a code, and therefore more consistent with the aim of self-describing datasets. Ellipsoids are defined by numbers, however, and there are grid_mapping attributes to do this.

If there are more grid_mapping attributes that are needed, beyond those already agreed in ticket 80, please propose them! That doesn't have to wait for CF-2, because it's not backwards-incompatible.

Jonathan

dblodgett-usgs commented 9 years ago

Jonathan,

The figure of the earth is always relevant. I am not sure how omitting metadata as a way to indicate the accuracy (or lack thereof) of data has become an accepted practice, but it is not OK. It creates significant uncertainty about (and mistrust of) CF data in communities that expect geospatial data to declare the datum used.

I'll take a different tack with this argument that may resonate in a different way.

Let's say one geospatial analyst takes a 1/8th degree (~12km) gridded dataset and assumes it uses a spherical earth according to NCEP's guidance (http://www.arl.noaa.gov/faq_ms1.php#Q7B). Another analyst assumes WGS 84. The grid cell edge can move by up to 4km. Say these two analysts need to compare their work. Say they see differences that have some impact on an important figure or a threshold that is significant. Because the data didn't declare what was actually used, these two analysts have a real problem. They need to do additional investigation to determine which one of them has made the correct assumption.
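[Editor's note: for a sense of scale, here is a back-of-envelope sketch (an editorial addition, not from the thread) of the classic geodetic-versus-geocentric latitude mismatch; the exact shift for any particular dataset, such as the ~4 km figure above, depends on how its coordinates were produced.]

```python
# How far apart can the "same" latitude value land when it is interpreted as
# geocentric (spherical-earth) latitude versus geodetic WGS84 latitude?
import math

f = 1 / 298.257223563          # WGS84 flattening
e2 = f * (2 - f)               # first eccentricity squared

def geocentric_from_geodetic(lat_deg):
    """Geocentric latitude corresponding to a WGS84 geodetic latitude."""
    phi = math.radians(lat_deg)
    return math.degrees(math.atan((1 - e2) * math.tan(phi)))

lat = 45.0
offset_deg = lat - geocentric_from_geodetic(lat)
offset_km = offset_deg * 111.3                       # ~111.3 km per degree of latitude
print(f"{offset_deg:.3f} deg  ~ {offset_km:.1f} km")  # roughly 0.19 deg, ~21 km
```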

This is a situation I've found myself in repeatedly. It becomes circular and what you realize is that serious mistakes have been made again and again due to the disregard for this important metadata documenting the basic shape assumed for the lat/lon datum.

There is an example that illustrates how this general disregard can become significant, written up here: http://pubs.usgs.gov/fs/2013/3035/. I can expound on this with further examples, but will leave this here for now.

bnlawrence commented 9 years ago

Hi David

Both of your analysts should consider the difference between precision and accuracy. In this example, the actual accuracy of the data is of order 50-100 km (for a standard model, between 4x and 8x the grid length). Any differences you get from regridding or reprojection could only be small compared with anything meaningful, unless up against orography, and then you have a different set of concerns (the model will be rubbish there anyway).

There is no meaningful datum associated with that data, and if you were comparing with obs, either of your analysts would be "right" to do what they did.

(That's a modellers answer, observational data should almost certainly come with a datum).

Bryan


dblodgett-usgs commented 9 years ago

This argument is conflating a number of unrelated implementation details that need to be left out of a data standard. You can't introduce nuances of model physics when talking about data structure. For example, the assertion doesn't hold up for situations like gridded representations of historical weather.

I appreciate the difference between precision and accuracy, but the theoretical data in question has no statement about the precision of its coordinate variables and declares them as floating point values; that is the precision of the data as far as any user is concerned. A more appropriate way to specify the low precision of some gridded data would be to decrease the precision of the coordinate variable values or to make a more explicit statement about the precision. Making assumptions about datums causes structural bias in the 'low precision' spatial data, not the desired random error or explicit uncertainty.

Is there a proposal to allow specification of the intended precision of coordinate variables in CF? If there is, it would need to be in addition to the requirement to specify the datum assumed for lat/lon/elev coordinate variables to avoid the structural bias inherent in these sorts of assumptions.

(That's a modellers answer, observational data should almost certainly come with a datum).

What happens when the uncertainty in model grid cell location presents significant difficulty for model result inter-comparison with observations and other models?

bnlawrence commented 9 years ago

Hi David

The issue of what a grid means to a model is far from unrelated, and is the heart of the problem.

I've had - and watched - this discussion a few times, and it's pretty obvious that every time the proponents (you and me in this instance) talk past each other. In trying to understand why, the best analogy is linguistic, in two ways: some concepts are simply not important in other languages, and we can have "false friends" (in the Spanish/French sense) ... and this is one of them. A grid is not a grid :-)

A model grid - in many cases - is simply not directly analogous to a datum referenced coordinate system.

In the case of a model, we're dealing (mostly) with a continuous mathematical function that we sample at points, representative of the value over a grid box (so the CF cell bounds ought to be an important part of this conversation, but are not, because they generally have no directly comparable analogue in a real-world sampling situation, which is clearly discontinuous). So, I'm not talking about error in the data per se, or uncertainty; I'm talking about the fact that we're often using a point as an indicator of a function value over a large area (much larger than the inter-grid spacing). Add in the fact that much (climate) model data has a calendar that does not represent the real world ... and one can easily over-interpret the meaning of the horizontal grid.

None of this is to suggest that real-world datums are unimportant in many real-world cases. Perhaps the right answer here, to meet your objections, could be that the default is "Not Applicable" ... and in any case where the data originator thinks the differences you identify could be important, they ought to provide a proper datum. It's certainly the case that any differences that arise from regridding should not be significant, and are likely to be much smaller than those that arise from inter-ensemble member differences ... (whatever the model ensemble axis, if there is one; if there isn't one then most of this argument is pointless).

Bryan


dblodgett-usgs commented 9 years ago

Bryan,

This is very helpful, thanks for laying it out this way. I actually wouldn't say we are talking past each other; we just have somewhat orthogonal points of view. My concern is rooted in interdisciplinary model coupling as well as in unambiguous representation of data, such that people recognize the inherent complexity in the information and don't struggle to find information they expect. Your perspective, rooted in model grids which discretize a continuous function space, is understood (I think) and I appreciate what you are saying.

I will have to think about this a bit.

If we are going to insist that a lat/lon coordinate variable can be valid without a datum declaration, we are really talking about a different concept altogether. It should be model_lat/model_lon instead of lat/lon. In my definition of lat/lon, the concept is not fully defined with an undefined ellipse and prime meridian. In your definition, there is no explicit tie to the earth and the grid cells have unitless volume? If there is a tie to the earth and the cell volumes have units, then there must be an assumed prime meridian in Greenwich and an assumed sphere roughly the diameter of the real earth? In that case, we are actually talking about the same concepts, and I start to have a lot of trouble with the argument that these things can be left unsaid.

MaartenSneepKNMI commented 9 years ago

I think I agree with David. The ability to precisely specify the real-world coordinates of observational data (datum et al. included) is missing from CF. Perhaps not surprising given the origin of CF, but CF is gaining traction in other geophysical disciplines as well. If only for that reason, I think specification of the reference coordinate system is required.

rsignell-usgs commented 9 years ago

Folks,

I understand that it's a tough sell to convince climate modelers to include metadata that they themselves don't need.

But being reasonable people, I'm sure they can appreciate that it's difficult for generic software (and humans) to differentiate between cases where: (1) the datum is missing because the model dataset is at low enough resolution, or inaccurate enough, that the datum is not considered important, and therefore simply assigning some datum is okay; or (2) the datum is just missing, and simply assigning some datum might give the wrong result.

It really wouldn't be that hard to add this metadata, in many cases just a spherical earth with a specified radius. I think CF-2.0 should require it.
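[Editor's note: a sketch of the minimal declaration Rich describes, written with the netCDF4-python library. The 6371 km radius and the "air_temperature" variable name are illustrative assumptions, not values from the thread.]

```python
# Minimal datum declaration: a spherical earth with an explicit radius.
import netCDF4

nc = netCDF4.Dataset("model_output.nc", "a")   # append to an existing (hypothetical) file
crs = nc.createVariable("crs", "i4")
crs.grid_mapping_name = "latitude_longitude"
crs.earth_radius = 6371000.0                   # metres; spherical earth assumed by the model
nc["air_temperature"].grid_mapping = "crs"     # hypothetical data variable name
nc.close()
```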


graybeal commented 9 years ago

If there is no way currently to indicate that the datum would be irrelevant (or that it is unknown, if that's a sane use case), can we add a way to indicate that? Or is that crazy talk?

John


JonathanGregory commented 9 years ago

Dear all

I can't agree that CF should demand a specification of the figure of the Earth. If this information is irrelevant, data-writers should not be expected to provide it. Such a requirement would not serve the purpose, because either it would not be done at all (leading to non-compliant datasets) or people would just put something irrelevant in there to meet the requirement formally (leading to incorrect datasets).

I agree with Bryan that the default is "not applicable". That is, if the data-writer does not provide the information, they are declaring that it is not needed for any purpose they intend the dataset to be used for. We could state this as the default in the CF convention.

Deliberate imprecision is supported by CF. David is right to say that, in a geodetically precise sense, model latitude and real-world latitude are not the same thing. However, the main purpose of CF is to provide metadata to enable comparison of datasets. Users want to compare model and observational datasets. For that extremely important purpose, we regard the model world and the real world as the same planet. They use comparable coordinate systems. In detail they are different, but we want to facilitate comparison by pointing out the similarities.

Best wishes

Jonathan

dblodgett-usgs commented 9 years ago

The information is never irrelevant unless you are ONLY using spatial units of degrees and you do not care where the prime-meridian is. If the oceans and continents are in your model, then you are using a datum.

To put this another way, 'not applicable' would mean I could say the earth radius is 500km and the prime meridian is in Chicago and be ok. Or do I actually need to know that the prime meridian is near Greenwich and that the earth radius is about 6000km?

Given that the latter is the case, a proposal that I think might balance concerns here:

If datum information for latitude-longitude coordinate variables is not declared, software implementers should assume that the prime meridian (0 longitude) is in Greenwich and the reference ellipsoid is any earth-like ellipsoid. In order to foster consistency across the community, GIS software that implements the WGS84 datum should assume that as a default. Software that does not implement the WGS84 datum is free to assume any reference ellipsoid meant to represent the earth.
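[Editor's note: a sketch of how a data reader might apply the fallback proposed above: use the declared grid mapping when present, otherwise assume WGS84/Greenwich. It assumes the netCDF4 and pyproj libraries; the function and variable names are illustrative, not part of CF.]

```python
# Reader-side fallback logic for the proposed default (illustrative only).
import netCDF4
from pyproj import CRS

def horizontal_crs_for(nc, var_name):
    var = nc[var_name]
    gm_name = getattr(var, "grid_mapping", None)
    if gm_name is not None and gm_name in nc.variables:
        gm = nc[gm_name]
        wkt = getattr(gm, "crs_wkt", None)
        if wkt is not None:
            return CRS.from_wkt(wkt)            # full WKT declaration wins
        return CRS.from_cf(gm.__dict__)         # else build from single-property attributes
    return CRS.from_epsg(4326)                  # proposed default: WGS84 / Greenwich

with netCDF4.Dataset("example.nc") as nc:       # hypothetical file and variable
    print(horizontal_crs_for(nc, "temperature"))
```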

graybeal commented 9 years ago

Putting these two together (Jonathan's "not applicable" default and David's proposed fallback) satisfies my desire for a meaningful default.

It might be particularly helpful if the Greenwich/WGS84 datum is offered as an example specification, so that providers who want to be good citizens can include it easily.

While at some point a more general spec might be needed (that is, to allow for a prime meridian anywhere, or an arbitrarily sized planet), I can't imagine a valid CF use case for that. It would probably be a different standard!

John


bnlawrence commented 9 years ago

While at some point a more general spec might be needed (that is, to allow for a prime meridian anywhere, or an arbitrarily sized planet), I can't imagine a valid CF use case for that. It would probably be a different standard!

... we do have Martian atmosphere data in CF at BADC ... :-)

Bryan


dblodgett-usgs commented 9 years ago

The existing CF spec does allow specification of an arbitrary semi_major_axis and semi_minor_axis. The issue is with longitude_of_prime_meridian, which is a longitude relative to Greenwich. Fortunately, there is a similar prime meridian on Mars.

If we assume that this point is equivalent to Greenwich, then using a latitude_longitude grid mapping with the semi_major_axis, semi_minor_axis, and longitude_of_prime_meridian would fully specify what the latitude and longitude coordinate variables mean on Mars.
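[Editor's note: a sketch of such a declaration, written with the netCDF4-python library. The radii are approximate IAU values for Mars and are added purely for illustration; the file name is hypothetical.]

```python
# A latitude_longitude grid mapping with an explicit ellipsoid can describe
# a non-Earth planet.
import netCDF4

nc = netCDF4.Dataset("mars_atmosphere.nc", "a")   # hypothetical file
crs = nc.createVariable("crs", "i4")
crs.grid_mapping_name = "latitude_longitude"
crs.semi_major_axis = 3396190.0                   # ~Mars equatorial radius, metres
crs.semi_minor_axis = 3376200.0                   # ~Mars polar radius, metres
crs.longitude_of_prime_meridian = 0.0             # crater Airy-0 treated as "Greenwich"
nc.close()
```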

JonathanGregory commented 9 years ago

Yes, I think it would be fine to say that the prime meridian is Greenwich if not stated. In fact, until recently I was not aware that longitude could mean anything other than with respect to Greenwich, i.e. I assumed that was implied by the definition of longitude itself. The question never came up for at least the first decade of CF, so I presume this is a common assumption of climate/ocean/atmosphere scientists. I think this is a change we could make immediately; it's not backwards-incompatible.

I'm not really comfortable with stating that there is an Earth-like vertical datum. It might not be the Earth, as Bryan says, and for many purposes you could use a Martian dataset as a function of latitude and longitude without knowing the shape of Mars. If the data-writer does not provide this information, but the data-user needs the information, they can make any convenient and relevant assumption. I don't think it's up to CF to provide information if it isn't provided by the data-writer. I see that as different from the longitude question because you cannot use longitude at all if you don't know where its zero is, but you can use lat-lon data perfectly satisfactorily for many purposes without knowing the precise shape of the planet.

Jonathan

chris-little commented 9 years ago

I apologise about coercing this Github datum discussion into a temporal datum discussion, but I do not know how else to place a marker in the CF2/Github debate as well as the CF 1.7 arena. CL

Dear CF Land,

I have been avidly reading and lurking on this debate, and thought it would be useful to state what we have done, and intend to do, in the OGC Temporal Domain WG, which is three things:

  1. We have written an incomplete, draft Best Practice for temporal aspects of geo-spatial data.
  2. We have registered several temporal coordinate reference systems under the aegis of the OGC Naming Authority. These are resolvable via structured, sort of human readable, URIs.
  3. We are establishing a Standards Working Group to register a couple of calendars in a related, but separate registry branch of URI naming structure. Namely, the 360 day year and the 365 day year. These are purely for labelling. There will be no attempt to standardise conversions, though this would undoubtedly be a ‘good thing’.

We are trying to educate the OGC community not to make the categorical error of assuming that CRSs, Calendars and Notations are the same thing. They are three different things:

a. A CRS has a monotonic number line, an origin (epoch), and normal (real) arithmetic.

b. A Calendar does not have normal arithmetic. A very simple example is the count of years of the current era (CE and BCE, originally AD and BC): no year zero, so abnormal arithmetic.

c. Many Notations can be invented that are fully ordered and monotonic, though not necessarily regular, but make no assumptions about durations, arithmetic, calendars, CRSs etc. ISO8601, IETF RFC3339 and similar contain examples. That they look like CRSs or Calendars is part of the problem.

There are a couple of logical quirks:

A. How to specify the epoch - this is a bit chicken and egg, but we all know the egg came first (ask me later over a beer unless you are a creationist).

B. The 360 day year could be viewed both as a calendar AND a CRS as the units are all well behaved and use normal arithmetic - 1yr = 12mon = 360d = 360x24hr = 360x24x60x60s
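[Editor's note: a small sketch of that "well behaved arithmetic" point, assuming the cftime library; the dates chosen are illustrative.]

```python
# In the 360_day calendar every month has 30 days, so "1 year = 360 days"
# is exact arithmetic, unlike the real-world calendar.
from datetime import timedelta
import cftime

t0 = cftime.Datetime360Day(2000, 1, 1)
print(t0 + timedelta(days=360))     # lands on 2001-01-01 in the 360_day calendar

# The same offset applied to a real-world (proleptic Gregorian) date does not
# land on the same month/day, because year lengths vary.
g0 = cftime.DatetimeProlepticGregorian(2000, 1, 1)
print(g0 + timedelta(days=360))     # lands on 2000-12-26 (2000 is a leap year)
```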

I wish you success in choosing your conventions and balancing backward compatibility and moving forward.

Chris



chris-little commented 9 years ago

Chris,

The separation into CRS, Calendar, and Notation is excellent! Are you taking the approach that a time system such as UTC constitutes part of a calendar? In your terms am I right in thinking that International Atomic Time (TAI) and GPS time would be CRSs, each coupled with a no-leapsecond calendar and the standard (yyyy-mm-dd hh:mm:ss.sss) notation? And that UTC would be, in essence the TAI CRS coupled with the UTC leapsecond calendar and the standard time notation?

Grace and peace,

Jim



chris-little commented 9 years ago

Jim,

Good question!

Actually, there are two other categories of fundamental things in the OGC draft Best Practice doc:

  1.  An events regime (e.g. tree rings, ice cores, archaeological layers, king lists), where partial or full ordering can be deduced, but there is no actual ‘measure’ of time. The Allen algebraic operators can be applied (before, during, after, etc) but no subtraction or addition!
  2.  A TimeScale, such as TAI, where there are only ticks and an integer count from an origin/epoch (tick 0). Very physical. This is also the logical basis for relativistic time. Adding and subtracting of integers allowed.

A CRS is a timescale that has been interpolated between ticks and extrapolated backwards and forwards using normal real arithmetic.

In these terms, I suppose I would posit that UTC, as defined, is TAI converted to a CRS with one-second ticks, plus a calendar (epoch displaced by quite a few seconds for solar and GPS alignment, Gregorian algorithms, IERS leap seconds).

So I think we agree, with perhaps a little terminology to be refined and thrashed out. Of course, we have also defined the SI second in there somewhere.

Chris


dblodgett-usgs commented 4 years ago

Moving this here for posterity. I think it was fixed in: http://cfconventions.org/cf-conventions/cf-conventions.html#use-of-the-crs-well-known-text-format