geopython / pygeoapi

pygeoapi is a Python server implementation of the OGC API suite of standards. The project emerged as part of the next generation OGC API efforts in 2018 and provides the capability for organizations to deploy a RESTful OGC API endpoint using OpenAPI, GeoJSON, and HTML. pygeoapi is open source and released under an MIT license.
https://pygeoapi.io
MIT License

CRS handling in xarray provider properties #1641

Closed: sjordan29 closed this 3 weeks ago

sjordan29 commented 5 months ago

Overview

As discussed in #1578, several of my EDR datasets failed to build because their crs variable lacked the specific attributes the xarray provider expects.

This solution adds a storage_crs variable/attribute to the XarrayProvider -- I named it that based on CRS handling in the features provider, but am open to other naming schemes if you think something else is more appropriate. I had considered using the crs under extents in the config, but could see an instance where the extents are in WGS84 while the data itself are projected to some other coordinate system.

storage_crs is defined with the added _parse_storage_crs method, which does the following:

  1. Looks to the config file for a storage_crs. As noted in the documentation change, this value in the config could be anything accepted by pyproj.CRS.from_user_input. If no storage_crs is found in the config, we attempt to parse the crs information directly from the zarr/netCDF file with the following steps.
  2. Start by checking whether the xarray dataset is CF-compliant (see the CF conventions for details on grid mappings). This code looks for a grid mapping attribute on the spatiotemporal variables. If there is one, we parse the crs information from the variable with that name in the xarray dataset using pyproj.CRS.from_cf.
  3. If a grid mapping does not exist, but a variable named crs does, then we will attempt to parse the crs information from the attributes of that variable using pyproj.CRS.from_dict.
  4. If all else fails, assume a default storage CRS of http://www.opengis.net/def/crs/OGC/1.3/CRS84.
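
For illustration only, here is a minimal sketch of the fallback order above. It is not the code in this PR: the names resolve_crs_source and to_pyproj_crs are hypothetical, the dataset is modeled as a plain mapping of variable name to attributes (with xarray, something like {name: dict(var.attrs) for name, var in ds.variables.items()}), and it checks every variable for a grid_mapping attribute rather than only the spatiotemporal ones. It also assumes grid_mapping holds a single variable name.

```python
from typing import Any, Mapping, Tuple

# Default from step 4 of the list above.
DEFAULT_STORAGE_CRS = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"


def resolve_crs_source(
    provider_def: Mapping[str, Any],
    variables: Mapping[str, Mapping[str, Any]],
) -> Tuple[str, Any]:
    """Return (source, payload) following the four-step fallback.

    Hypothetical helper: `variables` maps variable name -> attribute dict.
    """
    # 1. An explicit storage_crs in the provider config wins.
    configured = provider_def.get("storage_crs")
    if configured is not None:
        return "config", configured

    # 2. CF-style grid mapping: a data variable names the variable
    #    that carries the CRS attributes.
    for attrs in variables.values():
        name = attrs.get("grid_mapping")
        if name in variables:
            return "grid_mapping", dict(variables[name])

    # 3. No grid mapping, but a variable literally named "crs".
    if "crs" in variables:
        return "crs_variable", dict(variables["crs"])

    # 4. Nothing usable found: fall back to CRS84.
    return "default", DEFAULT_STORAGE_CRS


def to_pyproj_crs(source: str, payload: Any):
    """Turn a resolved source into a pyproj CRS (requires pyproj)."""
    from pyproj import CRS  # third-party, imported lazily

    if source == "grid_mapping":
        return CRS.from_cf(payload)
    if source == "crs_variable":
        return CRS.from_dict(payload)
    if source == "default":
        # Assumption: use the authority-code form of the CRS84 URI.
        return CRS.from_user_input("OGC:CRS84")
    return CRS.from_user_input(payload)
```

Separating "which source applies" from "how to parse it" keeps the precedence easy to test without any files; for example, resolve_crs_source({}, {"temp": {}}) falls through to the default.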

In this fix, we only use storage_crs to define properties for the dataset, including bbox_crs, inverse_flattening, and crs_type. In the future, I'd envision storage_crs being used much like it is for features, or like crs_src in rasterio_.py, to support a CRS query parameter.

I'd appreciate any feedback on additional widely used cases we should consider for CRS handling in the xarray provider, and/or changes that would make this more helpful for future EDR development (e.g., #1104).

Related Issue / discussion

Addresses #1578. Will help pave a way forward for #1104.

Dependency policy (RFC2)

Updates to public demo

The netCDF file in the config above has neither a crs variable nor any grid mapping information, so the changes would assume WGS84 lat/lon.

Contributions and licensing

(as per https://github.com/geopython/pygeoapi/blob/master/CONTRIBUTING.md#contributions-and-licensing)