JuliaGeo / CommonDataModel.jl

A Data Model for NetCDF, GRIB, Zarr and geoTIFF data sets
MIT License
11 stars 4 forks source link

CRS #19

Open rafaqz opened 9 months ago

rafaqz commented 9 months ago

In both GRIBB and NetCDF there is a need to extract CRS data. e.g:

https://discourse.julialang.org/t/extracting-projection-from-grib-files/109818/3

It would be good if CommonDataModel.jl handled this, probably adding methods to GeoInterface.crs as function to retrieve it.

I am aware of the difficulty (and in edge-cases actual impossibility) of converting CF standard crs data to Well Known Text or Proj. But we should try. Currently the workaround is to load and call ArchGDAL.jl or GMT.jl to get the crs manually, but an additional large binary dependency for this small task is less than ideal.

I made an attempt at starting this a few years ago: https://github.com/rafaqz/Rasters.jl/blob/netcdf_crs/src/sources/ncdatasets.jl#L407-L508

But really something like that should live here or in a package similar to CFTime.jl - like CFCRS.jl... that CommonDataModel.jl could also depend on.

This is a good guideline for an implementation: https://trac.osgeo.org/gdal/wiki/NetCDF_ProjectionTestingStatus

(notice even GDAL has an incomplete implementation - but it should take care of 99.9% of the use cases)

One stop-gap solution is to add an extension for ArchGDAL here and just call it on the files when GeoInterface.crs is called on any AbstractVariable.

Alexander-Barth commented 9 months ago

Yes, I agree that this would be very useful.

I have a slight preference for a separate package ("light on dependencies") like CFCRS.jl or CFCoordinateReferenceSystem.jl... that CommonDataModel.jl could depend on (either a normal or weak dependency). I am not too familiar with coordinate systems, so I guess that a small package with a narrow focus would be easier for other users to contribute to.

As far as I know, Well Know Text is an OGC standard and we have already the attribute crs_wkt in the CF conventions. I would assume that this should have a priority.

In TIFFDatasets.jl there is also some code that could be useful, where ArchGDAL is used to make the projection information accessible using the CF convention:

https://github.com/Alexander-Barth/TIFFDatasets.jl/blob/c9205ae46b24f1e8dbe8f774dd44823f868c1bf3/src/TIFFDatasets.jl#L72

If such a TIFF file is loaded using TIFFDatasets.jl, there is this "pseudo" variable crs with the following attribute created.

[...]
  crs
    Attributes:
     grid_mapping_name    = transverse_mercator
     longitude_of_central_meridian = 105.0
     false_easting        = 500000.0
     false_northing       = 1.0e7
     latitude_of_projection_origin = 0.0
     scale_factor_at_central_meridian = 0.9996
     longitude_of_prime_meridian = 0.0
     semi_major_axis      = 6.378137e6
     inverse_flattening   = 298.257223563
     crs_wkt              = PROJCS["WGS 84 / UTM zone 48S",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",105],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",10000000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","32748"]]
     GeoTransform         = 706740.0 10.0 0.0 9.34096e6 0.0 -10.0

  band1   (256 × 256)
    Datatype:    Float32
    Dimensions:  cols × rows
    Attributes:
     grid_mapping         = crs

(so far only transverse_mercator is recognized :-)) This variable is linked to the data variable band1 via the grid_mapping attribute.

It would be quite helpful for me if somebody can show me how the WKT string can be used to initialise a GeoFormatTypes.CoordinateReferenceSystemFormat object returned by GeoInterface.crs (using ArchGDAL or not) for a raster and then computing the coordinates x/y (or lon/lat) for a given pixel of the raster.

I am a bit confused how this works in GeoArrays currently:

using GeoInterface, GeoFormatTypes, GeoArrays
fn = download("https://github.com/yeesian/ArchGDALDatasets/blob/master/data/utmsmall.tif?raw=true");
geoarray = GeoArrays.read(fn);
GeoInterface.crs(geoarray) isa GeoFormatTypes.CoordinateReferenceSystemFormat
# returns false
# expected true from the docs of GeoInterface.crs

Maybe @evetion has an idea what I am doing wrong?

evetion commented 9 months ago

Can't check it myself right now, but it might just return nothing?

Alexander-Barth commented 9 months ago

In fact, GeoInterface.crs retuns a WellKnownText for this example:

julia> GeoInterface.crs(geoarray)
WellKnownText{GeoFormatTypes.CRS}(GeoFormatTypes.CRS(), "PROJCS[\"NAD27 / UTM zone 11N\",GEOGCS[\"NAD27\",DATUM[\"North_American_Datum_1927\",SPHEROID[\"Clarke 1866\",6378206.4,294.978698213898,AUTHORITY[\"EPSG\",\"7008\"]],AUTHORITY[\"EPSG\",\"6267\"]],PRIMEM[\"Greenwich\",0],UNIT[\"degree\",0.0174532925199433,AUTHORITY[\"EPSG\",\"9122\"]],AUTHORITY[\"EPSG\",\"4267\"]],PROJECTION[\"Transverse_Mercator\"],PARAMETER[\"latitude_of_origin\",0],PARAMETER[\"central_meridian\",-117],PARAMETER[\"scale_factor\",0.9996],PARAMETER[\"false_easting\",500000],PARAMETER[\"false_northing\",0],UNIT[\"metre\",1,AUTHORITY[\"EPSG\",\"9001\"]],AXIS[\"Easting\",EAST],AXIS[\"Northing\",NORTH],AUTHORITY[\"EPSG\",\"26711\"]]")
rafaqz commented 9 months ago

Ok the confusion is that WellKnownText <: MixedFormat because CRS and Geometry can be represented in the same format.

Mostly that doesnt matter, we mark it as MixedFormat{CRS} when we know thats what it is, which is nearly always.

asinghvi17 commented 1 month ago

According to the CF conventions, crs_wkt is a fallback, and absolute priority goes to CF attributes if they exist. If not, then one checks crs_wkt, spatial_ref, proj4text, spatial_epsg, ...