gjoseph92 / stackstac

Turn a STAC catalog into a dask-based xarray
https://stackstac.readthedocs.io
MIT License
238 stars 49 forks source link

Support `proj:wkt2` if `proj:epsg` is null #67

Open matthewhanson opened 3 years ago

matthewhanson commented 3 years ago

As an example, see this Landsat Albers scene from the Landsat Collection 2 STAC API:

https://ibhoyw8md9.execute-api.us-west-2.amazonaws.com/prod/collections/landsat-c2l2alb-sr/items/LC08_L2SR_082024_20210806_20210811_02_A1_SR

I hacked stackstac to accept any CRS string in the epsg keyword rather than requiring an EPSG code which is basically just handing it along here: https://github.com/gjoseph92/stackstac/blob/c431e091690d6b8af7eaba544af13bf8ebf5ebe6/stackstac/prepare.py#L515

Would be good to follow the same logic here that the new GDAL STAC Item driver does: https://github.com/OSGeo/gdal/pull/4138

which allows for: image

Rather than drop the epsg code may be better to add deprecation warning, and add a new crs keyword. When parsing the STAC it can then construct the right CRS based on what is available for the Item/Asset.

@gjoseph92 I'm happy to issue a PR for this, lmk if I should, don't want to duplicate work if you had already been working on this.

gjoseph92 commented 3 years ago

I was intentional to only use epsg at first, because I really didn't want to get into the mess of CRS representations right out of the gate. I was hoping the rest of the community might come to consensus around how to represent CRSs in the xarray data model (see https://github.com/gjoseph92/stackstac/issues/50), but that still hasn't happened. The rioxarray model is complete but cumbersome and hard to understand for users. A plain crs field that can hold any type (EPSG code, proj string, wkt, wkt2, projjson, etc.) is probably where we'll end up, though I'm a little sad about it. I think xarray flexible indexes will be the real answer.

I think there are a number of places where we'd have to change logic to switch epsg to crs. I think if you start in geom_utils.py, update signatures to take crs instead of epsg, and then refactor everything calling those functions, it might work.

I'm open to a PR doing this, but I'd also like to hear your thoughts on the design. CRSs are a mess that I'm hesitant to wade into without a good plan.

As I've said in other places, my extremely strong preference would be for some other library (geoxarray?) to define the data model and provide all the utility functions (like our current geom_utils.py) for working with it, then have stackstac, rioxarray, etc. just deal with the IO and generate xarrays that conform with that data model.

glaroc commented 2 years ago

This is an important issue for us. Some reference systems that are frequently used are not defined by an EPSG number such as all those defined with an ESRI: code (https://spatialreference.org/ref/esri/). EPSG doesn't have, for example, an equal area projection for North or South America.

RichardScottOZ commented 2 years ago

Yes, does ESRI:102008 work? Never had cause to use this one.

glaroc commented 2 years ago

I think only EPSG codes are recognized by stackstac, so I wouldn't know how to enter an ESRI: code. Just entering 102008 doesn't work.