SlideRuleEarth / sliderule

Server and client framework for on-demand science data processing in the cloud
https://slideruleearth.io
Other
27 stars 12 forks source link

sliderule: update to EPSG:7912, use 3D Point geometry #272

Closed dshean closed 1 year ago

dshean commented 1 year ago

Addresses some of the issues in https://github.com/ICESat2-SlideRule/sliderule/issues/198

After some experimentation with latest PROJ (9.2.1) and GeoPandas, it appears that if we include 3D Point Geometries, we can do the 3D CRS transformations using the to_crs() functionality, since we now assign EPSG:7912 (3D CRS for ITRF2014 realization).

This PR updates the default CRS for the Python client output (was inconsistent with what is returned by the server-side GeoRaster code). Also updated a few comments (EPSG:4326 is not MERCATOR), and fixed a typo.

There may be some unforeseen gotchas with the 3D Points, and we are preserving a redundant copy of the h_mean attribute as the Z coordinate in the Point, but I think this is acceptable. We don't want to drop the h_mean column. An alternative would be to allow the user to specify whether they want a 2D or 3D Point geometry column for the GeoDataFrame.

dshean commented 1 year ago

I may have missed other instances where the CRS is specified. I see the following: https://github.com/search?q=repo%3AICESat2-SlideRule%2Fsliderule+SLIDERULE_EPSG&type=code

dshean commented 1 year ago

OK, tests failed. I assumed h_mean would be in the columns

jpswinski commented 1 year ago

@dshean I made the necessary updates to the Python client so that the elevation column of each of the datasets could be used.

But, I am seeing that the runs are taking much longer. A 30 second ATL06-SR request, with the 3D transformation, takes a 150 seconds. I think there may be a lot of use cases (like the web-client), where the added time may not be needed and will artificially slow things down. What do you think?

My recommendation is to have this be a user-enabled feature. When the user specifies the height_key in their request, it will trigger the client using it to perform the 3D transformation. That is the behavior as of the last commit.

scottyhq commented 1 year ago

If there is a significant performance hit that might merit keeping the 2D geometry by default. Here is a link to some demos doing client-side promotion to 3D, which works well for small datasets, but probably not for large geoparquet outputs (https://github.com/ICESAT-2HackWeek/3D_CRS_Transformation_Resources/blob/main/geopandas3D.ipynb). for the api, matching geopandas and shapely defaults where possible makes sense to me. For example, maybe sliderule requests set include_z=False by default but setting to True triggers 3D geometries returned from the server (https://shapely.readthedocs.io/en/stable/reference/shapely.get_coordinates.html)