SlideRuleEarth / sliderule-python

SlideRule Earth Example Notebooks: On-demand, cloud-based processing of satellite mission data (NASA ICESat-2, GEDI, ArcticDEM/REMA, HLS)
https://slideruleearth.io/rtd/
BSD 3-Clause "New" or "Revised" License

icesat2.toregion() refactoring and Polygon handling #41

Closed dshean closed 2 years ago

dshean commented 3 years ago

We need to revisit our Polygon handling at some point. We are inevitably going to run into issues as users bring their own files and Polygons. This is not a top priority, but here are some notes and thoughts that we can discuss at the next tagup.

At present, the user can manually define coordinates to create a region object in the expected SlideRule format, or they can pass a filename from disk (GeoJSON or shapefile only, not the full suite of file types supported by fiona/geopandas).

If a user has already opened the file, or has prepared a GeoDataFrame or Polygon geometry earlier in a notebook, there is no good way for them to submit a Polygon for a SlideRule query. Some refactoring of the icesat2.toregion() code could accommodate this, perhaps with 3 functions (file_toregion, gdf_toregion, geom_toregion) that can be called depending on the available input. Or a single function with logic to recognize the input type.
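The single-function option could look roughly like the sketch below. The name load_geometry is hypothetical and not part of the sliderule package. The GeoDataFrame branch is duck-typed (anything exposing .geometry and .crs) so that geopandas is only imported when a path is given. That is one possible design choice, not a description of the current implementation.

```python
import os

from shapely.geometry import Polygon
from shapely.geometry.base import BaseGeometry
from shapely.ops import unary_union


def load_geometry(source):
    """Return one Shapely geometry from a path, a GeoDataFrame-like object, or a geometry.

    Hypothetical helper for illustration; not part of the sliderule package.
    """
    if isinstance(source, (str, os.PathLike)):
        # Imported lazily so geometry-only callers do not need geopandas.
        import geopandas as gpd

        source = gpd.read_file(source)

    if isinstance(source, BaseGeometry):
        # Already a Shapely geometry: pass it through untouched.
        return source

    # Duck-typed GeoDataFrame check: it exposes both .geometry and .crs.
    if hasattr(source, "geometry") and hasattr(source, "crs"):
        # Merge all features into a single geometry.
        return unary_union(list(source.geometry))

    raise TypeError(f"unsupported region input: {type(source).__name__}")
```

A caller would then hand the returned geometry to whatever cleanup and conversion step follows, regardless of how the input arrived.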

We should check that the CRS of the input coordinates is our expected EPSG:4326 lat/lon. Easy if input is file or GeoDataFrame, more complicated when input is geometry object.
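For a file or GeoDataFrame, geopandas exposes the CRS as gdf.crs and can reproject with gdf.to_crs(epsg=4326). A bare geometry carries no CRS, so the best we can do there is a sanity check. The sketch below is only a heuristic with a hypothetical name (looks_like_lonlat): it can reject obviously projected coordinates, but it cannot prove the input is EPSG:4326.

```python
def looks_like_lonlat(coords):
    """Heuristic range check for (x, y) or (x, y, z) coordinate tuples.

    Returns True when every x fits in [-180, 180] and every y in [-90, 90].
    Passing this check does not guarantee the coordinates are EPSG:4326.
    """
    return all(
        -180.0 <= x <= 180.0 and -90.0 <= y <= 90.0
        for x, y, *_ in coords
    )


# Example inputs: a lon/lat ring and a ring in projected meters (UTM-like).
lonlat_ring = [(-108.3, 38.8), (-107.7, 38.8), (-107.7, 39.2), (-108.3, 38.8)]
projected_ring = [(500000.0, 4300000.0), (501000.0, 4300000.0), (501000.0, 4301000.0)]
```

With a Shapely Polygon, the coordinates to pass in would be geom.exterior.coords.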

We should support a rectangular bbox, such as the one provided by geometry.envelope or gdf.total_bounds.
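A minimal sketch of the bbox case, assuming the bounds arrive in the (minx, miny, maxx, maxy) order that gdf.total_bounds and geometry.bounds both use. The function name bounds_to_region is hypothetical, and the counter-clockwise vertex order is an assumption about what the server expects.

```python
def bounds_to_region(bounds):
    """Convert (minx, miny, maxx, maxy) into a closed list of lon/lat dicts.

    Hypothetical helper; vertices are emitted counter-clockwise (an assumption)
    and the first vertex is repeated at the end to close the ring.
    """
    # float() also normalizes numpy scalars such as those in gdf.total_bounds.
    minx, miny, maxx, maxy = (float(v) for v in bounds)
    return [
        {"lon": minx, "lat": miny},
        {"lon": maxx, "lat": miny},
        {"lon": maxx, "lat": maxy},
        {"lon": minx, "lat": maxy},
        {"lon": minx, "lat": miny},
    ]


region = bounds_to_region((-108.3, 38.8, -107.7, 39.2))
```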

I think we may want to use Shapely Polygon objects as our fundamental object (no matter what user provides, we end up with a single Polygon, then convert to SlideRule region). We are already relying on these objects in toregion, but we can clean up and isolate from the file opening code.

Winding can be determined with polygon.exterior.is_ccw, and orientation can be cleaned with shapely.geometry.polygon.orient() (https://shapely.readthedocs.io/en/stable/manual.html#shapely.geometry.polygon.orient). We can take advantage of other Polygon checks for closure, validity, simplicity, etc. Not sure if the buffer(0.0) trick, which we use in toregion, will take care of all of this.
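These checks could be combined along the following lines. This is a sketch with a hypothetical name (clean_polygon), not the current toregion code. It tries buffer(0.0) only when the input is invalid, then verifies the result explicitly instead of trusting the repair, and it assumes counter-clockwise is the orientation we want.

```python
from shapely.geometry import Polygon
from shapely.geometry.polygon import orient


def clean_polygon(poly):
    """Return a valid, counter-clockwise Polygon or raise ValueError.

    Hypothetical helper; counter-clockwise output is an assumption.
    """
    if not poly.is_valid:
        # buffer(0.0) repairs many invalid polygons, but the result may be
        # empty or a MultiPolygon, so it is checked again below.
        poly = poly.buffer(0.0)

    if poly.is_empty or poly.geom_type != "Polygon" or not poly.is_valid:
        raise ValueError("input could not be reduced to a single valid Polygon")

    if not poly.exterior.is_ccw:
        # sign=1.0 makes the exterior ring counter-clockwise.
        poly = orient(poly, sign=1.0)

    return poly


# A unit square with clockwise winding.
clockwise = Polygon([(0, 0), (0, 1), (1, 1), (1, 0)])
cleaned = clean_polygon(clockwise)
```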

When we have a Shapely geometry that we trust, we can then just have a Polygon_toregion() function to prepare the required region object for SlideRule. This really only requires one line:

region = [{"lon": coord[0], "lat": coord[1]} for coord in list(geom.exterior.coords)]

In our earlier discussions around these issues, we got bogged down with issues of complex geometry objects with interior holes and multipolygons. I think we want to limit the SlideRule query to the exterior ring of a single, simple Polygon geometry object. May require a unary_union operation if input is a GeoDataFrame. The user can do more complex filtering/masking with the returned GeoDataFrame if desired.
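As a sketch of that restriction: the function below keeps only the exterior ring, so interior holes are dropped. For a MultiPolygon it falls back to the convex hull, which is purely an assumption made here for illustration. Picking the largest part or raising an error would be equally defensible. The name polygon_to_region is hypothetical.

```python
from shapely.geometry import Polygon
from shapely.ops import unary_union


def polygon_to_region(geom):
    """Build a SlideRule-style region list from the exterior ring of one Polygon.

    Hypothetical helper. Holes are ignored; a MultiPolygon is replaced by its
    convex hull (an assumption, not a confirmed SlideRule behavior).
    """
    if geom.geom_type == "MultiPolygon":
        geom = geom.convex_hull
    if geom.geom_type != "Polygon":
        raise TypeError(f"expected a Polygon, got {geom.geom_type}")
    # *_ discards an optional z value on 3D coordinates.
    return [{"lon": x, "lat": y} for x, y, *_ in geom.exterior.coords]


# A square with a hole: only the outer ring is used.
outer = [(0, 0), (2, 0), (2, 2), (0, 2)]
hole = [(0.5, 0.5), (1.5, 0.5), (1.5, 1.5), (0.5, 1.5)]
holed_region = polygon_to_region(Polygon(outer, holes=[hole]))

# Two disjoint squares merged with unary_union give a MultiPolygon.
parts = [
    Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
    Polygon([(2, 0), (3, 0), (3, 1), (2, 1)]),
]
merged_region = polygon_to_region(unary_union(parts))
```

Any finer masking against the holes or the individual parts can then be done on the returned GeoDataFrame.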

jpswinski commented 2 years ago

The toregion API has been updated to support many (if not all) of the use cases described above. The only thing missing is that we do not yet support a simple bbox query. You have to supply it as a polygon.

Here are the features implemented: