Closed benbovy closed 1 month ago
assign_centers
or assign_centroids
?
assign_boundaries
or assign_polygons
? ("boundaries" may rather mean only the boundaries, not the whole polygon entity)
Example with H3: https://medium.com/@jesse.b.nestler/how-to-convert-h3-cell-boundaries-to-shapely-polygons-in-python-f7558add2f63 (not sure it uses shapely's 2.0 vectorized geometry creation functions, though).
After thinking more about it, I would lean towards Dataset.dggs.extract_*
methods that would return one or more DataArrays. This is more flexible/composable. For assigning it as new coordinates, we could just do:
ds.assign_coords(
cell_center=ds.dggs.extract_cell_centroids(),
cell_bounds=ds.dggs.extract_cell_boundaries(),
)
if we go with that API, you could even remove the extract_
prefix. The only question that would raise is if they should always recompute based on the cell ids, or whether it would make sense to return the appropriate variable if it exists on the object.
As for cell_centroids
/ cell_centers
: do you know if there's a way to go from Point
variables to separate lat / lon coordinates and back in xvec
?
you could even remove the extract_ prefix
Agreed.
The only question that would raise is if they should always recompute based on the cell ids, or whether it would make sense to return the appropriate variable if it exists on the object.
I think we should always recompute. It is better letting users figure out whether they need to call ds.dggs.cell_centroids()
or get the coordinate with ds.cell_centroids
or whatever the actual name of that coordinate is if it exists.
As for cell_centroids / cell_centers: do you know if there's a way to go from Point variables to separate lat / lon coordinates and back in xvec?
I haven't checked that, but I assume that most DGGS backends deal with lat/lon coordinates directly instead of shapely Point, so supporting only the latter would mean a wasteful creation of shapely objects + extraction of coordinates for those who do not need shapely.
oh, I was not suggesting to drop the existing computation of lat / lon values from cell ids, just wondering if given a point variable you'd be able to transform to lat / lon variables.
Ah yes, not sure if there's a specific function in xvec, but using shapely.get_coordinates(...)
might be enough?
I'd imagine we'd be able to extend xvec
to support that.
how much should we care about the different terminologies? Like, we've been talking about "cells" and "cell ids", but the OGC working group tends to talk about "zones" and "zonal ids". We could in theory make one an alias (or a thin wrapper) of the other, but we could also just follow H3 and stick with "cells".
In any case, I think having a bunch of cell_*
(or zone_*
) properties (or methods, should we want to follow the "properties for quantities that are cheap to recompute, methods for more expensive ones") might make sense, for example: ds.dggs.cell_ids
, ds.dggs.cell_boundaries
, ds.dggs.cell_centers
. The advantage of having attributes with a similar prefix would be mostly tab completion.
should have been closed by #30, which puts the polygons into shapely
arrays. For using spherely
I believe we should open a separate issue.
We could add a
Dataset.dggs.assign_boundaries()
that would create and assign a new coordinate ofshapely.Polygon
objects (and/or laterspherely.Polygon
objects when it is ready) of grid cell boundaries from the cell ids indexed coordinate.This would be a nice complement to
Dataset.dggs.assign_latlon_coords()
(that we could maybe rename toassign_centers_latlon()
for consistency). We could also have aDataset.dggs.assign_centers()
that would createshapely.Point
(orspherely.Point
) objects.This would be useful for interoperability with xvec, for plotting, etc.