Closed benbovy closed 7 months ago
not sure about regridding, but I'm not sure if we should really use from_*
for this: to me, those sound very much like factory methods. What do think about using a verb, for example ds.dggs.create_index(dggs_coord, **kwargs)
?
I'm not sure if we should really use from_* for this: to me, those sound very much like factory methods.
I actually view those methods very much like "factory" methods! Not in the sense of class factory methods but rather in the sense of creating a Dataset or DataArray that can be handled as a DGGS one from any other Dataset / DataArray that cannot. Maybe conversion methods is a better term, for which the from_
/ to_
prefixes work pretty well I think.
We could have a ds.dggs.to_latlon_grid
for conversion from DGGS to a rectilinear grid.
We could also add later ds.dggs.from_latlon_ugrid
to convert from an unstructured mesh.
create_index
sounds a bit too restrictive IMO. It is a good name for creation of a DGGS directly from an existing cell ids coordinate but not for the order cases that involve more operations than creating an index.
What I meant with from_
being factory methods is that they sound like they would construct a new object of a different python type, while in this case they return a modified copy.
I do think we need to be able to properly regrid, but in that case the main task is to actually regrid, not setting the index. What I mean with that is that I'd rather have regrid
in the name of the API (which might be a accessor method, or a xesmf
-like regridder object), where the final step after the actual regridding is to create the index.
As for ds.dggs.create_index
specifically, I do think this kind of restriction makes for a cleaner API (though as always when designing a API I can't be certain until we actually use it).
What I meant with from_ being factory methods is that they sound like they would construct a new object of a different python type, while in this case they return a modified copy.
Understood! But _from_
and _to_
naming patterns are quite usual for conversion methods, no? Let me list a few options:
# 1. accessor conversion methods (the accessor name serves itself as prefix)
ds.dggs.from_(...)
ds.dggs.to_(...)
# 2. accessor conversion methods
ds.dggs.dggs_from_(...)
ds.dggs.dggs_to_(...)
# 3. top-level conversion functions (the package name serves as prefix)
xdggs.from_(ds, ...)
xdggs.to_(ds, ...)
# 4. top-level conversion functions
xdggs.dggs_from_(ds, ...)
xdggs.dggs_to_(ds, ...)
2 and 4 look too verbose/repetitive to me. Either 1 or 3 are fine by me. Both xdggs
and ds.dggs
don't return a class so it doesn't really bother me using from_
prefix here for something that is not a factory method.
I get your point regarding the name vs. the main task... But here I think that from the user point of view the task is best described as converting a Dataset or DataArray from one form (raster, mesh, etc.) from/to a DGGS form. Regridding, resampling, index creation, etc. are important but implementation details.
ds.dggs.from_cell_ids
is a special case. It is just an alias but I think it is worth it for discoverability. Having a ds.dggs.create_index
wouldn't add much value to ds.set_xindex(..., DGGSIndex)
.
One option could be reusing the
.dggs
accessor with the following methods (non-exhaustive):