Open TomNicholas opened 4 years ago
One additional thing I've realized is needed since writing https://github.com/hgrecco/pint/issues/849#issuecomment-579992247 is a helper for coordinate unit conversion. Not sure what the best name for this would be though. In MetPy, the tentative name is .convert_coordinate_units
, but that really only makes sense since the MetPy accessor's version of .to
is .convert_units
. Maybe .coordinate_to
?
See https://github.com/Unidata/MetPy/pull/1325 / https://github.com/Unidata/MetPy/compare/v0.12.0...jthielen:0-12-patch-xarray-0-15-1?expand=1
we could also provide work-arounds for operations like rolling
or ffill
that, due the functions and libraries they use (i.e. bottleneck
, numbagg
, scipy
, numpy.lib
, etc., but also numpy.vectorize
), won't be able to support duck arrays in the near future (though we should probably wait until there is a final decision on that, right now these methods were simply postponed).
a helper for coordinate unit conversion.
So a da.pint.coords_to
, that gets called within da.pint.to
?
Also what about UnitRegistry().wraps
? Do we need a xarray equivalent for that? Would that help with providing workarounds for rolling
etc?
a helper for coordinate unit conversion.
So a
da.pint.coords_to
, that gets called withinda.pint.to
?
Not quite, it would be separate. I'm thinking of something to change the units on a coordinate of da
without changing anything else on da
, which needs to follow different logic given that Indexes are immutable, and, for now, not supporting units directly...so something like
def coord_to(self, coord, units):
new_coord_var = self.da[coord].copy(
data=Quantity(self.da[coord].values, self.da[coord].attrs.get('units')).m_as(units)
)
new_coord_var.attrs['units'] = str(units)
return self.da.assign_coords(coords={coord: new_coord_var})
how about also accepting a dict
/ kwargs dict (like assign
/ assign_coords
) to DataArray.pint.to
and then using different branches for data and coordinates? That way, converting the data and multiple coordinates to (different) units would be possible with one function call.
Obviously, we need something really similar for Dataset.pint.to
, so we could implement it by using _to_temp_dataset
and to_dataarray
.
how about also accepting a
dict
/ kwargs dict (likeassign
/assign_coords
) toDataArray.pint.to
and then using different branches for data and coordinates? That way, converting the data and multiple coordinates to (different) units would be possible with one function call.
So long as we don't lose the easy ability to just convert data units (e.g., da.pint.to('degC')
), that sounds great.
So as a to-do list, that would mean one positional arg for the data and multiple keyword args for the coords in da.pint.to()
, then all keyword args for ds.pint.to()
:
da.pint.to(unit)
- convert datada.pint.to(new_unit1, coord1=new_unit2, coord2=new_unit3, ...)
- convert data and multiple coords simultaneouslyds.pint.to(var1=new_unit1, var2=new_unit2, coord1=new_unit3, ...)
- convert multiple data vars and coords simultaneouslyAgreed for points 1 (DataArray
only?) and 3 (Dataset
only). For 2 there might be a few more options (nothing wrong with this one), but I think we should discuss them in a separate issue / when we implement it to keep this issue as a high level API overview.
something else that is not a part of the API but in scope for this package (I think?): maybe we could monkeypatch the repr
of DataArray
, Variable
and Dataset
(or register some sort of repr function for pint
arrays (via entrypoints? on import?)) to make it more readable (i.e. make sure the unit is always visible). See also pydata/xarray#2773
I definitely think that's in scope for this package, at least until some more general solution for reprs of all wrapped arrays is implemented upstream.
Monkey-patching actually seems like a reasonable temporary solution here I think? The worst that can happen is that some other method which uses the repr fails to display the units right
I definitely agree that some kind of repr fix is in order. The current default is...less than ideal...
from #11:
pint_xarray.testing
: for testing purposespint_xarray.conversions
)if we can get the reprs to work (pydata/xarray#2773, hgrecco/pint#1133), improve the documentation and maybe also expose the pint_xarray.testing
module, we should be close enough to release a initial development version (0.1
?)
the HTML repr
works (pint
got released today), and there are #20 (docs), #22 (inline repr) and #24 (testing), so once those are merged and the placeholder functions we have right now are temporarily removed, we should be able to release 0.1
the HTML
repr
works (pint
got released today), and there are #20 (docs), #22 (inline repr) and #24 (testing), so once those are merged and the placeholder functions we have right now are temporarily removed, we should be able to release 0.1
Sounds great! Right now though it looks like the only placeholders are sel
and loc
. If you're good with waiting a few days, I should be able to get around to adding those in time for 0.1...I should be able to implement them the same way I did in MetPy.
I was thinking of plus_minus
, to_base_units
and to_system
, which don't have docstrings, work only on the data or raise NotImplementedError
.
Edit: let's continue this in #25
from #61: try to dequantify
automatically before plotting (see xarray.plot.plot.label_from_attrs
)
@jthielen proposed a rough accessor API in pint/#849, to which I've added a couple of things:
DataArray:
da.pint.to(...)
: return dataarray with converted units (#11)da.pint.to_base_units()
: return dataarray with base unitsda.pint.units
: units of quantity (as a Unit)da.pint.magnitude
: magnitude of quantityda.pint.quantify(unit_registry=None, unit=None)
: create DataArray wrapping a Quantity based on string unit attribute of DataArray or specified unitda.pint.dequantify()
: replace data with the Quantity's magnitude, and add back string unit attribute from Quantity's unitda.pint.sel()
: wrap da.sel to handle indexing with Quantities (by casting to magnitude in the coordinate's units similar to how MetPy does it, since true unit-aware indexing is not available yet in xarray)da.pint.loc
: wrap da.loc likewiseda.pint.to_system(system)
: convert all variables to be expressed in the (base?) units of a different system. Might require upstream additions to pint.Dataset:
ds.pint.to(...)
: return Dataset with converted units (#11)ds.pint.to_base_units()
: return dataset with base unitsds.pint.quantify(unit_registry=None)
: convert all data variables to quantitiesds.pint.dequantify()
: convert all data variables from quantities to magnitudes with units as an attributeds.pint.sel()
: wrap ds.sel to handle indexing with Quantitiesds.pint.loc
: wrap ds.loc likewiseds.pint.to_system(system)
: convert all variables to be expressed in the (base?) units of a different system.Anything else?
At some point when the integration is more solidified (but before official release) we should change the accessor from
pint
tounits
, to get a interface more like what's described here. This would be: a) More intuitive b) Units-library agnostic c) A good fit for potentially using an entrypoint to choose which units library you want to use. There's already an entrypoint for plotting backends in xarray, and plans to add one for storage backends too.