pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0

support for units with pint #3594

Open keewis opened 4 years ago

keewis commented 4 years ago

pint's implementation of NEP-18 (see hgrecco/pint#905) is close enough that we can finally start working on pint support (i.e. making the integration tests pass). This would be the list of tasks to get there:

dcherian commented 4 years ago

Thanks for leading this effort @keewis.

I would start with the lowest-level operations, such as the constructors and align, concat, and merge. These are called by many of the other functions, so fixing them is a prerequisite for getting the rest working. I've looked at align, concat, and merge recently, so I can help if you need to chat about confusing error messages.
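For context, NEP-18 (the `__array_function__` protocol mentioned at the top of the thread) is what lets a call such as `np.concatenate` hand control to a wrapped array type instead of coercing it. Below is a minimal sketch of that dispatch, assuming only NumPy; `UnitArray` is a hypothetical stand-in for illustration, not pint's `Quantity` and not code from xarray.

```python
import numpy as np


class UnitArray:
    """Hypothetical unit-aware duck array (illustration only, not pint)."""

    def __init__(self, magnitude, units):
        self.magnitude = np.asarray(magnitude)
        self.units = units

    def __array_function__(self, func, types, args, kwargs):
        # NumPy calls this hook instead of its own implementation whenever
        # a UnitArray appears among the arguments of a NumPy function.
        if func is not np.concatenate:
            return NotImplemented  # this toy supports a single function

        arrays = args[0]
        units = {a.units for a in arrays}
        if len(units) != 1:
            raise ValueError(f"cannot concatenate mixed units: {units}")

        # Operate on the bare magnitudes, then re-attach the shared unit.
        magnitudes = [a.magnitude for a in arrays]
        result = np.concatenate(magnitudes, *args[1:], **kwargs)
        return UnitArray(result, units.pop())


a = UnitArray([1.0, 2.0], "m")
b = UnitArray([3.0], "m")
joined = np.concatenate([a, b])
print(type(joined).__name__, joined.units, joined.magnitude)
```

The result is still a `UnitArray`, so the unit survives the operation. Any low-level code path that bypasses this dispatch loses the unit, which is one reason to check the constructors and combining functions first.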

> indexes strip units

What does this mean? You can't have units in an IndexVariable?

keewis commented 4 years ago

> What does this mean? You can't have units in an IndexVariable?

Yes, we have had that discussion from https://github.com/pydata/xarray/issues/525#issuecomment-514452182 onward. Short version: pd.Index converts its input using np.asarray, which strips the units, so supporting units there probably requires #1603.
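To illustrate the mechanism, here is a minimal sketch of why anything that coerces its input with `np.asarray` loses units. It assumes only NumPy; `ToyQuantity` is a hypothetical stand-in rather than pint's `Quantity`, and the pandas step itself is not reproduced.

```python
import numpy as np


class ToyQuantity:
    """Hypothetical unit-aware wrapper (illustration only, not pint)."""

    def __init__(self, magnitude, units):
        self.magnitude = np.asarray(magnitude, dtype=float)
        self.units = units

    def __array__(self, dtype=None, copy=None):
        # np.asarray calls this hook. It can only return a plain ndarray,
        # so the unit information is dropped at this point.
        if dtype is None:
            return self.magnitude
        return self.magnitude.astype(dtype)


distance = ToyQuantity([0.0, 1.5, 3.0], "m")
plain = np.asarray(distance)

print(type(plain).__name__)     # a plain ndarray
print(hasattr(plain, "units"))  # the unit did not survive the conversion
```

Once the data has passed through `np.asarray`, there is nothing left on the resulting array from which to recover the unit.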

amcnicho commented 4 years ago

I will try to figure out the reason for each of these test failures, but I'd appreciate help.

Would #3643 be the best place to offer contributions at this point, or somewhere else?

keewis commented 4 years ago

I think issues related to DataArray + pint should go in #3643; for everything else, you can use this issue or open new issues / pull requests.

If you want to, I'd appreciate someone reviewing the tests in test_units.py, since I don't think anyone other than me has thoroughly looked at all of them. You could also investigate / fix the Dataset issues, investigate the reason for the UnitStrippedWarnings, or start writing documentation on how to use pint in combination with xarray.

keewis commented 4 years ago

So, apart from the major issues mentioned above, which we won't be able to fix in the near future (though there will probably be workarounds in pint-xarray), we only have three minor issues: nanprod, support for per-variable fill values, and the repr (#2773).

I don't think nanprod and the fill values are particularly urgent, so if we get support for the repr (maybe using some sort of hook that a library like pint-xarray can then use to properly format the duck array) and put together the documentation page, we could include this in the 0.16 release.
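As a rough sketch of what such a hook could look like: the formatter asks the wrapped object for a one-line summary and falls back to a truncated `repr` otherwise. Everything here is an assumption made for illustration, including the hook name `_repr_inline_`, the helper `inline_repr`, and the class `LabelledValues`; it is not taken from xarray or pint-xarray.

```python
def inline_repr(data, max_width=40):
    """Format wrapped data for a one-line summary (hypothetical helper)."""
    hook = getattr(data, "_repr_inline_", None)  # hypothetical hook name
    if callable(hook):
        # The duck array knows best how to summarize itself.
        return hook(max_width)
    text = repr(data)
    return text if len(text) <= max_width else text[: max_width - 3] + "..."


class LabelledValues:
    """Hypothetical duck array that carries a unit label."""

    def __init__(self, values, units):
        self.values = list(values)
        self.units = units

    def _repr_inline_(self, max_width):
        body = " ".join(str(v) for v in self.values)
        text = f"[{self.units}] {body}"
        return text if len(text) <= max_width else text[: max_width - 3] + "..."


print(inline_repr(LabelledValues([1.0, 2.0, 3.0], "m")))  # [m] 1.0 2.0 3.0
print(inline_repr([1, 2, 3]))                             # [1, 2, 3]
```

The point of the design is that the formatting library needs no knowledge of units; it only checks for the presence of the hook.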

Edit: I guess the release is already big enough, so I don't really mind waiting for the next release, but this is really close.

dcherian commented 2 years ago

@keewis Shall we close this? It seems the only outstanding item is nanprod with quantities, which sort of indicates that we've made the necessary big changes.

keewis commented 2 years ago

#6873 might fix the nanprod issue, and we have a separate issue for the last big change left (#3950, which is not really limited to quantities), so I agree that we should be able to close this with #6873.

We might want to open a new issue to get the known issues to work, though.