xgcm / xgcm

Python package for analyzing general circulation model output data
http://xgcm.readthedocs.org
MIT License

Unit support by way of Pint (supporting xarray > Pint > dask data structures) #283

Open jthielen opened 3 years ago

jthielen commented 3 years ago

As raised in https://github.com/xgcm/xgcm/issues/222, it would be great to have some form of unit support in xgcm, and it's a feature I'm especially interested in coming from MetPy, where unit support is a central feature. The current leading library for this in the ecosystem (at least as far as integrations with other libraries go) is Pint, particularly since it fully implements (or at least tries to!) the "nested duck array" approach implied by the type casting hierarchy of NEP-13/18.

In practice, for a library like xgcm, this looks like xarray DataArrays/Datasets containing Pint Quantities, which in turn contain Dask Arrays (an order chosen based on run-time metadata and pickling concerns). Everything needed to make this happen on a basic level is in place thanks to the efforts of @keewis, @rpmanser, myself, and others, but admittedly, the integration is still in its early stages and there are bound to be bugs that have yet to be worked out. Still, I think it would be worth experimenting with it here in xgcm as a way to bring about unit support.

I'd be interested in helping with this (especially as a way to open up integrations between MetPy and xgcm), but unfortunately I don't think I'll have many free cycles to dedicate to it in the near future, so someone else please feel free to take this on!

Related Issues in Upstream/Parallel Libraries
https://github.com/hgrecco/pint/issues/883
https://github.com/pydata/xarray/issues/4208
https://github.com/Unidata/MetPy/issues/1479

jbusecke commented 3 years ago

Thanks for the great summary @jthielen. I think this would be a worthwhile addition IF (like @rabernat mentioned in #222) we do not sacrifice performance.

github-actions[bot] commented 3 years ago

This issue has been marked 'stale' due to lack of recent activity. If there is no further activity, the issue will be closed in another 30 days. Thank you for your contribution!

jthielen commented 3 years ago

This issue is still relevant. It is something I'd like to investigate later this summer, when I hopefully have more time available (I'd really like to explore application areas of xgcm with NWP models and its use alongside MetPy).

jbusecke commented 3 years ago

That would be a really amazing contribution @jthielen, I am really excited to see what you come up with.

I marked the issue with keepOpen to keep the bot away.