Datatree is a prototype implementation of a tree-like hierarchical data structure for xarray.
Datatree was born after the xarray team recognised a need for a new hierarchical data structure that was more flexible than a single `xarray.Dataset` object.
The initial motivation was to represent netCDF files / Zarr stores with multiple nested groups in a single in-memory object, but `datatree.DataTree` objects have many other uses.
This repository has been archived and the code is no longer maintained!
Datatree has been merged upstream into pydata/xarray, and released as of xarray version 2024.10.0.
There will be no further bugfixes or feature additions to this repository.
Users of this repository should migrate to using `xarray.DataTree` instead, following the Migration Guide.
The information below is all outdated, and is left only for historical interest.
You can install datatree via pip:
pip install xarray-datatree
or via conda-forge:
conda install -c conda-forge xarray-datatree
You might want to use datatree for:
Talk slides on Datatree from AMS-python 2023
The approach used here is based on benbovy's `DatasetNode` example - the basic idea is that each tree node wraps up to a single `xarray.Dataset`. The differences are that this effort:
`xarray.Dataset`'s API over every node in the tree (such as `.isel`),
You can create a `DataTree` object in 3 ways:
1) Load from a netCDF file (or Zarr store) that has groups via `open_datatree()`.
2) Using the init method of `DataTree`, which creates an individual node. You can then specify the nodes' relationships to one another, either by setting `.parent` and `.children` attributes, or through `__get/setitem__` access, e.g. `dt['path/to/node'] = DataTree()`.
3) Create a tree from a dictionary of paths to datasets using `DataTree.from_dict()`.
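To make the node model concrete, here is a minimal pure-Python sketch of a tree in which each node holds at most one payload and can be reached by a path. It is only an illustration: `ToyNode`, `_parts` and `_attach` are hypothetical names, the payload is a plain dict rather than an `xarray.Dataset`, and none of this is the datatree implementation. It mirrors the second and third creation routes above (path-like item access and a `from_dict`-style constructor).

```python
from __future__ import annotations

from typing import Any, Optional


def _parts(path: str) -> list[str]:
    """Split 'a/b/c' (with or without a leading slash) into its components."""
    return [p for p in path.split("/") if p]


class ToyNode:
    """Hypothetical stand-in for a tree node; NOT the datatree implementation."""

    def __init__(self, data: Any = None, name: Optional[str] = None):
        self.data = data  # at most one payload per node (a Dataset in datatree)
        self.name = name
        self.parent: Optional[ToyNode] = None
        self.children: dict[str, ToyNode] = {}

    def _attach(self, name: str, node: ToyNode) -> None:
        # Keep the parent and children links consistent with each other.
        node.name = name
        node.parent = self
        self.children[name] = node

    def __setitem__(self, path: str, node: ToyNode) -> None:
        *ancestors, leaf = _parts(path)
        current = self
        for part in ancestors:
            # Create empty intermediate nodes where the path does not exist yet.
            if part not in current.children:
                current._attach(part, ToyNode())
            current = current.children[part]
        current._attach(leaf, node)

    def __getitem__(self, path: str) -> ToyNode:
        current = self
        for part in _parts(path):
            current = current.children[part]
        return current

    @classmethod
    def from_dict(cls, mapping: dict[str, Any], name: Optional[str] = None) -> ToyNode:
        root = cls(name=name)
        # Insert shallow paths first so parents exist before their children.
        for path in sorted(mapping, key=lambda p: len(_parts(p))):
            if _parts(path):
                root[path] = cls(data=mapping[path])
            else:
                root.data = mapping[path]  # "/" addresses the root itself
        return root


# Route 2: build node by node through path-like item access.
root = ToyNode(name="root")
root["path/to/node"] = ToyNode(data={"temperature": [1, 2, 3]})

# Route 3: build the whole tree from a mapping of paths to payloads.
tree = ToyNode.from_dict({"/": {"title": "demo"}, "a/b": {"x": 1}, "a": {"y": 2}})
```

Sorting the paths by depth in `from_dict` is a design choice of this sketch: it lets the mapping list `a/b` before `a` without the later assignment to `a` discarding the child.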
Datatree currently lives in a separate repository to the main xarray package. This allows the datatree developers to make changes to it, experiment, and improve it faster.
Eventually we plan to fully integrate datatree upstream into xarray's main codebase, at which point the github.com/xarray-contrib/datatree repository will be archived.
This should not cause much disruption to code that depends on datatree - you will likely only have to change the import line (i.e. from `from datatree import DataTree` to `from xarray import DataTree`).
However, until this full integration occurs, datatree's API should not be considered to have the same level of stability as xarray's.
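Code that has to run in environments with either package could hide the import change behind a small shim. The sketch below is one possible approach, not something this project prescribes: `first_available` is a hypothetical helper, and it assumes that only the import location matters, which may not hold if the two classes differ in other ways (the Migration Guide is the place to check).

```python
import importlib


def first_available(candidates):
    """Return the first importable attribute from (module, attribute) pairs, or None."""
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module not installed (or failed to import); try the next one
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
    return None


# Prefer the upstream class, then fall back to the archived standalone package.
# DataTree is None if neither import location provides the class.
DataTree = first_available([("xarray", "DataTree"), ("datatree", "DataTree")])
```

A plain `try: from xarray import DataTree` / `except ImportError: from datatree import DataTree` does the same job in two lines; the helper form only avoids raising when neither package is installed.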
We really really really want to hear your opinions on datatree! At this point in development, user feedback is critical to help us create something that will suit everyone's needs. Please raise any thoughts, issues, suggestions or bugs, no matter how small or large, on the GitHub issue tracker.