Applying a function to all `control` or all `plus4K` nodes is straightforward: just select the specific subtree, e.g. `dt['control']`. However, manipulating all `lowRes` datasets is more involved, and I wonder what the best approach would be.
* `dt['control/lowRes','plus4K/lowRes']` is not yet implemented and would also be complex for large data trees
* `dt['*/lowRes']` could be one idea to make the subtree selection more straightforward, where `*` is a wildcard
* `dt.search(regex)` could make this even more general
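
To make the wildcard and regex ideas above a bit more concrete, here is a minimal sketch of how path-based selection could behave. It does not use datatree at all: a plain dict mapping node paths to payloads is a stand-in for the tree, and `select_glob` / `select_regex` are hypothetical helper names, not existing API.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical stand-in for a data tree: node path -> payload.
tree = {
    "control/lowRes": [1, 2, 3],
    "control/highRes": [1, 2, 3, 4, 5, 6],
    "plus4K/lowRes": [4, 5, 6],
    "plus4K/highRes": [4, 5, 6, 7, 8, 9],
}


def select_glob(tree, pattern):
    """Return the nodes whose path matches a shell-style wildcard pattern."""
    return {path: node for path, node in tree.items() if fnmatchcase(path, pattern)}


def select_regex(tree, regex):
    """Return the nodes whose path contains a match for a regular expression."""
    compiled = re.compile(regex)
    return {path: node for path, node in tree.items() if compiled.search(path)}


low_res = select_glob(tree, "*/lowRes")   # what dt['*/lowRes'] might select
plus4k = select_regex(tree, r"^plus4K/")  # what dt.search(regex) might select
```

One design question this raises: in `fnmatch`, `*` also matches `/`, so `*/lowRes` would match nodes at any depth. Whether a wildcard should cross tree levels would need to be decided.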
Currently, I use the `@map_over_subtree` decorator, which also has limitations: the function does not know where in the tree its dataset comes from (as noted in the code), so this has to be inferred from the dataset itself. That is sometimes possible (here, from the length of the dataset), but not always.
I do not know how the tree information could be passed through the decorator, but maybe it would be okay for the `DatasetView` class to have an additional property (e.g. `_path`) that is filled with `dt.path` during the call to `DatasetView._from_node()`? This would lead to
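```python
from datatree import map_over_subtree

@map_over_subtree
def resolution_specific_func(ds):
    # `_path` is the proposed attribute; it does not exist on DatasetView yet
    if 'lowRes' in ds._path:
        ds = ds.z * 2
    elif 'highRes' in ds._path:
        ds = ds.z * 4
    return ds
```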
and would allow for tree-aware manipulation of the datasets.
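
As an alternative to storing the path on the dataset, the mapping helper could pass the path to the function explicitly. Below is a rough sketch of that idea, again with a plain dict standing in for the tree. `map_with_path` is a hypothetical name, and plain lists stand in for the datasets.

```python
def map_with_path(func, tree):
    """Apply func(path, data) to every node and return a new mapping."""
    return {path: func(path, data) for path, data in tree.items()}


def resolution_specific_func(path, values):
    # Compare whole path components so that a node merely containing
    # 'lowRes' as a substring of its name is not matched by accident.
    parts = path.split("/")
    if "lowRes" in parts:
        return [v * 2 for v in values]
    if "highRes" in parts:
        return [v * 4 for v in values]
    return values


# Hypothetical stand-in for a data tree: node path -> payload.
tree = {
    "control/lowRes": [1, 2],
    "control/highRes": [1, 2, 3, 4],
}
result = map_with_path(resolution_specific_func, tree)
```

This keeps the dataset object unchanged, at the cost of a different function signature than the one `@map_over_subtree` currently expects.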
What do you think? Happy to open a PR if this makes sense.
Originally posted by @observingClouds in https://github.com/xarray-contrib/datatree/issues/254#issue-1835784457