Hi everyone,
I want to share an idea I am working on. I have a collection of weather radar tasks/sweeps that are not homogeneous in size or shape. The structure of radar data depends on radar parameters such as maximum range, antenna elevation angle, and bin size, among others. Such unstructured data is hard to archive, especially when we want to concatenate measurements over time, because arrays of different shapes cannot be stacked. A tree-like hierarchical structure is therefore a possible way to handle this unstructured data.
I've created an xarray datatree object where the parent node is the radar name and every child node corresponds to a volume coverage pattern (VCP). A VCP is made up of multiple tasks/sweeps, and each sweep holds different radar variables, such as radar reflectivity, Kdp, and Zdr, among others. The xarray datatree object looks like this:
Usually, a radar scan strategy or VCP takes no longer than 5 minutes, although this may vary depending on the institution/radar operator. The idea behind this xarray datatree object is to access radar data efficiently by naming each node with a timestamp that represents the 5-min VCP. Each node then contains multiple tasks, sweeps, and variables. More detailed information can be found here
As we can see, the datatree object has time coordinates. However, when I try dt.isel(time=0), I get the following error:
ValueError                                Traceback (most recent call last)
Cell In[64], line 1
----> 1 dt.isel(time=0)

File ~/miniconda3/envs/xradar/lib/python3.9/site-packages/datatree/mapping.py:208, in map_over_subtree.<locals>._map_over_subtree(*args, **kwargs)
    196 node_kwargs_as_datasets = dict(
    197     zip(
    198         [k for k in kwargs_as_tree_length_iterables.keys()],
   (...)
    203     )
    204 )
    206 # Now we can call func on the data in this particular set of corresponding nodes
    207 results = (
--> 208     func(*node_args_as_datasets, **node_kwargs_as_datasets)
    209     if not node_of_first_tree.is_empty
    210     else None
    211 )
    213 # TODO implement mapping over multiple trees in-place using if conditions from here on?
    214 out_data_objects[node_of_first_tree.path] = results

File ~/miniconda3/envs/xradar/lib/python3.9/site-packages/xarray/core/dataset.py:2431, in Dataset.isel(self, indexers, drop, missing_dims, **indexers_kwargs)
   2427     return self._isel_fancy(indexers, drop=drop, missing_dims=missing_dims)
   2429 # Much faster algorithm for when all indexers are ints, slices, one-dimensional
   2430 # lists, or zero or one-dimensional np.ndarray's
-> 2431 indexers = drop_dims_from_indexers(indexers, self.dims, missing_dims)
   2433 variables = {}
   2434 dims: dict[Hashable, int] = {}

File ~/miniconda3/envs/xradar/lib/python3.9/site-packages/xarray/core/utils.py:859, in drop_dims_from_indexers(indexers, dims, missing_dims)
    857 invalid = indexers.keys() - set(dims)
    858 if invalid:
--> 859     raise ValueError(
    860         f"Dimensions {invalid} do not exist. Expected one or more of {dims}"
    861     )
    863 return indexers
    865 elif missing_dims == "warn":
    866     # don't modify input

ValueError: Dimensions {'time'} do not exist. Expected one or more of Frozen({})
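My reading of the traceback (which may be incomplete) is that isel is applied to every non-empty node, and at least one node has no dimensions at all (the Frozen({}) in the message), so the check on the 'time' indexer fails there. The sketch below is a simplified, standard-library stand-in for that check, not xarray's actual code. The name check_indexers and the dimension sizes are hypothetical. It only illustrates the "raise" behaviour seen above and an "ignore" alternative matching the missing_dims argument that appears in the Dataset.isel signature.

```python
def check_indexers(indexers, dims, missing_dims="raise"):
    """Simplified stand-in for the dimension check in the traceback (not xarray's code)."""
    invalid = indexers.keys() - set(dims)
    if missing_dims == "raise":
        if invalid:
            raise ValueError(
                f"Dimensions {invalid} do not exist. Expected one or more of {dims}"
            )
        return indexers
    if missing_dims == "ignore":
        # Silently drop indexers for dimensions this node does not have.
        return {k: v for k, v in indexers.items() if k in dims}
    raise ValueError(f"Unsupported missing_dims value: {missing_dims!r}")


# Hypothetical dimension sizes, for illustration only.
no_dims = {}                              # a node without any dimensions
sweep_dims = {"time": 12, "range": 500}   # a made-up sweep node

# A node that has the dimension passes the check unchanged.
print(check_indexers({"time": 0}, sweep_dims))

# With "ignore", the indexer is dropped for the node without dimensions.
print(check_indexers({"time": 0}, no_dims, missing_dims="ignore"))

# With the default "raise", the same node reproduces the error type above.
try:
    check_indexers({"time": 0}, no_dims)
except ValueError as err:
    print(err)
```

I have not verified whether passing missing_dims="ignore" through dt.isel is the recommended route, so please treat this as an illustration of the failure rather than a fix.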
I wonder whether xarray datatree objects can be sliced with methods similar to those of an xarray Dataset, e.g., dt.sel(time=slice('202304070300', '202304070400')), so that it returns the nodes/subtrees I am interested in.
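To make the request concrete, here is a minimal sketch of the selection I have in mind. It assumes node names are timestamps in the %Y%m%d%H%M format used in the slice above, and it uses a plain dict as a stand-in for the children of the radar node. The function select_nodes and the sample node names and contents are hypothetical and not part of any datatree API.

```python
from datetime import datetime

FMT = "%Y%m%d%H%M"  # assumed timestamp format of the node names


def select_nodes(tree, start, stop, fmt=FMT):
    """Return the child nodes whose timestamp name lies in [start, stop]."""
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(stop, fmt)
    return {
        name: node
        for name, node in tree.items()
        if t0 <= datetime.strptime(name, fmt) <= t1
    }


# Made-up tree: one child per VCP, each holding sweeps with variable names.
tree = {
    "202304070255": {"sweep_0": ["reflectivity", "Kdp", "Zdr"]},
    "202304070300": {"sweep_0": ["reflectivity", "Kdp", "Zdr"]},
    "202304070330": {"sweep_0": ["reflectivity", "Kdp", "Zdr"]},
    "202304070400": {"sweep_0": ["reflectivity", "Kdp", "Zdr"]},
    "202304070405": {"sweep_0": ["reflectivity", "Kdp", "Zdr"]},
}

selected = select_nodes(tree, "202304070300", "202304070400")
print(list(selected))
```

Both ends of the window are included here, mirroring how label-based slices behave in xarray's sel.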
Please let me know your thoughts and comments.
Cheers,
Alfonso