Open martin-rdz opened 1 year ago
I like this idea. Are you planning on replacing all the reader functions, or simply converting the data_container to xarray?
Not quite sure yet.
The standard reader works quite well, but has problems with data that is not `['time', 'range']`, e.g. disdrometer spectra (`['time', 'size']`) or MWR TBs (`['time', 'channel']`).
Also, xarray has some quite powerful merge functions that could, at least partially, replace `Transformations.join`.
For the binary readers probably the best option is to read and then transform.
A nice feature would be to have both interfaces in parallel, i.e., just a keyword selecting which data format should be provided.
I'm already working with the xarray merging functions for multiple datasets. This works flawlessly, as long as you know how to set the function's parameters correctly for the different dimensions and variables in the nc files.
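To illustrate the point about parameters, here is a minimal sketch of `xr.merge` on two variables with different second dimensions and only partly overlapping time axes. The variable names (`Z`, `N`) and all values are made up for illustration. The `join` argument decides how the mismatched `time` coordinate is handled.

```python
import numpy as np
import xarray as xr

# radar-like variable on ['time', 'range'] (made-up values)
Z = xr.DataArray(
    np.zeros((3, 4)),
    dims=["time", "range"],
    coords={"time": [0.0, 30.0, 60.0], "range": [100.0, 200.0, 300.0, 400.0]},
    name="Z",
)
# disdrometer-like spectrum on ['time', 'size'], shifted by one time step
N = xr.DataArray(
    np.ones((3, 2)),
    dims=["time", "size"],
    coords={"time": [30.0, 60.0, 90.0], "size": [0.5, 1.0]},
    name="N",
)

# union of the time stamps, gaps are filled with NaN
outer = xr.merge([Z, N], join="outer", compat="no_conflicts")
# only the time stamps present in both variables
inner = xr.merge([Z, N], join="inner", compat="no_conflicts")

# appending along time, for chunks of the same variable
Z_next = Z.assign_coords(time=Z["time"].values + 90.0)
Z_long = xr.concat([Z, Z_next], dim="time")
```

With `join="outer"` the result has four time steps and `Z` is NaN at the last one. With `join="inner"` only the two shared time steps remain.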
The scientific Python ecosystem continues to evolve, and xarray now seems to be the standard for labeled NumPy arrays.
To make the code more interoperable and improve the beginner experience, it would be beneficial to replace larda's `data_container` with the xarray DataArray structure.
One could also leverage the data merging functions included in xarray, e.g., https://docs.xarray.dev/en/stable/user-guide/combining.html The 'new' plotting functions introduced by @KarlJohnsonnn should already allow both inputs.
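As a rough sketch of what such a conversion could look like: the helper name `container_to_dataarray` is hypothetical, and the dict keys used here (`dimlabel`, `ts`, `rg`, `var`, `mask`, `name`, `var_unit`, `system`) are an assumption about the container layout, not a confirmed spec.

```python
import numpy as np
import xarray as xr


def container_to_dataarray(container):
    """Convert a larda-style dict into an xarray.DataArray (sketch only)."""
    dims = container["dimlabel"]
    # assumed mapping from dimension label to the key holding its coordinate
    coord_keys = {"time": "ts", "range": "rg"}
    coords = {d: container[coord_keys[d]] for d in dims if d in coord_keys}
    # masked values are replaced by NaN
    var = np.where(container["mask"], np.nan, container["var"])
    return xr.DataArray(
        var,
        dims=dims,
        coords=coords,
        name=container["name"],
        attrs={"units": container["var_unit"], "system": container["system"]},
    )


# made-up example container
container = {
    "dimlabel": ["time", "range"],
    "ts": np.array([0.0, 30.0]),
    "rg": np.array([100.0, 200.0, 300.0]),
    "var": np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]),
    "mask": np.array([[False, True, False], [False, False, False]]),
    "name": "Z",
    "var_unit": "dBZ",
    "system": "example_radar",
}
da = container_to_dataarray(container)
```

Dimensions other than time and range (e.g. size or channel) would need their own entry in `coord_keys`, or they could simply be left without a coordinate.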
A major issue that requires some thought beforehand is the serialization for the remote sources, but that should be solvable. Most convenient would be some kind of direct option. Alternatively, one could adapt the current approach, where the container is deconstructed, serialized, and reconstructed. Link to the respective code
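For the deconstruct/serialize/reconstruct route, xarray already provides `DataArray.to_dict()` and `DataArray.from_dict()`. Below is a minimal round-trip sketch. JSON is used only as a stand-in for whatever the remote backend actually transports, and all values are made up.

```python
import json

import numpy as np
import xarray as xr

da = xr.DataArray(
    np.array([[1.0, 2.0], [3.0, 4.0]]),
    dims=["time", "range"],
    coords={"time": [0.0, 30.0], "range": [100.0, 200.0]},
    name="Z",
    attrs={"units": "dBZ"},
)

# deconstruct into plain Python types, then serialize
payload = json.dumps(da.to_dict())

# on the receiving side: deserialize and reconstruct
restored = xr.DataArray.from_dict(json.loads(payload))
```

This assumes time is stored as plain floats (e.g. unix timestamps). datetime64 coordinates and non-JSON attributes would need extra handling before `json.dumps`.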
Any thoughts @KarlJohnsonnn @ti-vo @ulysses78?