CLIMADA-project / climada_python

Python (3.8+) version of CLIMADA
GNU General Public License v3.0
291 stars 115 forks

Add function to store (grid) data as NetCDF #905

Open peanutfun opened 3 days ago

peanutfun commented 3 days ago

Early versions of #898 and #857 provided functions to store the exceedance map data as NetCDF files. We decided to remove this feature from the PR(s).

@DahyannAraya has also been working on NetCDF output of Impacts. It might be worthwhile to consolidate these efforts.

Minimal Viable Feature

Provide a function that "rasterizes" a GeoDataFrame into an xarray DataArray or Dataset and writes the data to a NetCDF file. Raise an error if the data cannot be rasterized (i.e., it is apparently not gridded).
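A rough sketch of what such a function could look like, to make the request concrete. The names `points_to_dataarray` and `geodataframe_to_dataarray` are hypothetical and not part of CLIMADA. The sketch assumes point geometries, at most one value per point, and that every grid row and column contains at least one point; anything with irregular coordinate spacing is rejected as "not gridded".

```python
import numpy as np
import xarray as xr


def points_to_dataarray(x, y, values, name="value"):
    """Place point values on the regular grid spanned by their coordinates.

    Hypothetical helper: raises ValueError if the points are not gridded.
    """
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    values = np.asarray(values, dtype=float).ravel()

    # Unique coordinates define the grid axes; the inverse indices map
    # each point to its column (ix) and row (iy).
    xs, ix = np.unique(x, return_inverse=True)
    ys, iy = np.unique(y, return_inverse=True)

    # Assumption: "gridded" means constant spacing along each axis.
    for axis, coords in (("x", xs), ("y", ys)):
        steps = np.diff(coords)
        if steps.size and not np.allclose(steps, steps[0]):
            raise ValueError(f"Data is not gridded: irregular spacing along {axis}")

    # Two points in the same cell would silently overwrite each other.
    if np.unique(iy * xs.size + ix).size != x.size:
        raise ValueError("Data is not gridded: duplicate point coordinates")

    # Cells without a point stay NaN.
    grid = np.full((ys.size, xs.size), np.nan)
    grid[iy, ix] = values
    return xr.DataArray(grid, coords={"y": ys, "x": xs}, dims=("y", "x"), name=name)


def geodataframe_to_dataarray(gdf, column):
    """Rasterize one column of a GeoDataFrame with point geometries."""
    return points_to_dataarray(gdf.geometry.x, gdf.geometry.y, gdf[column], name=column)
```

Writing the file would then be left to xarray, e.g. `geodataframe_to_dataarray(gdf, "intensity").to_netcdf("exceedance.nc")`, where the column name and file name are placeholders.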

bguillod commented 3 days ago

This would be a useful feature indeed.

My suggestion would be to write methods to convert data (exceedance, hazard, impact, or whatever object it applies to) to and from xarray (with fixed conventions for each class) and let xarray deal with the reading/writing of the file.

More specifically: I suggest there are methods to_xarray() and from_xarray() (or similar) only. The user can then read/write that xarray object using xarray itself (e.g., xarray.Dataset.to_netcdf, xarray.Dataset.to_zarr, xarray.open_dataset, ...). This would leave the user the option to store data in their preferred format.
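To illustrate the idea, here is a minimal sketch of such a method pair. `GridData`, the variable name `intensity`, and the `convention` attribute are all made up for this example and do not correspond to any existing CLIMADA class or agreed convention.

```python
from dataclasses import dataclass

import numpy as np
import xarray as xr

# Hypothetical convention marker and variable name for this class.
CONVENTION = "griddata-v1"
VAR_NAME = "intensity"


@dataclass
class GridData:
    """Hypothetical stand-in for an object holding values on a lat/lon grid."""

    lat: np.ndarray
    lon: np.ndarray
    values: np.ndarray  # shape (len(lat), len(lon))
    unit: str = ""

    def to_xarray(self) -> xr.Dataset:
        """Export to a Dataset following the fixed convention."""
        return xr.Dataset(
            {VAR_NAME: (("lat", "lon"), self.values, {"units": self.unit})},
            coords={"lat": self.lat, "lon": self.lon},
            attrs={"convention": CONVENTION},
        )

    @classmethod
    def from_xarray(cls, ds: xr.Dataset) -> "GridData":
        """Rebuild the object, rejecting datasets that do not follow the convention."""
        if ds.attrs.get("convention") != CONVENTION:
            raise ValueError(f"Expected a dataset with convention '{CONVENTION}'")
        var = ds[VAR_NAME].transpose("lat", "lon")
        return cls(
            lat=ds["lat"].values,
            lon=ds["lon"].values,
            values=var.values,
            unit=var.attrs.get("units", ""),
        )
```

File I/O then stays entirely with xarray: `obj.to_xarray().to_netcdf(path)` to write and `GridData.from_xarray(xr.open_dataset(path))` to read, with `to_zarr` as a drop-in alternative for the writing step.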

peanutfun commented 3 days ago

@bguillod Thanks for the input. Indeed, xarray and its extensive functionality for storing and loading data seem to be the natural choice for that. As became apparent in #898, it is feasible to return computed data as a GeoDataFrame. As CLIMADA already uses this data structure extensively, we can start with a "conversion" from GeoDataFrame to xarray DataArray or Dataset and should cover most cases with that.

To consolidate any efforts already undertaken: Do you, by any chance, already have code available that brings any CLIMADA data structure into an xarray structure?