CWorthy-ocean / roms-tools

Tools for setting up and running ROMS simulations
https://roms-tools.readthedocs.io
GNU General Public License v3.0

Processing atmospheric forcing data #23

Open TomNicholas opened 1 month ago

TomNicholas commented 1 month ago

Step 4 of #1 requires accessing, subsetting, downloading, interpolating, and writing out atmospheric forcing data.

I haven't tried this yet, so I don't know all the details, but @sdbachman was asking how we might parallelize it. If the files to be processed are fully embarrassingly parallel, the simplest way is to use dask.delayed and create a list of all the individual tasks to be performed. An example notebook demonstrates that idea (for a tiny fake dataset).
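For illustration, a minimal sketch of the dask.delayed approach might look like the following. The function `process_one_file` is a hypothetical stand-in for the real per-file work (subset, interpolate, write out), and the fake random field is an assumption made only so the snippet runs on its own.

```python
import dask
import numpy as np


def process_one_file(index):
    """Hypothetical stand-in for processing one forcing file.

    A real version would open the file, subset it, interpolate it,
    and write the result out. Here we just reduce a fake 2D field.
    """
    rng = np.random.default_rng(index)
    field = rng.random((4, 5))  # fake data, not real forcing
    return float(field.mean())


# Build one lazy task per (fake) file; nothing is executed yet.
tasks = [dask.delayed(process_one_file)(i) for i in range(8)]

# Run all tasks. Because they are independent, the scheduler is free
# to execute them in parallel.
results = dask.compute(*tasks)
```

Since no task depends on another, the same list of tasks could be handed to a distributed scheduler without changing the per-file function.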

Alternatively, maybe we want to create a single xarray dataset for all the input data and use dask by calling xarray methods on that dataset.
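A rough sketch of this second option, using a small synthetic dataset in place of the real forcing files. The variable name `t2m`, the grid, and the chunk size are all assumptions for illustration; with real data the dataset would more likely come from something like `xr.open_mfdataset`.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the forcing data (hypothetical variable "t2m").
time = np.arange(24)
lat = np.linspace(-10, 10, 21)
lon = np.linspace(100, 120, 41)
data = np.random.default_rng(0).random((time.size, lat.size, lon.size))
ds = xr.Dataset(
    {"t2m": (("time", "lat", "lon"), data)},
    coords={"time": time, "lat": lat, "lon": lon},
)

# Split the dataset into dask chunks along time; operations on it are now lazy.
ds_chunked = ds.chunk({"time": 6})

# Ordinary xarray calls build a task graph instead of computing immediately.
regional_mean = ds_chunked["t2m"].sel(lat=slice(-5, 5)).mean(dim=("lat", "lon"))

# Trigger the computation; dask processes the chunks, possibly in parallel.
result = regional_mean.compute()
```

The appeal of this route is that the code reads like normal xarray, and only the chunks needed at any moment have to be held in memory.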

Ideally we could just fit the whole forcing dataset in memory, but I think it's too big for that.

Either way, the first step is to express the operations we need to do on the original forcing dataset in xarray code.
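As a starting point, subsetting and interpolating could be expressed roughly as below. This is only a sketch: it assumes a rectilinear lat/lon target grid (a real ROMS grid may need a different regridding approach), uses a hypothetical variable name `t2m`, and builds a fake field that is linear in latitude and longitude. Note that `Dataset.interp` relies on scipy being installed.

```python
import numpy as np
import xarray as xr

# Fake coarse forcing data on a 1-degree grid (hypothetical variable "t2m").
lat = np.arange(-10.0, 11.0, 1.0)
lon = np.arange(100.0, 121.0, 1.0)
time = np.arange(3)
values = lat[:, None] + 0.5 * lon[None, :]  # linear in lat and lon
data = np.broadcast_to(values, (time.size, lat.size, lon.size)).copy()
forcing = xr.Dataset(
    {"t2m": (("time", "lat", "lon"), data)},
    coords={"time": time, "lat": lat, "lon": lon},
)

# Subset to a region slightly larger than the target domain.
subset = forcing.sel(lat=slice(-6, 6), lon=slice(104, 116))

# Interpolate onto a finer (assumed rectilinear) target grid.
target_lat = np.linspace(-5, 5, 41)
target_lon = np.linspace(105, 115, 41)
on_target = subset.interp(lat=target_lat, lon=target_lon, method="linear")
```

The result could then be written out with `on_target.to_netcdf(...)`, given a NetCDF backend; that step is left out here to keep the sketch free of file I/O.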

NoraLoose commented 1 month ago

Related to this, we need to discuss how the Python package will access the ERA5 data. Should users download the necessary data from ECMWF themselves, separately, or do we want to support that through our Python package? @TomNicholas @sdbachman @matt-long @ubbu36