fsspec / GSoC-kerchunk-2022

MIT License

Create pangeo-forge recipes #10

Closed rsignell-usgs closed 2 years ago

rsignell-usgs commented 2 years ago

@peterm790, I think it would be useful to create pangeo-forge recipes for ERA5 and other datasets.

I didn't really see the point of using pangeo-forge for our workflows up until now, because:

But @sharkinsspatial convinced me at SciPy2022 that, even if we didn't need pangeo-forge, there was value in capturing the workflow for others using the pangeo-forge formalism. And they are apparently working on the updating part.

At the SciPy sprint on pangeo-forge, I created my first recipe, following the example here: https://pangeo-forge.readthedocs.io/en/latest/pangeo_forge_recipes/tutorials/hdf_reference/reference_cmip6.html

@martindurant, two questions:

martindurant commented 2 years ago

do you think this is a good idea?

It has always been the long-term plan to be able to at least interoperate smoothly with pangeo-forge. Also, I imagine pangeo-forge being a place that produces kerchunk output as a side effect even when the recipe authors actually want zarr. However, kerchunk is still in the experimental phase and the API might yet change significantly, so it seems like extra effort to maintain a set of workflows that depend on it. Whether that maintenance is easier within kerchunk or in pangeo-forge is a matter of opinion.

should I submit an issue to create a GRIB2ReferenceRecipe method, or should it be for a more generic ReferenceRecipe method?

That's a good question. I have started to put some effort into making the APIs of the backends similar, and that's probably the right way to go eventually. However, each backend will take a set of unique arguments appropriate for the format, so any unified recipe will have the equivalent of an unschema-ed **kwargs. So far, pangeo-forge has separate classes for different types of input, so I would err on that side.
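To make the trade-off concrete, here is a minimal sketch in plain Python. It uses neither pangeo-forge nor kerchunk: both classes, their field names, and the `inline_threshold` and `message_filter` options are hypothetical and chosen only for illustration. It shows how an option bag without a schema accepts a misspelled argument silently, while a format-specific class rejects it at construction time.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class GenericReferenceRecipe:
    """Hypothetical unified recipe: backend options travel in an unvalidated dict."""

    urls: List[str]
    file_format: str
    backend_kwargs: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Grib2ReferenceRecipeSketch:
    """Hypothetical format-specific recipe: each option is an explicit field."""

    urls: List[str]
    message_filter: Dict[str, Any] = field(default_factory=dict)  # hypothetical option
    inline_threshold: int = 100  # hypothetical option


# The generic form accepts a misspelled option name without complaint;
# the mistake would only surface later, if at all.
generic = GenericReferenceRecipe(
    urls=["a.grib2"],
    file_format="grib2",
    backend_kwargs={"inline_treshold": 100},  # typo goes unnoticed
)

# The format-specific form rejects the same typo when the object is built,
# because the dataclass-generated __init__ has no such parameter.
typo_caught = False
try:
    Grib2ReferenceRecipeSketch(urls=["a.grib2"], inline_treshold=100)
except TypeError:
    typo_caught = True

# Spelled correctly, the format-specific recipe is created as expected.
specific = Grib2ReferenceRecipeSketch(urls=["a.grib2"], inline_threshold=200)
```

The cost of the second form is one class per input format, which matches the existing pangeo-forge convention described above.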

rsignell-usgs commented 2 years ago

Let's leave pangeo-forge aside for now then, and I'll raise an issue to create GRIB2ReferenceRecipe.