pangeo-forge / cmip6-feedstock

A Pangeo Forge Feedstock for cmip6.
Apache License 2.0

Slow recipe generation prevents scaling #12

Closed: jbusecke closed this issue 1 year ago

jbusecke commented 2 years ago

Now that I am more seriously thinking about scaling this recipe up, I wanted to start a conversation about the performance issues I foresee.

Currently we generate a dictionary of recipes serially in a for loop. For each recipe we make an API query and then generate the keyword arguments that need to be passed to the recipe generation.
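To make the pattern concrete, here is a minimal sketch and not the feedstock's actual code. `query_api`, `make_recipe_kwargs`, the `example.org` URLs, and the dataset ids are all hypothetical stand-ins, and the query is simulated with a short sleep. `build_serial` mirrors the loop described above. `build_threaded` is one possible alternative that overlaps the queries in a thread pool, on the assumption that the queries are network-bound and independent of each other.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def query_api(dataset_id):
    """Hypothetical stand-in for the per-recipe API query (network-bound)."""
    time.sleep(0.05)  # simulate request latency
    return [f"https://example.org/{dataset_id}/file_{i}.nc" for i in range(2)]


def make_recipe_kwargs(dataset_id):
    """Hypothetical helper: query the API, then build the recipe keyword arguments."""
    urls = query_api(dataset_id)
    return {"urls": urls, "target_name": dataset_id}


def build_serial(dataset_ids):
    # The pattern described above: one API query per recipe, one after another.
    recipes = {}
    for dataset_id in dataset_ids:
        recipes[dataset_id] = make_recipe_kwargs(dataset_id)
    return recipes


def build_threaded(dataset_ids, max_workers=8):
    # Possible alternative: overlap the network-bound queries in a thread pool.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(make_recipe_kwargs, dataset_ids)  # preserves input order
        return dict(zip(dataset_ids, results))


dataset_ids = [f"dataset_{i}" for i in range(16)]  # made-up ids for illustration
serial = build_serial(dataset_ids)
threaded = build_threaded(dataset_ids)
```

Both functions return the same dictionary. Whether threads actually help depends on how much of the per-recipe time is spent waiting on the API rather than on building the recipe itself, and on whether the API tolerates concurrent requests.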

If we want to scale this up to thousands of recipes or more, we should spend some time understanding which parts of this machinery take the most time and how we could avoid those costs.
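One way to find out where the time goes is to accumulate wall-clock time per stage inside the loop. The following is only a sketch under assumptions: the three stage functions (`query_api`, `build_kwargs`, `build_recipe`) are hypothetical placeholders with made-up behavior, and the real split between stages in this feedstock may look different.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(float)  # stage name -> accumulated seconds


@contextmanager
def timed(stage):
    """Add the wall-clock time spent inside the block to timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] += time.perf_counter() - start


# Hypothetical placeholders for the three stages of recipe generation.
def query_api(dataset_id):
    time.sleep(0.01)  # simulate request latency
    return [f"{dataset_id}_file.nc"]


def build_kwargs(dataset_id, urls):
    return {"urls": urls, "target_name": dataset_id}


def build_recipe(kwargs):
    return ("recipe", kwargs["target_name"])


recipes = {}
for dataset_id in [f"dataset_{i}" for i in range(5)]:
    with timed("api_query"):
        urls = query_api(dataset_id)
    with timed("kwargs"):
        kwargs = build_kwargs(dataset_id, urls)
    with timed("recipe"):
        recipes[dataset_id] = build_recipe(kwargs)

# Report the slowest stage first.
for stage, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{stage}: {seconds:.3f} s")
```

For a finer-grained view, running the real loop under the standard library's `cProfile` would show which individual calls dominate.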

Some preliminary thoughts:

I intend this to be a loose conversation for now, so if anyone has ideas, please feel free to discuss.

jbusecke commented 2 years ago

I think I made some significant progress in #13, which is now able to scale up to ~100 recipes in a reasonable time.