ESMValGroup / ESMValCore

ESMValCore: A community tool for pre-processing data from Earth system models in CMIP and running analysis scripts.
https://www.esmvaltool.org
Apache License 2.0

Run preprocessor for multiple regions #1267

Closed Peter9192 closed 3 years ago

Peter9192 commented 3 years ago

I know at least four use cases* where it would be useful to be able to run the same preprocessing chain for multiple regions. Of course I could run the recipe multiple times, but that feels rather convoluted; moreover, I would like to be able to combine the results in the diagnostic script. It would therefore be really nice if I could do something like

preprocessors:
  analyse_regions:
    extract_shape:
      shapefile: shapefiles/ar6.shp
      decomposed: true
      method: contains
      crop: true
      ids:
        - 'N.Europe'
        - 'West&Central-Europe'
        - 'E.Europe'
        - 'Mediterranean'
    area_statistics:
      operator: mean

and get the mean field for each of the specified regions. In this case, it seems similar to seasonal_statistics. Perhaps leaving out the area_statistics should also be an option, but I'm not quite sure how that should behave. I guess it could produce four preprocessed cubes, one for each region, or a single cube with an additional (categorical) dimension.

I guess it could work something like the existing iris coord categorization, although it would have to be applied to lat and lon simultaneously. Perhaps it could be added as an optional (groupby-like) argument to area_statistics, although then it would be tied to that specific preprocessor, while aggregation is perhaps not always desirable.

I'm also not sure whether it should work exclusively for shapefiles with multiple IDs, or whether regions could also be supplied as multiple shapefiles or several bounding boxes (like extract_region).
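To make the groupby-like idea concrete, here is a rough sketch in plain Python of what "one mean per region" could look like when the regions are given as bounding boxes. This is not ESMValCore or iris code: the function `regional_means`, the region names and their bounds are all made up for illustration, and the cos(latitude) weighting is only a simple stand-in for proper grid-cell area weights.

```python
import math

# Hypothetical regions given as bounding boxes
# (lat_min, lat_max, lon_min, lon_max); names and bounds are
# illustrative only and are not taken from ar6.shp.
REGIONS = {
    "north": (50.0, 70.0, 0.0, 40.0),
    "south": (30.0, 50.0, 0.0, 40.0),
}


def regional_means(lats, lons, field, regions):
    """Return {region name: cos(lat)-weighted mean of field inside the box}."""
    means = {}
    for name, (lat_min, lat_max, lon_min, lon_max) in regions.items():
        weighted_sum = 0.0
        weight_total = 0.0
        for i, lat in enumerate(lats):
            if not lat_min <= lat <= lat_max:
                continue
            # cos(lat) approximates the relative area of a grid cell
            weight = math.cos(math.radians(lat))
            for j, lon in enumerate(lons):
                if lon_min <= lon <= lon_max:
                    weighted_sum += weight * field[i][j]
                    weight_total += weight
        # None marks a region that contains no grid points
        means[name] = weighted_sum / weight_total if weight_total else None
    return means


lats = [35.0, 45.0, 55.0, 65.0]
lons = [10.0, 20.0, 30.0]
# Toy field whose value equals the latitude of its row
field = [[lat for _ in lons] for lat in lats]

result = regional_means(lats, lons, field, REGIONS)
for name, value in result.items():
    print(name, value)
```

The result is keyed by region name, which corresponds to the "single cube with an additional (categorical) dimension" option; returning one object per region instead would correspond to the "four preprocessed cubes" option.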

Any ideas, suggestions, comments, or objections as to whether this is useful and how this could best be achieved?

*) the four use cases:

  1. Impact recipe, which provides quick insight into model performance for selected variables. The user request is to provide this information separately for multiple regions.
  2. ClimWIP recipe, calculating weights/constraints for multiple regions.
  3. Hydrology recipes, generating forcing data for multiple catchment areas.
  4. @thomascrocker's work on EUCP lines of evidence

Peter9192 commented 3 years ago

Aha, it seems that my MWE actually works (it crashed before because it was quite inefficient, not because it wasn't valid). Still won't work for extract_region then, but this is good enough for me as of now.