ESMValGroup / ESMValCore

ESMValCore: A community tool for pre-processing data from Earth system models in CMIP and running analysis scripts.
https://www.esmvaltool.org
Apache License 2.0

Poor performance of lazy `mask_landsea` preprocessor #2514

Closed schlunma closed 1 week ago

schlunma commented 3 weeks ago

Describe the bug

The `mask_landsea` preprocessor performs poorly on high-resolution data. When running the following recipe with the Dask distributed scheduler, task execution essentially freezes. Removing the masking preprocessor from the recipe resolves the problem.

Example recipe:

documentation:
  title: High-res evaluation
  description: Simple evaluation of high-res model output.
  authors:
    - schlund_manuel

preprocessors:
  nh_land_mean_annual_cycle:
    mask_landsea:
      mask_out: sea
    extract_region:
      start_latitude: 30.0
      end_latitude: 90.0
      start_longitude: 0.0
      end_longitude: 360.0
    area_statistics:
      operator: mean
    monthly_statistics:
      operator: mean
    climate_statistics:
      operator: mean
      period: month

diagnostics:
  annual_cycle:
    variables:
      tas:
        mip: day
        project: CMIP6
        exp: highresSST-present
        preprocessor: nh_land_mean_annual_cycle
        timerange: 1950/1960
    scripts: null

datasets:
  - {dataset: NICAM16-9S, ensemble: r1i1p1f1, grid: gr}

I traced this to the use of `da` instead of `np` here. In addition, the broadcasting never actually happens, because the function's return value is not used.
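To illustrate the second point, here is a minimal sketch of the general pitfall. It is not the actual ESMValCore code: the function names `broadcast_mask_buggy` and `broadcast_mask_fixed` and the array shapes are made up for this example. `np.broadcast_to` (like its `da` counterpart) returns a new array and does not modify its input, so discarding the return value leaves the mask unbroadcast.

```python
import numpy as np


def broadcast_mask_buggy(mask_2d, data_shape):
    """Hypothetical helper: the broadcast result is thrown away."""
    # broadcast_to returns a new array; mask_2d itself is unchanged.
    np.broadcast_to(mask_2d, data_shape)
    return mask_2d


def broadcast_mask_fixed(mask_2d, data_shape):
    """Hypothetical helper: the broadcast result is actually used."""
    return np.broadcast_to(mask_2d, data_shape)


# A 2D (lat, lon) land-sea mask and 3D (time, lat, lon) data.
mask = np.array([[True, False], [False, True]])
data = np.arange(12.0).reshape(3, 2, 2)

buggy = broadcast_mask_buggy(mask, data.shape)
fixed = broadcast_mask_fixed(mask, data.shape)

print(buggy.shape)  # still the 2D shape of the input mask
print(fixed.shape)  # matches the 3D shape of the data

# With the broadcast mask, every time step is masked consistently.
masked = np.ma.masked_array(data, mask=fixed)
```

In the buggy variant the mask keeps its original 2D shape, so any later step that expects a mask with the full shape of the data either fails or has to broadcast again.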

I will provide a fix for that.