ESMValGroup / ESMValCore

ESMValCore: A community tool for pre-processing data from Earth system models in CMIP and running analysis scripts.
https://www.esmvaltool.org
Apache License 2.0
42 stars 38 forks source link

Optimize functions `mask_landsea()`, `mask_landseaice()` and `calculate_volume()` for lazy input #2515

Closed schlunma closed 1 week ago

schlunma commented 3 weeks ago

Description

This PR

Closes #2514


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the ๐Ÿ›  Technical or ๐Ÿงช Scientific review.


To help with the number pull requests:

codecov[bot] commented 3 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 94.74%. Comparing base (343368e) to head (325db6d). Report is 2 commits behind head on main.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #2515 +/- ## ========================================== + Coverage 94.70% 94.74% +0.04% ========================================== Files 248 248 Lines 14002 14012 +10 ========================================== + Hits 13260 13276 +16 + Misses 742 736 -6 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

schlunma commented 2 weeks ago

The question here is how to treat a mix of input arrays. The common choice seems to be that the output is lazy if any of the input arrays are lazy, e.g. np.ones(2) * da.ones(2) results in a Dask array. This kind of makes sense because it provides the lowest memory footprint and still offers the user full control because they can make all input arrays realized if they want realized output. Would that work for you?

Yes, that makes total sense, and also has the benefit of easier code. I chose to make it only depend on the actual data, not on the ancillary variables. I will change that now ๐Ÿ‘

schlunma commented 2 weeks ago

Thanks for the review Bouwe! I implemented all your suggestinos (also for the masking preprocessor here). Could you please have another look?