Pull Request Checklist:
[x] This PR addresses an already opened issue (for bug fixes / features)
This PR fixes #80 and fixes #93.
[x] (If applicable) Documentation has been added / updated (for bug fixes / features).
[x] (If applicable) Tests have been added.
[x] This PR does not seem to break the templates.
[x] HISTORY.rst has been updated (with summary of main changes).
[x] Links to the issue (:issue:number) and pull request (:pull:number) have been added.
What kind of change does this PR introduce?
Extends resample to explicitly cover the case where the initial_frequency is coarser than "W", meaning that its timesteps have non-uniform lengths (months, quarters and years do not all span the same number of days).
Weights are the proportion of the days in the target period contributed by each timestep. Ex: MS to YS => weight = daysinmonth / 365 (for a 365-day year).
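As a rough illustration of the weight definition above, the sketch below computes the MS to YS weights for one year using only the standard library. The helper name `month_weights` is hypothetical, and the standard (Gregorian) calendar is assumed; neither is taken from this PR's code.

```python
import calendar


def month_weights(year: int) -> list[float]:
    """Weight of each month when aggregating monthly (MS) data to yearly (YS).

    Hypothetical helper for illustration; assumes the standard Gregorian calendar.
    """
    # Number of days in each of the 12 months of `year`.
    days_in_month = [calendar.monthrange(year, month)[1] for month in range(1, 13)]
    # Total length of the target period: 365, or 366 in a leap year.
    days_in_year = sum(days_in_month)
    return [days / days_in_year for days in days_in_month]


weights = month_weights(2001)
print(weights[0])    # January: 31 / 365
print(weights[1])    # February: 28 / 365
print(sum(weights))  # the weights of one period add up to 1 (up to rounding)
```

Because the weights of a complete period add up to 1, a weighted mean over that period is just the sum of each value multiplied by its weight.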
Weighted resampling is activated for the methods "wind_direction", "mean", "std", "var" and "median". Those are the methods from DataArray.resample for which a weighted version makes sense.
Implemented a generic version that uses resample(...).map(....weighted().op()), with a special case for "median" since that's not a weighted op. Mapping a function is slower than a native reduction, especially since flox can't be used.
Implemented a shortcut for method='mean' that reduces to a plain resample().sum(), meaning it should be much faster and able to use flox. Without dask and on the air dataset, I measured a 3x speedup.
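The two routes described above can be compared on synthetic data. This is a minimal sketch with plain xarray calls, not the code from this PR: the data, the `weighted_mean` helper, and the use of `reindex(..., method="ffill")` to broadcast yearly totals back onto the monthly axis are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two years of synthetic monthly (MS) data; the values are illustrative only.
time = pd.date_range("2001-01-01", periods=24, freq="MS")
da = xr.DataArray(np.arange(24, dtype=float), dims="time", coords={"time": time})

# Length of each monthly timestep, in days.
days = da.time.dt.days_in_month


def weighted_mean(group):
    """Day-weighted mean of one yearly group (hypothetical helper)."""
    return group.weighted(group.time.dt.days_in_month).mean("time")


# Generic route: map a weighted reduction over each yearly group.
generic = da.resample(time="YS").map(weighted_mean)

# Shortcut: normalise the weights per year first, so a plain sum is enough.
days_per_year = days.resample(time="YS").sum()
weights = days / days_per_year.reindex(time=da.time, method="ffill")
shortcut = (da * weights).resample(time="YS").sum()

# Both routes give the same yearly values.
np.testing.assert_allclose(generic.values, shortcut.values)
```

The shortcut only needs an elementwise multiplication followed by a built-in grouped reduction, which is the kind of operation flox can accelerate; the mapped version calls a Python function once per group.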
Extends resample to handle missing values with a new missing argument:
mask: Mask incomplete periods (periods with missing timesteps).
drop: Drop incomplete periods (same criterion).
dict with method=<xclim missing method>: Mask periods failing the specified xclim missing check.
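To make the difference between "mask" and "drop" concrete, here is a hedged sketch that uses plain xarray rather than the new missing argument itself. The completeness test (days covered by the available timesteps versus days in the year, standard calendar assumed) is an assumption chosen for illustration and may differ from the check implemented in this PR. The dict option with an xclim missing method is not covered.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two years of synthetic monthly (MS) data, with June 2002 removed from the time axis.
time = pd.date_range("2001-01-01", periods=24, freq="MS")
full = xr.DataArray(np.ones(24), dims="time", coords={"time": time})
da = full.isel(time=[i for i in range(24) if i != 17])

yearly = da.resample(time="YS").mean()

# Days actually covered by the timesteps present in each year.
covered = da.time.dt.days_in_month.resample(time="YS").sum()
# Days a complete year should contain (standard calendar assumed).
expected = xr.where(yearly.time.dt.is_leap_year, 366, 365)
complete = covered == expected

# "mask"-like behaviour: incomplete years are kept but set to NaN.
masked = yearly.where(complete)
# "drop"-like behaviour: incomplete years are removed from the output.
dropped = yearly.where(complete, drop=True)

print(masked.values)          # 2001 keeps its value, 2002 is NaN
print(dropped.sizes["time"])  # only 2001 remains
```

Masking keeps the output time axis regular, which is convenient for later alignment, while dropping gives a shorter result that contains only trustworthy periods.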
Does this PR introduce a breaking change?
No.
Other information: