Improve resampling options

What's needed?

Users should get more control over resampling:

aggregation: how to aggregate the data points in the interval, e.g. mean, sum, min/max, interpolation (upsampling)
closed: left- or right-closed interval, i.e. ts1 <= ts < ts2 or ts1 < ts <= ts2
label: which timestamp is assigned to the resampled interval. Possible options:
- Fixed-interval according to resampling bins, e.g. start/end of resampling bin (corresponds to oldest/newest possible timestamp), or the center of the resampling bin,
- Derived from the data, e.g. the oldest/newest/average of the timestamps that were aggregated in each resampling bin.
resolution (update): the resolution parameter currently does not support resampling periods smaller than 1s.

In the first version of the API, the label defaults to a fixed-interval timestamp: the start of the resampling bin.
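To make the interplay of aggregation, closed and label concrete, here is a minimal standard-library sketch. It is not the API's implementation: the `resample` function, its string options ("left"/"right", "start"/"end") and the epoch-aligned bins are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean

# Assumption: bins are aligned to the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def resample(samples, resolution, aggregate=mean, closed="left", label="start"):
    """Group (timestamp, value) pairs into fixed bins and aggregate each bin."""
    bins = {}
    for ts, value in samples:
        index, remainder = divmod(ts - EPOCH, resolution)
        # Right-closed bins are (start, end], so a sample exactly on a
        # boundary belongs to the bin that ends there.
        if closed == "right" and not remainder:
            index -= 1
        bins.setdefault(index, []).append(value)

    result = {}
    for index in sorted(bins):
        start = EPOCH + index * resolution
        # Fixed-interval label: start or end of the resampling bin.
        stamp = start if label == "start" else start + resolution
        result[stamp] = aggregate(bins[index])
    return result


# Five samples, 0.5 s apart, with values 1.0 .. 5.0.
t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
samples = [(t0 + timedelta(seconds=0.5 * i), float(i + 1)) for i in range(5)]
one_second = timedelta(seconds=1)

left_mean = resample(samples, one_second)
left_sum = resample(samples, one_second, aggregate=sum)
right_mean = resample(samples, one_second, closed="right", label="end")
```

With this input, the left-closed mean is 1.5, 3.5, 5.0 for the bins starting at t0, t0 + 1 s and t0 + 2 s, while the right-closed variant labelled with the bin end yields 1.0, 2.5, 4.5 for the same three timestamps. The same samples give different series depending only on the interval definition.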

Proposed solution

Support the corresponding parameters in ResamplingOptions.
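As a rough sketch of what such parameters could look like, the following Python stand-in models the options as a dataclass. It is hypothetical throughout: the class `ResamplingOptionsSketch`, the enum names and the defaults for aggregation and closed are assumptions, and only the default label (start of the bin) is taken from the description above. The actual message definition in the API may look quite different.

```python
from dataclasses import dataclass
from datetime import timedelta
from enum import Enum


class Aggregation(Enum):
    MEAN = "mean"
    SUM = "sum"
    MIN = "min"
    MAX = "max"


class Closed(Enum):
    LEFT = "left"    # ts1 <= ts < ts2
    RIGHT = "right"  # ts1 < ts <= ts2


class Label(Enum):
    # Fixed-interval labels, derived from the resampling bin.
    BIN_START = "bin_start"
    BIN_CENTER = "bin_center"
    BIN_END = "bin_end"
    # Data-derived labels, computed from the aggregated timestamps.
    OLDEST_SAMPLE = "oldest_sample"
    NEWEST_SAMPLE = "newest_sample"
    MEAN_SAMPLE = "mean_sample"


@dataclass(frozen=True)
class ResamplingOptionsSketch:
    """Hypothetical stand-in, not the real ResamplingOptions message."""

    resolution: timedelta
    aggregation: Aggregation = Aggregation.MEAN  # assumed default
    closed: Closed = Closed.LEFT                 # assumed default
    label: Label = Label.BIN_START               # default stated in this issue

    def __post_init__(self):
        if self.resolution <= timedelta(0):
            raise ValueError("resolution must be positive")


# A timedelta-based resolution also covers periods below 1 s.
options = ResamplingOptionsSketch(resolution=timedelta(milliseconds=200))
```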

Use cases

Different aggregations make sense if metrics like energy (sum) or peak values (min/max) are of interest.

Different closed options could be helpful if data is compared with external data that may use a different interval definition (e.g. from a DSO or clients).

A label option is required once the closed option can be changed; otherwise the assigned timestamp may fall outside the interval it represents.
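The coupling between closed and label can be checked with a few lines of Python. This is only an illustration: the helper `contains` and the option names are made up for the example.

```python
from datetime import datetime, timedelta, timezone


def contains(start, end, ts, closed):
    """Return True if ts lies in the bin under the given interval definition."""
    if closed == "left":
        return start <= ts < end
    return start < ts <= end


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = start + timedelta(seconds=1)

# For each combination, check whether the label timestamp is itself
# part of the interval it is supposed to represent.
label_inside = {
    (closed, label): contains(start, end, start if label == "start" else end, closed)
    for closed in ("left", "right")
    for label in ("start", "end")
}
```

Only the combinations left/start and right/end place the label inside its own interval. A right-closed bin labelled with its start carries a timestamp that the bin explicitly excludes.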

Resolutions below 1s could be interesting if faster reaction or very short-term forecasts are needed. Since we plan to make resampling mandatory when aggregating components, the shortest resolution for component aggregations would be 1s.
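One way to express this rule is a small validation step, sketched below. The function `check_resolution`, the flag `aggregates_components` and the constant name are hypothetical and only mirror the constraint described here.

```python
from datetime import timedelta

# Assumption taken from this issue: component aggregations are limited to 1 s.
MIN_COMPONENT_AGGREGATION_RESOLUTION = timedelta(seconds=1)


def check_resolution(resolution, aggregates_components):
    """Validate a resampling resolution and return it unchanged."""
    if resolution <= timedelta(0):
        raise ValueError("resolution must be positive")
    if aggregates_components and resolution < MIN_COMPONENT_AGGREGATION_RESOLUTION:
        raise ValueError("component aggregations require a resolution of at least 1 s")
    return resolution


# Sub-second resolution for a single data stream is accepted.
fast = check_resolution(timedelta(milliseconds=250), aggregates_components=False)
```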

Alternatives and workarounds

No response

Additional context

Related to:

Example for resampling options in pandas: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.resample.html

Moving post v1.0:

aggregation: how to aggregate the data points in the interval, e.g. mean, sum, min/max, interpolation (upsampling)

With the current ETL, the aggregation is predetermined; changing the aggregation method would only work on raw data and can be deferred to the client or the user.

closed: left- or right-closed interval, i.e. ts1 <= ts < ts2 or ts1 < ts <= ts2

I don't think that closed has significant practical implications.

label: which timestamp is assigned to the resampled interval. Possible options:

That's easy for the user to fix, or to handle at the client level.
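For fixed-interval labels, a client-side fix amounts to shifting the timestamps. A minimal sketch, assuming the series is a mapping from bin-start timestamps to values (the `relabel` helper and the label names are illustrative only):

```python
from datetime import datetime, timedelta, timezone


def relabel(series, resolution, label):
    """Shift bin-start timestamps to the start, center or end of each bin."""
    shift = {
        "start": timedelta(0),
        "center": resolution / 2,
        "end": resolution,
    }[label]
    return {ts + shift: value for ts, value in series.items()}


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
resolution = timedelta(seconds=1)
series = {t0: 1.5, t0 + resolution: 3.5}

centered = relabel(series, resolution, "center")
end_labelled = relabel(series, resolution, "end")
```

Data-derived labels (oldest, newest or average sample timestamp) cannot be recovered this way, since they require the raw timestamps within each bin.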

resolution (update): the resolution parameter currently does not support resampling periods smaller than 1s.

At least for our current data streams with a handful of samples per second this is of minor importance.

frequenz-floss / frequenz-api-reporting