ChronixDB / chronix.server

The Chronix Server implementation that is based on Apache Solr.
Apache License 2.0
263 stars 29 forks source link

MovingAverage implementation is wrong #82

Closed TPolzer closed 7 years ago

TPolzer commented 7 years ago

The documentation says:

/** 
     * Calculates the moving average of the time series using the following algorithm:
     * <p>
     * 1) Get all points within the defined time window
     * -> Calculate the time and value average sum(values)/#points
     * <p>
     * 2) If the distance of two timestamps (i, j) is larger than the time window
     * -> Ignore the emtpy window
     * -> Use the j as start of the window and continue with step 1)
     *
     * @param timeSeries the time series that is transformed
     * @return the transformed time series
     */

That is not a moving average (and it's jumpy).

FlorianLautenschlager commented 7 years ago

Well that is not a typical moving average that uses a fixed amount of samples. Instead it uses a time frame/window that slides over the time series and hence the amount of samples within a time frame varies. I agree that we should clearly point that out and we should enrich the the documentation. Furthermore we should provide a simple moving average that uses a fixed (user-defined) amount of samples.

TPolzer commented 7 years ago

It doesn't slide, it jumps. So for example, for 1000 (evenly spaced) standard normal distributed points and a window size of 80, this results in a data sequence like this: "smoothed" points Why would I want that?

FlorianLautenschlager commented 7 years ago

I looked into the code and you are right. Thanks for reporting this bug. Just implemented a corrected version.