Open GoogleCodeExporter opened 9 years ago
One approach is to keep ordered list of values stored in the cyclic buffer.
Maximum
and minimum are last and first entries on the list.
The approach doubles the required RAM and in case of large sets (multi-hour
statistics) degraded performance of the Add()
Original comment by larytet@gmail.com
on 15 Oct 2009 at 3:55
The approach was used in the class IntMaxMin
How to compute rolling standard deviation ?
Original comment by larytet@gmail.com
on 15 Oct 2009 at 3:10
As for min / max - if we want avoiding re-computing, keep its value somewhere
(define
new member (property?) holding the value) and compare the added item to the min
/ max
stored there, update it if necessary.
Same approach for the moving average - NewMean = OldMean + (NewItem - OldItem) /
CountOfItems . Note - this is for constant CountOfItems. As I understand it's
already
implemented in IntStatistic, int summ ?
Std. Dev. - I see no other way but re-compute SD when a new value is added (of
course
if it's different from the one removed). Consider the SD formula:
SD = Sum((X[i] - Mean)^2) / (Count - 1)
In order to update you need first update the Mean, then run in loop over all the
items and sum the squares. Am I right?
BTW, for linear stats - mean, max, min - it's unnecessary to keep the whole
list at
all (no need in cyclic buffer), just computed value (and, in case of mean - the
length of the list)
Original comment by jerusale...@gmail.com
on 16 Oct 2009 at 9:08
I keep current Max and Min values and every new value in the buffer is compared
against the two.
The problem arises if I want Max and Min values over last 10 minutes, over last
1
hours and over last 5 days. This is where cyclic buffer comes into play.
Cyclic buffer keeps only N latest entries. When entry N+1 is added the oldest
entry
goes out. Now, let's say that the oldest entry was equal to Max. What should I
do ?
In the current implementation I keep ordered list of all entries in the cyclic
buffer. When an entry gets dropped from the cyclic buffer i remove the entry
from the
ordered list. First and last entries of the list contain min and max. The trade
of is
performance of the Add() operation - i have to keep the ordered list.
Regarding Std Dev I think about something like "Welford's Algorithm for single
pass
calculation"
The goal is the same - on the fly calculation of SD. When a new entry is added
to the
buffer and the oldest entry is removed I want to update SD without getting over
all
entries in the buffer. I want something better than O(n).
Consider, for example, that you want to keep 10min SD, 1 hour SD , 1 day SD and
1
week SD. Entries (numbers) are getting added all the time. What do you do ?
Original comment by larytet@gmail.com
on 17 Oct 2009 at 7:50
Original issue reported on code.google.com by
larytet@gmail.com
on 15 Oct 2009 at 3:50