sopnic / larytet-master

Automatically exported from code.google.com/p/larytet-master
0 stars 0 forks source link

Moving max and min in IntStatistics #13

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Easy and fast way to implement moving max and min ?

Original issue reported on code.google.com by larytet@gmail.com on 15 Oct 2009 at 3:50

GoogleCodeExporter commented 8 years ago
One approach is to keep ordered list of values stored in the cyclic buffer. 
Maximum
and minimum are last and first entries on the list.
The approach doubles the required RAM and in case of large sets (multi-hour
statistics) degraded performance of the Add()

Original comment by larytet@gmail.com on 15 Oct 2009 at 3:55

GoogleCodeExporter commented 8 years ago
The approach was used in the class IntMaxMin

How to compute rolling standard deviation ?

Original comment by larytet@gmail.com on 15 Oct 2009 at 3:10

GoogleCodeExporter commented 8 years ago
As for min / max - if we want avoiding re-computing, keep its value somewhere 
(define
new member (property?) holding the value) and compare the added item to the min 
/ max
stored there, update it if necessary.  

Same approach for the moving average - NewMean = OldMean + (NewItem - OldItem) /
CountOfItems . Note - this is for constant CountOfItems. As I understand it's 
already
implemented in IntStatistic, int summ ?

Std. Dev. - I see no other way but re-compute SD when a new value is added (of 
course
if it's different from the one removed). Consider the SD formula:

SD = Sum((X[i] - Mean)^2) / (Count - 1)

In order to update you need first update the Mean, then run in loop over all the
items and sum the squares. Am I right?

BTW, for linear stats - mean, max, min - it's unnecessary to keep the whole 
list at
all (no need in cyclic buffer), just computed value (and, in case of mean - the
length of the list)

Original comment by jerusale...@gmail.com on 16 Oct 2009 at 9:08

GoogleCodeExporter commented 8 years ago
I keep current Max and Min values and every new value in the buffer is compared
against the two. 

The problem arises if I want Max and Min values over last 10 minutes, over last 
1
hours and over last 5 days. This is where cyclic buffer comes into play. 

Cyclic buffer keeps only N latest entries. When entry N+1 is added the oldest 
entry
goes out. Now, let's say that the oldest entry was equal to Max. What should I 
do ?
In the current implementation I keep ordered list of all entries in the cyclic
buffer. When an entry gets dropped from the cyclic buffer i remove the entry 
from the
ordered list. First and last entries of the list contain min and max. The trade 
of is
performance of the Add() operation - i have to keep the ordered list.

Regarding Std Dev I think about something like "Welford's Algorithm for single 
pass
calculation"
The goal is the same - on the fly calculation of SD. When a new entry is added 
to the
buffer and the oldest entry is removed I want to update SD without getting over 
all
entries in the buffer. I want something better than O(n).
Consider, for example, that you want to keep 10min SD, 1 hour SD , 1 day SD and 
1
week SD. Entries (numbers) are getting added all the time. What do you do ?

Original comment by larytet@gmail.com on 17 Oct 2009 at 7:50