RobTillaart / RunningAverage

Arduino library to calculate the running average by means of a circular buffer.
MIT License

Would you consider adding function to "RunningAverage" #30

Closed heidnerd closed 3 months ago

heidnerd commented 3 months ago

I'm working with sensors - most of the time the interval between sensing periods is very close such that a simple running average for the hour, 8 hour or day works fine. However I've also been working to make the sensing packages reduce energy if the battery appears to be dropping - and that typically means stretching out time between samples.

This means the data needs to be time weighted averages (TWA).

An easy way is to use the RunningAverage library. In the application, use TWO circular buffers: the first holds the terms of the numerator of the TWA, the second holds the terms of the denominator.

TWA = Sum_of(delta_T x sensor_value) / Sum_of(delta_T)
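The formula above can be sketched as a plain function over two arrays. This is only an illustration of the arithmetic, not library code; the name `timeWeightedAverage` and its parameters are made up for this example.

```cpp
#include <cmath>
#include <cstddef>

// Time-weighted average of n samples.
// value[i] is the sensor reading, deltaT[i] the time span it covers (any unit).
float timeWeightedAverage(const float* value, const float* deltaT, std::size_t n)
{
  float numerator = 0.0f;    // Sum_of(delta_T x sensor_value)
  float denominator = 0.0f;  // Sum_of(delta_T)
  for (std::size_t i = 0; i < n; i++)
  {
    numerator += deltaT[i] * value[i];
    denominator += deltaT[i];
  }
  if (denominator == 0.0f) return NAN;  // no elapsed time, so no average
  return numerator / denominator;
}
```

For readings 10 and 20 covering 1 and 3 time units, this gives (10 + 60) / 4 = 17.5, whereas the unweighted average would be 15.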

By using the existing library, it's now possible to use your existing functions to find the jitter in the interval between measurements. That jitter or variation can mean a lot in the accuracy...

Your existing library code, plus a new function like:

// return the sum of the values currently in the buffer
float RunningAverage::getSum() const
{
  if (_count == 0)
  {
    return NAN;
  }

  return _sum;
}

(added syntax highlighting)
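To show why exposing the sum is cheap, here is a minimal stand-in for a circular buffer that keeps a running sum. It is a hypothetical sketch (the class `MiniRunningAverage` and its fixed size are assumptions for illustration) and not the library's actual source.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical stand-in, NOT the library's source: a fixed-size ring buffer
// that keeps a running sum, so getSum() and getAverage() are O(1).
class MiniRunningAverage
{
public:
  static constexpr uint16_t SIZE = 4;

  void addValue(float value)
  {
    _sum -= _buf[_index];          // drop the oldest sample (0 while filling)
    _buf[_index] = value;
    _sum += value;
    _index = (_index + 1) % SIZE;  // wrap around
    if (_count < SIZE) _count++;
  }

  float getSum() const     { return (_count == 0) ? NAN : _sum; }
  float getAverage() const { return (_count == 0) ? NAN : _sum / _count; }

private:
  float    _buf[SIZE] = {};
  float    _sum   = 0;
  uint16_t _index = 0;
  uint16_t _count = 0;
};
```

After adding 1, 2, 3, 4, 5 to this four-slot buffer, the oldest value (1) has been overwritten, so the sum is 2 + 3 + 4 + 5 = 14 and the average is 3.5.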

RobTillaart commented 3 months ago

Thanks for sharing, sounds like an interesting addition. I need to think about how it affects existing projects. A derived class might be the right choice.

RobTillaart commented 3 months ago

@heidnerd Could it be interesting to make the weight more generic? The default weight is 1; in your proposal the weight is time, but it could also be another unit.

This would imply that addValue(value, weight = 1) would allow the user to add any weight they want.

It would also solve a "time problem" for the first sample: the library cannot assign a time weight to it by itself, so it would have to be zero?
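A generic weight with a default of 1 could look roughly like the sketch below. It is a hypothetical illustration only (the class name `MiniWeightedAverage` and the fixed size are assumptions) and does not reproduce the RA_weight class from the develop branch.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical sketch of a weighted running average, not the RA_weight class.
class MiniWeightedAverage
{
public:
  static constexpr uint16_t SIZE = 8;

  // weight defaults to 1, which reduces to a plain running average
  void addValue(float value, float weight = 1.0f)
  {
    // remove the contribution of the slot that is about to be overwritten
    _sumProduct -= _value[_index] * _weight[_index];
    _sumWeight  -= _weight[_index];
    _value[_index]  = value;
    _weight[_index] = weight;
    _sumProduct += value * weight;
    _sumWeight  += weight;
    _index = (_index + 1) % SIZE;
  }

  float getAverage() const
  {
    if (_sumWeight == 0.0f) return NAN;  // nothing with a weight yet
    return _sumProduct / _sumWeight;
  }

private:
  float    _value[SIZE]  = {};
  float    _weight[SIZE] = {};  // the second array is the extra RAM cost
  float    _sumProduct = 0;
  float    _sumWeight  = 0;
  uint16_t _index = 0;
};
```

With this shape the caller decides what the first sample weighs. Passing a weight of 0 for it simply leaves the average undefined (NAN) until a sample with a non-zero weight arrives.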

RobTillaart commented 3 months ago

@heidnerd Created a first implementation of a RA_weight class in the develop branch. It is derived from RunningAverage and implements the core functions to add a value and to get average.

Please give it a try to see if it meets your needs.

Note however that not all functionality is implemented with weights in mind yet. It needs investigation whether the remaining functions make sense with weights and how to implement them efficiently. The outcome might be that a standalone repo with a stripped interface is more efficient in the end.

(updated)

RobTillaart commented 3 months ago

Note: https://en.wikipedia.org/wiki/Weighted_arithmetic_mean

RobTillaart commented 3 months ago

@heidnerd Note: I kept the implementation in the RA_weight class pretty generic so people have to add the weight themselves with addValue(value, weight). If none is given the weight == 1.

This allows you to add the duration as the weight and other people to use theirs.

Please let me know if there is functionality missing that would logically fit into the RA_weight class.

heidnerd commented 3 months ago

I can come up with some examples. Pretty simple examples of the TWA and the interval jitter would be useful.

I can add in the section of code and see how that works - but I really don't like forking libraries unnecessarily.


Dennis Heidner


RobTillaart commented 3 months ago

I can come up with some examples. Pretty simple examples of the TWA and the interval jitter would be useful.

That would be very informative as keeping the API close to real use cases is good (imho).


I can add in the section of code and see how that works

Your feedback is welcome


but I really don't like forking libraries unnecessarily.

Agreed; however, in my experience the Arduino (IoT) ecosystem is very diverse. I have seen projects working with tiny processors with 2 KB of RAM or less running at 1 MHz, and at the other end things like ESP32s with plenty of room to spare at 240 MHz and beyond. In my role as (Arduino) library developer I must guard that performance and footprint stay more or less stable across versions, especially for often used libraries like RA. Furthermore, the interface must be kept simple so that new tinkerers are not confronted with a steep learning curve (adding weights is not the problem here).

Adding weights to the base RunningAverage library roughly doubles the RAM footprint, and RAM is limited at the low end of the IoT spectrum. Therefore I have to be careful and prefer a separate derived class. Then users can select the class with or the class without weights. Only if a derived class is harder (whatever that means) than a new library does the latter come into sight.
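The doubling can be made concrete with a small compile-time calculation. This is a back-of-the-envelope sketch: the helper `bufferBytes` is hypothetical, and it ignores the few bytes of per-object bookkeeping (sum, index, count).

```cpp
#include <cstddef>
#include <cstdint>

// Buffer bytes needed for `samples` float entries, with or without a
// parallel weight array. Per-object bookkeeping is ignored.
constexpr std::size_t bufferBytes(uint16_t samples, bool withWeights)
{
  return samples * sizeof(float) * (withWeights ? 2u : 1u);
}

static_assert(bufferBytes(60, false) == 60 * sizeof(float), "values only");
static_assert(bufferBytes(60, true) == 2 * bufferBytes(60, false),
              "a weight per sample doubles the buffer");
```

With a 4-byte float, a 60-sample buffer takes 240 bytes, or 480 bytes with weights. On a board with 2 KB of RAM that difference is noticeable, on an ESP32 it is not.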

Hope this helps to understand my thoughts around libraries in general and this issue specifically.

RobTillaart commented 3 months ago

Spent a few hours trying to understand the definition of STDDEV and STDERR in the context of a weighted running average. No conclusion (code) yet.
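For reference, one common definition is the population-style weighted standard deviation, sketched below. It is only one of several candidates (unbiased variants differ depending on whether the weights are counts or reliabilities), so this is not a statement of what the library will implement. The function name is hypothetical.

```cpp
#include <cmath>
#include <cstddef>

// Population-style weighted standard deviation:
//   mean = sum(w * x) / sum(w)
//   var  = sum(w * (x - mean)^2) / sum(w)
float weightedStdDev(const float* x, const float* w, std::size_t n)
{
  float sumW = 0.0f;
  float sumWX = 0.0f;
  for (std::size_t i = 0; i < n; i++)
  {
    sumW  += w[i];
    sumWX += w[i] * x[i];
  }
  if (sumW == 0.0f) return NAN;  // no weight, so no statistics

  const float mean = sumWX / sumW;
  float sumSq = 0.0f;
  for (std::size_t i = 0; i < n; i++)
  {
    const float d = x[i] - mean;
    sumSq += w[i] * d * d;
  }
  return std::sqrt(sumSq / sumW);
}
```

Note that this two-pass form needs all samples and weights in memory, which a circular buffer with a weight array would provide.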

heidnerd commented 3 months ago

I'm still chewing on all the stuff you sent.

Thinking a little differently.

With the instances

RunningAverage raMinute(60);
RunningAverage raHour(60);

two circular arrays are created, one for the minutes and one for the hours.

I'm thinking something like:

RunningAverage raMinute(60);
RunningAverage raHour(60);

RunningAverage raMinute_Sensor_TimeProduct(60);
RunningAverage raHour_Sensor_TimeProduct(60);

RunningAverage raMinute_Intervals(60);
RunningAverage raHour_Intervals(60);

Then

  // grab ms time

  // more code in here

  // calculate elapsed time here someplace

  raMinute.addValue(rn);
  raMinute_Sensor_TimeProduct.addValue(rn * elapsedtime);
  raMinute_Intervals.addValue(elapsedtime);
  samples++;

  if (samples % 60 == 0) {
    // roll the minute buffers up into the hour buffers
    raHour.addValue(raMinute.getAverage());
    raHour_Sensor_TimeProduct.addValue(raMinute_Sensor_TimeProduct.getAverage());
    raHour_Intervals.addValue(raMinute_Intervals.getAverage());
  }

  // then for time weighted,
  // something like

  time_weighted_minute_average = raMinute_Sensor_TimeProduct.getSum() / raMinute_Intervals.getSum();

The hour variation might look like

  time_weighted_hour_average = raHour_Sensor_TimeProduct.getSum() / raHour_Intervals.getSum();

The average time interval between minute and hour samples would be:

  Serial.print(raMinute_Intervals.getAverage(), 4);
  Serial.print(raHour_Intervals.getAverage(), 4);

By using a simple getSum most of the library is left alone. (getSum was proposed in the original post above).
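The minute-level part of this idea can be sketched so that it compiles off-target. `RingAvg` below is a hypothetical stand-in for RunningAverage that models only addValue(), getSum() and getAverage(); `addSample` and `timeWeightedMinuteAverage` are names invented for this example.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for RunningAverage (60 slots), not the library class.
class RingAvg
{
public:
  void addValue(float v)
  {
    _sum += v - _buf[_idx];    // replace the oldest value in the sum
    _buf[_idx] = v;
    _idx = (_idx + 1) % 60;
    if (_cnt < 60) _cnt++;
  }
  float getSum() const     { return _cnt ? _sum : NAN; }
  float getAverage() const { return _cnt ? _sum / _cnt : NAN; }

private:
  float   _buf[60] = {};
  float   _sum = 0;
  uint8_t _idx = 0;
  uint8_t _cnt = 0;
};

RingAvg raMinute;                     // plain readings
RingAvg raMinute_Sensor_TimeProduct;  // reading * elapsed time (numerator)
RingAvg raMinute_Intervals;           // elapsed time (denominator)

// Call once per sample; elapsedMs is the time since the previous sample.
void addSample(float reading, float elapsedMs)
{
  raMinute.addValue(reading);
  raMinute_Sensor_TimeProduct.addValue(reading * elapsedMs);
  raMinute_Intervals.addValue(elapsedMs);
}

// TWA computed outside the buffers, using only their sums.
float timeWeightedMinuteAverage()
{
  return raMinute_Sensor_TimeProduct.getSum() / raMinute_Intervals.getSum();
}
```

Feeding a reading of 10 after 1000 ms and a reading of 20 after 3000 ms gives a plain average of 15 but a time-weighted average of 17.5, while the interval buffer still reports the average spacing (2000 ms) for jitter checks.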

All the other functionality of RunningAverage remains as before. It would still be possible to obtain the std. deviation from raMinute and raHour (the simple circular buffers), and we can get the sensor time jitter from the interval array. We take advantage of being able to get at the internal "sum" to perform the TWA external to the library.

The growth in the size of the arrays is really a decision for the software designer. Most of the applications that users write using TWA would not run on the smallest Arduino UNO, but more likely on an RP2040 or ESP32 with more memory.

In the past most of the TWA calculations would be done on a larger system once the data has been collected - fed into a database then examined. But what I'm trying to do is take advantage of the large RAM available to the ESP and RP2040 and make it possible to better see what the sensor data shows.

TWA for sensors is frequently used with "criteria gases", which are NOx, CO, H2S, SOx, etc. - stuff that is harmful and has maximum limits and daily or work-shift limits.

Using the three circular buffers and the libraries you've already provided means I can easily detect mins, the variation, TWA and display.

I'm hoping that it can be done with as little work as possible on your part. Sorry if I've sidetracked you.

heidnerd commented 3 months ago

For testing, the delay could come from an additional call to the random number function, e.g. delay(random(1, 250)); OR simply use the value of a second call to the random number function as the interval, e.g. addValue(random(250));
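A simple off-target check along these lines: feed a constant reading at jittered intervals and confirm the TWA equals that constant. The sketch uses the C++ `<random>` header instead of Arduino's random(), and the function name is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <random>

// Feed a constant reading at jittered intervals and return the TWA.
// A constant signal must average to itself however uneven the spacing is.
float twaOfConstantSignal(float reading, std::size_t samples, unsigned seed)
{
  std::mt19937 rng(seed);
  // 1..249 ms, the same range as Arduino's random(1, 250),
  // whose upper bound is exclusive
  std::uniform_int_distribution<int> jitterMs(1, 249);

  double sumProduct = 0.0;   // Sum_of(delta_T x sensor_value)
  double sumInterval = 0.0;  // Sum_of(delta_T)
  for (std::size_t i = 0; i < samples; i++)
  {
    const int dt = jitterMs(rng);
    sumProduct += static_cast<double>(reading) * dt;
    sumInterval += dt;
  }
  if (sumInterval == 0.0) return NAN;
  return static_cast<float>(sumProduct / sumInterval);
}
```

The same loop with a varying reading, and the sums replaced by the two circular buffers, would exercise the real sketch.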

RobTillaart commented 3 months ago

So you only need the function getSum()?

I was under the assumption that you needed something far more different. Still, this investigation has taught me a few new things and raised even more new questions ;) So definitely worth the time spent.


I'll park the effort in a draft repo as RunAvgWeight on my TODO list.

RobTillaart commented 3 months ago

@heidnerd Redid the whole bunch ==> see develop branch + PR.

For robustness I have added a number of checks on whether the internal array was allocated. This changed a number of return types, but is backwards compatible.
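As an illustration of what such a guard might look like, here is a hypothetical sketch. The class `GuardedAverage`, its member names and the bool return of addValue() are assumptions made for this example, not the code in the develop branch.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdlib>

// Hypothetical sketch of an allocation guard; names and return types
// are assumptions, not the library's actual code.
class GuardedAverage
{
public:
  explicit GuardedAverage(uint16_t size)
    : _size(size),
      _buf(size ? static_cast<float*>(std::malloc(size * sizeof(float))) : nullptr)
  {
    if (_buf != nullptr)
    {
      for (uint16_t i = 0; i < _size; i++) _buf[i] = 0.0f;
    }
  }
  ~GuardedAverage() { std::free(_buf); }
  GuardedAverage(const GuardedAverage&) = delete;
  GuardedAverage& operator=(const GuardedAverage&) = delete;

  // returns false instead of writing through a null pointer
  bool addValue(float v)
  {
    if (_buf == nullptr) return false;
    _sum += v - _buf[_index];
    _buf[_index] = v;
    _index = (_index + 1) % _size;
    if (_count < _size) _count++;
    return true;
  }

  float getSum() const
  {
    if (_buf == nullptr || _count == 0) return NAN;
    return _sum;
  }

private:
  uint16_t _size;
  float*   _buf;
  float    _sum   = 0;
  uint16_t _index = 0;
  uint16_t _count = 0;
};
```

Callers that ignore the return value keep working as before, which is one way a changed return type can stay backwards compatible.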

Please verify whether the develop branch meets your needs.

heidnerd commented 3 months ago

Correct, only getSum, as simple as possible. Yet that export makes it possible to create TWA by using two or more instances...

I'll see about adding some links on how TWA is used with sensors to help estimate exposures and health risks.

RobTillaart commented 3 months ago

Correct, only getSum, as simple as possible. Yet that export makes it possible to create TWA by using two or more instances...

I'll see about adding some links on how TWA is used with sensors to help estimate exposures and health risks.

Then I will merge the PR asap, there is an additional new function - see #31 - and some improved signatures. Furthermore I will close this issue. Feel free to open a new issue if needed.

heidnerd commented 3 months ago

I saw #31; while I don't need it now, it will be useful.

Thanks, I'll add some links and an explanation. I can also try doing a simple example sketch.

RobTillaart commented 3 months ago

getSum() is now in version 0.4.6, released a few minutes ago.

Again thanks for the issue, the "derived class weighted running average" is on my todo list and will appear sometime in the future.