contrem / arduino-timer

Non-blocking library for delaying function calls
BSD 3-Clause "New" or "Revised" License
334 stars 50 forks source link

Mixing high frequency tasks with long running tasks disable the high-frequency ones. (in the return value of tick) #15

Closed DavyLandman closed 3 years ago

DavyLandman commented 3 years ago

This is a very nice compact library! Just what I needed.

I ended up splitting my code into 2 timers, because I observed that if I register an task to run every 10ms, and one a slow one to run every 1s, but the slow one might take 200ms, the high frequency task would not get triggered.

At least not when respecting the return value of the tick() function. If I ran the tick() twice after each other, it would go fine. Since the slow one wouldn't trigger, but the fast one would, and it would now return a much smaller delay time.

rough code:

void setup() {
   Serial.begin();
   timer.every(1000, [](void*) -> bool{ delay(200); return true);
   timer.every(10, [](void*) -> bool { blinkLed(); return true});
}

void loop() {
   unsigned long sleepTime = timer.tick();
    if (sleepTime > 10) {
       Serial.print("Sleeping longer than high frequency max: ");
       Serial.println(sleepTime);
    }
   delay(sleepTime);
}

Version: 2.1.0

philj404 commented 3 years ago

This behavior is what I would expect... but there might be some workarounds which could help you.

DavyLandman commented 3 years ago

Thanks for your reply.

  • Try to move EVERYTHING into a timer task. Delays (and useful code!) in loop() will always slow all timer tasks. The tasks will not run until loop() calls timer.tick(). If you really need some code to run every loop you can use timer.every(0, doItEveryLoop).
  • Ideally loop() only has timer.tick() in it.

I've moved everything into timer tasks, it's a nice abstraction where your library helped really nicely 👍

  • If you can, break your large task in to smaller parts. Part1 can start part2 with timer.at(0, part2). Part2 can start part3, etc.. Be careful to ensure Part1 does not restart until all the other parts have finished. I forgot to do this once -- I was very confused for a while.

The larger task is an MQTT message that is send, and as far as I know, there is no good async TCP library on esp8266, that would have indeed been nicer.

  • If your high-frequency task has a hard deadline, you may want to use timer interrupts. For example if you don't want your scrolling LED display to stutter while logging data, a timer interrupt task could help.

  • The arduino-timer library is cooperative -- it waits for a task to finish before searching for another task to run. It does not use any extra hardware resources or get in the way of a hardware timer interrupt task. You should be able to use both arduino-timer and hardware timer interrupts at the same time.

So I understand it's all a software timer, and that is fine enough for my purpose, it's not safety critical, and missing a few frames of the high frequency task sometimes is okay.

This behavior is what I would expect... but there might be some workarounds which could help you.

But if I schedule like an MQTT update every 10 seconds, and a sensor update every 1 second, and the MQTT update took 1.2s I would be okay with missing the 0.2 delayed tick, but less by missing the next 8 ticks of the sensor update. That I do not really understand?

So the device needs a bit of sleep/delay to save battery&heat, so that is why I intended to use the result of the tick call. Am I using it wrong in the example?

philj404 commented 3 years ago

FYI You need to thank @contrem for this nice library. I'm just a happy user.

Inspecting the code it appears that you will get most of those later sensor updates, but they will all be 0.2 seconds later than originally expected. Then with the next MQTT update, subsequent sensor updates will be delayed a further 0.2 seconds. This could cause problems with plotting data over long time intervals (or even knowing how long "5000 measurements" took).

I suspect the MQTT update communication makes a handshake/acknowledgement with other computers, so I could see that any retries might make the actual delay inconsistent.

A workaround for THAT might be reporting the time of measurements along with the value (as a y-t pair or a tuple of multiple readings). Your readings would be correct... even if you can't guarantee X samples over one minute. The uneven time intervals make plotting more complicated (harder to interpret trends correctly).

I had not thought of using the delay() in loop() like that, to minimize power. I think it might work, as long as your version delay() goes to a low-power mode. (I am using the timer with robot motor controllers. Minimizing CPU power doesn't make much difference to my application.)

... (Thinking a bit more) You may be able to use a timer interrupt (interrupt service routine or ISR) to load a queue with sensor data. This could have (nearly) consistent timing.

Then periodically your MQTT task disables interrupts, flushes all queued readings to an output buffer, and re-enables interrupts. This frees up the queue for regular interrupt processing. When your MQTT task runs it unloads the buffer, so it can take as much time as it needs.

So long as the MQTT buffer can save and forward all the recent sensor readings you might get what you need. I hope (I haven't tried this).

DavyLandman commented 3 years ago

Sorry, I was confused by your answer, should have checked username. Would love to hear back from @contrem when he has some time.

kpfleming commented 3 years ago

This may be related...

If you are using the output of tick() to delay until the next timer is going to expire, you can run into this sort of problem. In my fork of this library I've changed the code a bit, so that the 'next expiration' time is not computed until after all timers which have expired have been executed (instead of computing the 'next expiration' for each timer as it is handled). The net difference is that if a timer which has not yet expired when it is processed expires during the processing of another timer in the same tick() call, it will cause tick() to return a 'next expiration' of zero, so that no sleep will happen at all.

The difference is here ->

original: https://github.com/contrem/arduino-timer/blob/721bfa237883126f392ccaf45d0bc6d75063cf05/src/arduino-timer.h#L117

modified: https://github.com/kpfleming/arduino-timer-cpp17/blob/d3fae08ec1ecb17752c105abfd8b3abbb1b0f651/src/arduino-timer-cpp17.hpp#L322

Unfortunately my library is not a drop-in replacement for contrem's library, but a similar change could be made here.

contrem commented 3 years ago

Thanks for reporting this issue.

After a quick look I believe I've found the cause. The ticks remaining need to be calculated for repeating tasks after they're run.

Without this, for the following:

setup() {
  timer.every(1000UL, ...);
  timer.every(100000UL, ...);
}

loop() {
  timer.tick();
}

Simulation gives:

tick: 900 tick: 700 tick: 500 tick: 300 tick: 100 tick: 98700 tick: 600 tick: 400 tick: 200 tick: 97700 

But the expected output should be:

tick: 900 tick: 700 tick: 500 tick: 300 tick: 100 tick: 1000 tick: 600 tick: 400 tick: 200 tick: 1000

The following patch should fix the issue:

diff --git a/src/arduino-timer.h b/src/arduino-timer.h
index 48a56d0..26a4463 100644
--- a/src/arduino-timer.h
+++ b/src/arduino-timer.h
@@ -112,8 +112,10 @@ class Timer {
                 if (duration >= task->expires) {
                     task->repeat = task->handler(task->opaque) && task->repeat;

-                    if (task->repeat) task->start = t;
-                    else remove(task);
+                    if (task->repeat) {
+                        task->start = t;
+                        ticks = task->expires < ticks ? task->expires : ticks;
+                    } else remove(task);
                 } else {
                     const unsigned long remaining = task->expires - duration;
                     ticks = remaining < ticks ? remaining : ticks;

When I get some time I'll test it out on real hardware with the same values and delay() calls as described in the first post.

In the meantime, @DavyLandman you could try applying this patch and see if it fixes your issues.

DavyLandman commented 3 years ago

@contrem yes, great, that fixed it 👍

philj404 commented 3 years ago

After the large/long task completes, should the next tick delay go to zero? The frequent/short task may be running late by then, especially if it's the first task.

Perhaps tick() should return 0 unless NO tasks ran, so we know the current time (our idea of "now") didn't change much while we checked all tasks.

kpfleming commented 3 years ago

I'm not sure that will be sufficient to solve the problem. Let's use an example of two timers.

Timer 1 is scheduled to run every 1000 millis (1 second), and takes 100 millis to execute its handler.

Timer 2 is scheduled to run every 10000 millis (10 seconds), and takes 800 millis to execute its handler.

Let's also assume that when tick() is called, both timers have expired and are due to execute.

Timer 1 will be executed, and when 'ticks' is calculated it will be 1000 (the time for the timer to expire again).

Timer 2 will then be executed on the next iteration of the loop, and when 'ticks' is calculated it will stay 1000, because the time for Timer 2 to expire is 10000. 'ticks' will then be returned so that the caller can call delay() to wait.

However, this is wrong, because 800 of Timer 1's 1000 millis-to-expire elapsed while Timer 2 was executing its handler, and the return value should be 200, not 1000. Delaying 1000 millis will cause Timer 1 to be 800 millis late.

This is why I changed the logic when I wrote my version: first, loop through all of the timers and execute any that have expired the moment that you inspect them. Second, loop through all the timers and compute the time-to-expire from that moment (not from the moment that each timer's handler finished executing), and return the lowest time-to-expire of all of them for delay() to use.

This is only an issue when the user uses the result of tick() to call delay(); if they just always call tick() every time loop() runs this problem will never happen.

DavyLandman commented 3 years ago

I think there are two separate discussions.

  1. I reported a bug where the high frequency timer would be completely disabled after missing 1 trigger. That bug was solved by the patch by @contrem
  2. @kpfleming point is about what to do with the missed "trigger", should it be run immediatly after missing it, or should it be dropped? But that is a different discussion, that might warrant an API extensions.
kpfleming commented 3 years ago

Definitely correct for #1; #2 is not about missing triggers completely, but just delaying too long between calls to tick() and as a result missing triggers.

contrem commented 3 years ago

The following patch should address the issue brought up by @kpfleming by taking into account the time a handler takes to execute.

diff --git a/src/arduino-timer.h b/src/arduino-timer.h
index 48a56d0..f2b936f 100644
--- a/src/arduino-timer.h
+++ b/src/arduino-timer.h
@@ -112,8 +112,16 @@ class Timer {
                 if (duration >= task->expires) {
                     task->repeat = task->handler(task->opaque) && task->repeat;

-                    if (task->repeat) task->start = t;
-                    else remove(task);
+                    if (task->repeat) {
+                        const unsigned long elapsed = time_func() - t;
+                        task->start = t;
+
+                        if (elapsed >= task->expires) ticks = 0;
+                        else {
+                            const unsigned long delta = task->expires - elapsed;
+                            ticks = delta < ticks ? delta : ticks;
+                        }
+                    } else remove(task);
                 } else {
                     const unsigned long remaining = task->expires - duration;
                     ticks = remaining < ticks ? remaining : ticks;
philj404 commented 3 years ago

This change seems to be a significant improvement. I've still seen (slightly) missed deadlines, but my "you missed a deadline" checks may be too stringent.

However my tests have never seen timer.tick() return a value less than 1 -- so a loop delay() would always wait another millisecond. Does this make sense?

Attached is my test sketch for an Arduino Nano. checkTimerDelays.txt

contrem commented 3 years ago

This change seems to be a significant improvement. I've still seen (slightly) missed deadlines, but my "you missed a deadline" checks may be too stringent.

I'll have to look at your testing code before commenting on this.

However my tests have never seen timer.tick() return a value less than 1 -- so a loop delay() would always wait another millisecond. Does this make sense?

This makes sense. By the time tick reaches 1, the next tick (which would be tick 0) is handler execution. With the second patch tick() should only return 0 when there are no tasks pending, or if a task has taken longer than its own period.

The weakness with the current elapsed time calculation is that it doesn't consider total elapsed time of all handlers. So, the ticks returned could be wrong if their total elapsed time is greater than the expiration of the next task. This is why @kpfleming has done it in two loops, and is something I'll look into for the final fix.

contrem commented 3 years ago

This is what i will be testing moving forward:

diff --git a/src/arduino-timer.h b/src/arduino-timer.h
index 48a56d0..df14ddb 100644
--- a/src/arduino-timer.h
+++ b/src/arduino-timer.h
@@ -101,6 +101,7 @@ class Timer {
     tick()
     {
         unsigned long ticks = (unsigned long)-1;
+        unsigned long elapsed, start = time_func();

         for (size_t i = 0; i < max_tasks; ++i) {
             struct task * const task = &tasks[i];
@@ -112,8 +113,10 @@ class Timer {
                 if (duration >= task->expires) {
                     task->repeat = task->handler(task->opaque) && task->repeat;

-                    if (task->repeat) task->start = t;
-                    else remove(task);
+                    if (task->repeat) {
+                        task->start = t;
+                        ticks = task->expires < ticks ? task->expires : ticks;
+                    } else remove(task);
                 } else {
                     const unsigned long remaining = task->expires - duration;
                     ticks = remaining < ticks ? remaining : ticks;
@@ -121,7 +124,12 @@ class Timer {
             }
         }

-        return ticks == (unsigned long)-1 ? 0 : ticks;
+        elapsed = time_func() - start;
+
+        if (elapsed > ticks || ticks == (unsigned long)-1) ticks = 0;
+        else ticks -= elapsed;
+
+        return ticks;
     }

   private:
contrem commented 3 years ago

mc/fix-tick-return contains the current fix for this issue.

kpfleming commented 3 years ago

I was thinking about this technique last night; it's a good idea, and avoids the second loop.

philj404 commented 3 years ago

I just tried mc/fix-tick-return with my test sketch.

Thanks for making this change. It looks better than what I had in mind!

kpfleming commented 3 years ago

I've created a pair of Jupyter notebooks with Python code which simulates the behavior of arduino-timer. The 'one-loop' version should be equivalent to the fix-tick-return branch, and the 'two-loop' version should be equivalent to my implementation.

https://github.com/kpfleming/arduino-timer-simulations

If you open either of the notebooks in the GitHub UI you'll see the results of the last simulation run. If you want to open them interactively, click the 'launch Binder' link in the README file, and you'll be able to adjust the simulation parameters and/or modify the code itself.

If the 'one-loop' version is not a faithful simulation of the fix-tick-return branch please let me know and I'll correct it ASAP (or send a PR and I'll merge it). Note that these simulations use a 16-bit integer instead of a 32-bit integer, to make things easier to read and write, but the behavior is the same. This method also allows you to easily test the behavior during clock wraparound, by just setting startTime to a value near the wraparound point and ensuring that simulationTime is long enough to cause the wraparound to occur.

kpfleming commented 3 years ago

And now I've remembered the original reason I added the second loop, which I failed to propose as a new feature here (sorry about that!).

If the user is using the return value of tick() to delay, and any of their timer handlers modify the timer list in any way, the return value can be wildly off if the modified timer was already processed in the first loop, or if an added timer happened to land in a slot with a lower number than the one the first loop is currently handling.

Using a second loop to compute the return value ensures that the current states of all of the timers, even those which may have been added, removed, or changed, are taken into account. Of course if the user is not using the return value, and just calling tick() repeatedly, then timer modifications inside handlers are not an issue as they'll just be picked up on the next call to tick().

contrem commented 3 years ago

mc/fix-tick-return is now updated with a separate ticks() function, which addresses the issue of modifying a timer within a task.

https://github.com/kpfleming/arduino-timer-simulations

This looks pretty cool. FYI, there are some time rollover tests in the extras/test folder.

Thanks everyone for all the feedback.

contrem commented 3 years ago

Fix merged in a6588e3ead4122e54abf4195465479adb8e41bf7.

TommyC81 commented 3 years ago

Made a quick and dirty proposal to avoid the automatic ticks() processing within the tick() function, see #18.

contrem commented 3 years ago

Closing. Fix is available in version 2.1.1.

@TommyC81 if needed, please open a new issue for #18