canceling a task while it is running can delete an added task

philj404 commented 1 year ago

I would like to use timer tasks for my finite state machines, but I have been having trouble with them. Basically I want to ensure that at most ONE state machine task is in the timer at any stage. This helps ensure that the number of tasks does not increase uncontrollably if I queue up a new task (for example for an emergency stop). So, what I was planning on doing is cancel the pending task, then add the new task. This works ok EXCEPT when the two timer.cancel()...timer.in() calls are made from within the running task.

Here is my example (built for Arduino Nano -- processor should not matter):

#include <arduino-timer.h>

auto timer = timer_create_default();

static const int WAIT_POLO = 2022;
static const int WAIT_MARCO = 4000;

Timer<>::Task activeMarcoPoloTask = 0;  // NULL

///////////////////////////////////////////////////////////////////////////////
// reschedule a new task in place of the old one
// at most ONE task is scheduled, regardless of where it happens
//
void rescheduleTask(int nextWait, Timer<>::handler_t nextAction) {

#define SHOW_CANCEL_BUG
#ifdef SHOW_CANCEL_BUG
  timer.cancel(activeMarcoPoloTask);
#endif
  activeMarcoPoloTask = timer.in(nextWait, nextAction);
}

///////////////////////////////////////////////////////////////////////////////
bool doMarco(void *) {
  Serial.print("Marco ");
  rescheduleTask(WAIT_POLO, doPolo);
  return false;
}

///////////////////////////////////////////////////////////////////////////////
bool doPolo(void *) {
  Serial.println("Polo");
  rescheduleTask(WAIT_MARCO, doMarco);
  return false;
}

///////////////////////////////////////////////////////////////////////////////
void setup() {
  Serial.begin(115200);
  delay(2000);
  Serial.println(F("Running " __FILE__ ",\nBuilt " __DATE__));

  activeMarcoPoloTask = timer.in(1000, doMarco);
}

///////////////////////////////////////////////////////////////////////////////
void loop() {
  timer.tick();
}

The phrase "Marco Polo" does not repeat when SHOW_CANCEL_BUG is defined -- output freezes as there are no tasks in the timer.

I THINK the problem is that canceling the active task (while it is running) frees up its slot in the timer but the timer does not realize the free slot can now refer to a new task. When a new task uses that slot, the timer cleanup code from the previous task deletes it,

Here's the code for tick():

    template <typename R> void
    tick()
    {
        timer_foreach_task(task) {
            if (task->handler) {
                const unsigned long t = time_func();
                const unsigned long duration = t - task->start;

                if (duration >= task->expires) {
                    task->repeat = task->handler(task->opaque) && task->repeat; // ! THIS MAY BE REPLACED WITH A NEW TASK!

                    if (task->repeat) task->start = t;
                    else remove(task);
                }
            }
        }
    }

So... is this a bug?

Should I submit a pull request with a bugfix and new regression test to confirm it works?

philj404 commented 1 year ago

Here is a proposed fix for Timer<>::tick():

template<typename R> void
  tick() {
    timer_foreach_task(task) {
      if (task->handler) {
        const unsigned long t = time_func();
        const unsigned long duration = t - task->start;

        if (duration >= task->expires) {
          Task activeId = task_id(task);
          bool repeating = task->repeat;

          repeating &= task->handler(task->opaque);

          if (activeId == task_id(task)) {    // the task still exists?
            if (repeating) task->start = t;
            else remove(task);
          }
        }
      }
    }
  }

contrem commented 1 year ago

This comes back to https://github.com/contrem/arduino-timer/discussions/47 - you can't cancel a task from within itself.

In your example you have:

activeMarcoPoloTask = doMarco
in doMarco
-> rescheduleTask
     -> cancel doMarco // BUG - canceling doMarco from within doMarco
     -> add polo - // overwrites doMarco slot with polo, which then gets remove()'d after doMarco completion

The timer array is fixed length. It uses the same memory whether it's full or empty, so you won't get uncontrollable growth if something happens and you start adding tons of tasks. The addition of the new tasks will fail.

In your example you'd have at most 2 active slots, because you only add one task per function, and there are no repeat tasks. Timer as is always needs n + 1 active slots where n is the number of added tasks while one task is running. It has to keep the active slot alive for repeat tasks, or it might fail to repeat it due to lack of space.

The removal logic could be improved in order to gain that one slot, not a huge gain. It could also be improved to handle task cancels from within itself, bigger gain to prevent bugs like these, but probably not a trivial change.

skrobinson commented 1 year ago

My question is "why reschedule there?" I tend to think of Timer<>::handler_t as needing to finish quickly (read a value, set a value, etc.).

What about global nextWait and nextAction (or as attributes of a global object) that are set by do... tasks and used to make changes after timer.tick() in loop()?

Timer<>::handler_t *newMarcoPoloTask = 0;  // NULL
int nextWait;

void loop() {
  timer.tick();
  if (newMarcoPoloTask) {
    timer.cancel(activeMarcoPoloTask);
    activeMarcoPoloTask = timer.in(nextWait, &newMarcoPoloTask);
    newMarcoPoloTask = 0;  // reset flag
  };
}

philj404 commented 1 year ago

@contrem Right. I suspect this is actually an "enhancement" request. It is not a bug as it has been stated "you can't do it"... but I want to do it. So I should make it safe to do so.

The timer is pretty well behaved as it is -- so long as there are slots available everything works, and behavior is well-defined when the timer runs out of slots.

@skrobinson While I could add code elsewhere to modify the task consistently (for example in loop()) , it clutters loop() and the code intent ends up farther away from the rest of the code involved. Ideally I would only want loop() to only call timer.tick();.

Basically when I'm updating the next-state and time-to-run, I don't want to worry about which task is making the change. I want the code to look the same, and after it runs I still only want one timer task running the state machine.

For a contrived example I may want to set the next state to EMERGENCY_STOP. There are two ways to get to that state -- temperature is too high (discovered by the scheduled task) OR the STOP button was pressed manually (a button monitor task). I don't want to worry about which task wants to schedule the EMERGENCY_STOP. A year from now I will not remember why the code has to look different for each case.

I will set up a pull request with my suggested change (and a regression test). It's a little bit easier to understand what I have in mind. If the suggestion does not fit with how arduino-timer should really be used -- for example side effects make other changes harder -- we can reject it.

philj404 commented 1 year ago

See #73 for what I think can work.

contrem / arduino-timer

canceling a task while it is running can delete an added task #71