Closed davetcc closed 3 years ago
I just ran into this with IOAbstraction version 1.7.5 and I think there is a somewhat nasty bug hidden somewhere in here that results in tasks not being run.
I was trying to add a periodic reset of my LCD code every 12 hours. My code is like this:
taskManager.scheduleFixedRate(220, readAndRedraw);
taskManager.scheduleFixedRate(60000, logData);
taskManager.scheduleFixedRate(2400, updateRelayStates);
taskManager.scheduleFixedRate(2500, exitEditingIfIdle);
taskManager.scheduleFixedRate(5000, updateFans);
taskManager.scheduleFixedRate(60000 * 60 * 12, reinitDisplay); // unsupported schedule rate
After three minutes of my arduino running only the highest priority task continued to be called (readAndRedraw
), which essentially hangs my system. If I commented out the line with reinitDisplay
everything worked fine.
It's easy to work around this with an extra layer of code that checks millis()
, e.g. taskManager.scheduleFixedRate(60000, reinitDisplayCheck);
. (code)
Anyways, just wanted to leave a note here in case anyone else runs into this. It'd be great to fix this issue or at least add a warning/error somehow when going over the 1 hour limit.
Many thanks for reporting this, it looks like there must be a problem when the schedule is over a certain size, it is probably overflowing the timer type. I will run the above code with a hardware debugger and find the exact cause of the freeze as a priority.
At the minimum, if the schedule is too large, we should probably not create the task and return the invalid task id.
Our longer-term plan for this is on the first very long schedule, EG over a few minutes, to start a watchdog task that monitors the longer tasks and schedules them as they become ready for execution. These would be scheduled within a few seconds of their intended time basically. Which when the interval is in hours, is probably close enough for most.
They would not be in the highly efficient time-based linked queue, but given they would only be checked every few seconds, that should not matter too much. This would allow very large schedules at around a second accuracy. In fact we could even provide a compiler flag that set the accuracy in milliseconds, defaulting to maybe 2500 millis check frequency for long-running tasks over say 30 minutes.
Actually, the above is probably not a lot of work, so after debugging the cause of the issue, it may be better just to fix this with a basic solution.
In fact with this kind of task, rethinking it produces a better solution. It can be implemented as an event instead of a task, events have variable timing, and can easily be triggered early (in the case of a new task arriving). So we can make it wake up very infrequently unless there's something to do. Adding a new long-running task would just trigger the event immediately where it could re-evaluate everything.
Oh, I wasn't aware of Events but my workaround does seem very similar -- i.e. check every minute, but only execute my task every 12 hours.
A quick thought: I really like the abstraction that taskManager.scheduleFixedRate
provides since it's so easy to understand, integrate, and use. It 'sells well' when you're Googling for libraries and I expect you have a lot of beginners using it to help make code more readable and maintainable. That probably also means you have a lot of users keeping their integration pretty simple (like me), so if scheduleFixedRate
were to just switch to a less-accurate mode if given a large timeout, that would totally work for me. Accuracy within a minute would be fine for my purposes.
Sorry no the event fix is for me to implement the long task support basically, so that you could just schedule as normal even tasks scheduled days away.
At the moment, both repeating and fixed tasks have to occur within around 1.193 hours of scheduling. Repeating tasks must have their interval within the same range. This is because the underlying comparison uses uint32_t to calculate the difference in micro-seconds between jobs.
The timer task type is already able to describe times in milliseconds that extend to nearly 50 days (49.710 to be more exact). We need to make task manager able to handle tasks that are scheduled well into the future.
Many people may not use such scheduling, so any additional actions needed should only be triggered into action on first use. Maybe we need to register a watchdog event on the first use of such a schedule that runs every few minutes and schedules any long-running task not yet scheduled when they are within range for scheduling.
This also keeps the run queue as short as possible.
Instead of explicitly using an unsigned 32-bit integer as the data type why not try something like this:
#define TASKVAL uint32_t
I just finished looking and there's a bunch of places where uint32_t
declarations would have to be updated to the more implicit TASKVAL
.
I also notice some relevant members of type uint8_t
that may justify another similar define macro.
This will be a little tricky as we'll need more macros or some sort of "mode" macro since C++ is a statically typed language and types must be known at compile time. This solution that's more applicable to a pure C environment may still help here with a little tweaking. I'm trying to mess around with this on a fork now.
In this way say a certain platform would only allow for uint32 (just as an example) you could then use something like
#if MY_PLATFORM
TASKVAL = uint32_t
#endif
#if SOMEOTHER_PLATFORM
TASKVAL = uint64_t
#endif
This should still work since this would allow the type to be known by the compiler at compile time.
Sorry my C++ is a bit rusty but it looks like what you did with scheduling size is what we're looking for there. Using a similar setup for Microseconds data type should allow us to work around this.
`#if defined(TM_FORCE_16BIT_SCHEDULER) typedef uint16_t sched_t;
typedef uint32_t sched_t;
Thanks for the feedback, the 16-bit support is to allow 8-bit devices based on AtMega32X to use smaller schedules. It still uses 32-bit math for all the calculations, but only stores very small schedules. We could use a typedef that was different based on architecture or choice, but I think this can also be solved using an event. That way it will work for all, including the many 8-bit users of the library who've also requested this feature.
We have a release of task manager planned for after we finish tcMenu 2.2 (probably a couple of weeks away). Our current thinking is to remove tasks with large schedules from the standard run queue and then add them to a specific long-running event. Then during the polling check, we could see if a task needs to run, and mark the event as triggered if it does. Otherwise, we can just set the time of the next check to be either the lowest time to execute the next task or the largest possible value. This means that there will be very little polling (around once an hour basically) and schedules up the maximum value in ~seconds~ milliseconds can be achieved.
This way, everyone can benefit from the change with not a great amount of additional complexity.
You can take a look at the below to see what I mean about "time of next check" and triggering events.
You see the benefit of this approach is that we can treat all long-running tasks as occurring in milliseconds rather than microseconds. This allows scheduling of tasks that are up to 49 days out, and in fact if we scheduled them in seconds instead, they could be up to years out.
Yeah, I really like that approach much better than what I was thinking of. There may still be a little benefit with what I was thinking but it's more of a band-aid patch in comparison to that.
I'm sure it's obvious that I have a big interest in asynchronous programming in embedded and well basically anything that'll squeeze as much performance as we can get out of the chips we have available now. They are getting better but the main thing is we can much better utilize the resources at our disposal now and I think this approach is going to help immensely with that.
There may still be a little benefit with what I was thinking but it's more of a band-aid patch in comparison to that.
The problem is that it would massively affect performance on a wide range of boards. Most boards do not have native 64-bit support, and even if they do, AFAIK there are very few true 64 bit processors in the embedded domain, especially ones task manager targets.
Further, there are some operations that need to be atomic (even with CAS), IE a guarantee that what we see is truly the latest, and what we write is safely replacing what came before it. These operations often do not even exist for 64-bit types, let alone on 8-bit, even on many 32-bit boards.
I'm sure it's obvious that I have a big interest in asynchronous programming in embedded
That's great, and it's good to have people such as yourself along for the ride! Another advantage of task manager is to make life easier for programmers.
Ah, bummer there's always something like that that you don't see at first. That is a lot of extra work to satisfactorily support 64-bit then! Yeah, there are some 64-bit MPU's that I use but it's quite a pain on the programming side due to immature API support. I was also messing around with Master-slave configurations with 2x 32-bit PIC and ARM MCU's and you see the bottlenecks you'd expect to see.
I'll do more research but I really want to see the upcoming enhancements you mentioned first as I know that'll change how I approach this too. I appreciate that heads up there so I'm not backtracking my work there as much.
Actually, what I expected to be many hours or even days of effort turned out to have a very simple solution. Testing it now..
1/ We create a schedule object globally
TmLongSchedule hourAndHalfSchedule(makeHourSchedule(1, 30), [] { workToDo });
2/ We register it with taskManager
taskManager.registerEvent(&hourAndHalfSchedule);
Or even if you prefer using new
taskManager.registerEvent(new TmLongSchedule(makeHourSchedule(0, 15), [] {
Serial.print(millis());
Serial.print(": Fifteen minutes passed");
}), true);
At the moment, both repeating and fixed tasks have to occur within around 1.193 hours of scheduling. Repeating tasks must have their interval within the same range. This is because the underlying comparison uses uint32_t to calculate the difference in micro-seconds between jobs.
The timer task type is already able to describe times in milliseconds that extend to nearly 50 days (49.710 to be more exact). We need to make task manager able to handle tasks that are scheduled well into the future.
Many people may not use such scheduling, so any additional actions needed should only be triggered into action on first use. Maybe we need to register a watchdog event on the first use of such a schedule that runs every few minutes and schedules any long-running task not yet scheduled when they are within range for scheduling.
This also keeps the run queue as short as possible.