Open luca-della-vedova opened 1 month ago
This definitely sounds like a bug.
If I manually add a || true here making this condition always evaluate to true,
This is going to force the waitset to always be rebuilt, which is not good for performance. It's likely that the bug is somewhere else (in how timers are added?)
I probably won't get a chance to look at this until after roscon, but thank you for the full repro example and detailed steps.
Yes, I agree that this sounds like a bug.
So I suspect there is something about how timers (and maybe other entities?) are added to wait sets inside lifecycle node transitions that creates this issue.
I wouldn't exclude that this is "timers-specific" since timers are handled in a slightly different ways from other entities. Moreover, lifecycle nodes do not do anything special here, so the same problem should be reproducible in a regular node (executors don't really care about nodes, they only care about callback groups)
lifecycle nodes do not do anything special here
My assumption as well is that something about the execution order from lifecycle nodes is surfacing this, but it's not something that would be unique to lifecycle nodes.
Hi,
I discovered this bug five days ago at https://github.com/fmrico/cascade_lifecycle/pull/14, and now I am solving a problem that is directly related at PlanSys2 (https://github.com/ros/rosdistro/pull/43212#issuecomment-2418582680). I am solving this now by changing the SingleThreadedExecutors to EventsExecutors in some cases and removing the node before destroying it in other cases.
Basically, if I have two nodes in an executor, each one with a timer, if I destroy one of the nodes, the timer callback in the other node is no longer called.
I can investigate more if you want.
Bug report
Required Info:
Steps to reproduce issue
Expected behavior
Since the service callback destroys only
timer_
,timer2_
should keep running andHello, world!
should keep printing.Actual behavior
All timers are stopped and no
Hello, world!
is printed.Additional information
The following might shed some light into this.
timer2_
is only created aftertimer_
is destroyed) the issue does not arise.|| true
here making this condition always evaluate to true, all timers are executed, however I added some printing when timers are cleaned up / added and I can see that every iteration a timer gets added and removed, so something odd is going on there.So I suspect there is something about how timers (and maybe other entities?) are added to wait sets inside lifecycle node transitions that creates this issue.