I've noticed that when the sister is killed quickly after being started, there is (very much repeatable) chance to deadlock the other Bond. Consider this sequence:
Sister sends her first status message. My bond executes AwaitingSister->SisterAlive() which calls Connect(). Connect() stops the connection timer.
Sister dies before sending her second message.
My bond is now in Alive state. However, Heartbeat() has not yet been called (that would be done by the second message which did not come), so heartbeat_timer_ is still off and does not trigger the timeout event.
The fix I sent fixes it on the C++ side. I'm not sure about the Python side, and not sure about the SM source code. The Heartbeat() needs to be called already in the Alive state (it is undefined in AwaitingSister). Quickly skimming through the SM syntax, it doesn't seem to me it would support calling some functions before the transition and some after it.
If that would be the case, I'd suggest calling Heartbeat in the AwaitingSister state (right after calling Connected()) and defining it for AwaitingSister state to do the same as for the Alive state. That would require no additional changes to the SM definition, so it might be preferred.
I'll let the maintainers decide which approach would be better.
I've noticed that when the sister is killed quickly after being started, there is (very much repeatable) chance to deadlock the other Bond. Consider this sequence:
AwaitingSister->SisterAlive()
which callsConnect()
.Connect()
stops the connection timer.Alive
state. However,Heartbeat()
has not yet been called (that would be done by the second message which did not come), soheartbeat_timer_
is still off and does not trigger the timeout event.The fix I sent fixes it on the C++ side. I'm not sure about the Python side, and not sure about the SM source code. The
Heartbeat()
needs to be called already in theAlive
state (it is undefined inAwaitingSister
). Quickly skimming through the SM syntax, it doesn't seem to me it would support calling some functions before the transition and some after it.If that would be the case, I'd suggest calling
Heartbeat
in theAwaitingSister
state (right after callingConnected()
) and defining it forAwaitingSister
state to do the same as for theAlive
state. That would require no additional changes to the SM definition, so it might be preferred.I'll let the maintainers decide which approach would be better.