The system of channels that comprises the internal plumbing of our client provides a simple contract: control plane IPC messages are guaranteed to be delivered, but data plane messages may be dropped at the edges (in the workers) if the client is struggling to maintain the data rate.
However, one ambiguous case merits further scrutiny: the messages we send from the router to workers in upstreamRouter.toWorker and downstreamRouter.toWorker.
These are currently implemented as blocking sends. In theory, a race can occur in which a message is routed to a worker that has entered a different state and is no longer listening on its rx channel. When this happens, the worker's rx buffer begins to fill up. If the buffer becomes full, the blocking send never returns and the system deadlocks.
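To make the failure mode concrete, here is a minimal sketch of a buffered rx channel that nobody is draining. The `worker` type, `trySend`, and `fillUntilFull` are hypothetical names invented for this illustration and are not taken from the client's code. The sketch uses a non-blocking send only so that it can report the full buffer instead of hanging. It is not a proposed fix, since dropping control plane messages would violate the delivery contract.

```go
package main

import "fmt"

// worker is a hypothetical stand-in for a worker; only its rx buffer matters here.
type worker struct {
	rx chan string
}

// trySend attempts a non-blocking send so this demo can observe a full buffer.
// A plain blocking send (w.rx <- msg) would block forever once the buffer is
// full and nothing is receiving.
func trySend(w *worker, msg string) bool {
	select {
	case w.rx <- msg:
		return true
	default:
		return false
	}
}

// fillUntilFull sends to a worker that is not listening and returns how many
// messages the buffer accepted before it filled up.
func fillUntilFull(w *worker) int {
	n := 0
	for trySend(w, "msg") {
		n++
	}
	return n
}

func main() {
	// The buffer size of 4 is arbitrary, chosen only for the demo.
	w := &worker{rx: make(chan string, 4)}
	accepted := fillUntilFull(w)
	fmt.Printf("accepted %d messages; the next blocking send would hang\n", accepted)
}
```

A larger buffer only raises the number of stale messages that can be absorbed before the router blocks.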
We currently rely on two facts: the worker buffers are sufficiently large, and sending a message to a since-departed worker is relatively rare, so the rx buffers probably never fill up in practice.
But we should provide stronger guarantees here.
Potential solutions:
workerFSM structs expose their current state to the outside world as they're executing. If we invent a convention around "active" vs. "inactive" states, the upstreamRouter and downstreamRouter can just avoid sending to a worker that's in an inactive state.
The workerFSM can close its comms channels when it departs the active state and construct new channels whenever it re-enters the active state.
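The first option could be sketched roughly as follows. Everything here is an assumption made for illustration: the `state` field, the `stateActive`/`stateInactive` constants, the `setActive`/`isActive` helpers, and the shape of `toWorker` are hypothetical and do not reflect the real workerFSM or router types.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Hypothetical "active" vs. "inactive" convention for worker states.
const (
	stateInactive int32 = iota
	stateActive
)

// workerFSM is a simplified stand-in for the real struct.
type workerFSM struct {
	state int32       // read and written atomically
	rx    chan string // buffered channel the router sends into
}

// setActive records whether the worker is currently listening on rx.
func (w *workerFSM) setActive(active bool) {
	s := stateInactive
	if active {
		s = stateActive
	}
	atomic.StoreInt32(&w.state, s)
}

// isActive exposes the worker's current state to the outside world.
func (w *workerFSM) isActive() bool {
	return atomic.LoadInt32(&w.state) == stateActive
}

// toWorker sends msg only if the worker reports an active state.
// It returns false when the message was not sent.
func toWorker(w *workerFSM, msg string) bool {
	if !w.isActive() {
		return false
	}
	w.rx <- msg // still a blocking send, as in the current routers
	return true
}

func main() {
	w := &workerFSM{rx: make(chan string, 2)}

	w.setActive(true)
	fmt.Println("sent while active:", toWorker(w, "hello"))

	w.setActive(false)
	fmt.Println("sent while inactive:", toWorker(w, "stale"))
	fmt.Println("messages buffered:", len(w.rx))
}
```

Note that the worker can still change state between the check and the send, so this narrows the race window rather than closing it. The second option has its own caveat in Go: sending on a closed channel panics, so the router would need to coordinate with the worker before each send.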
See also: https://github.com/getlantern/broflake/issues/31