This is a pretty classic 3-way exit signal race condition where we technically log the wrong reason:
X does ownership checkout
X links to or monitors Y
Y gets allowed on connection owned by X through allow or shared mode
Y does connection checkout
Y exits and sends exit/monitor signal to X
X exits due to signal from Y
X sends monitor signal to Ownership.Owner process
Ownership.Owner receives :DOWN from X, logs owner exited (but it was client that triggered exit) and shuts down
Y sends monitor signal to Ownership.Owner process
OR
X does ownership checkout
Y links to X
Y gets allowed on connection owned by X through allow or shared mode
Y does connection checkout
X exits and sends exit signal to Y
Y exits due to exit signal from X
Y sends monitor signal to Ownership.Owner process
Ownership.Owner receives :DOWN from Y, logs client exited (but it was owner that triggered exit) and shuts down
X sends monitor signal to Ownership.Owner process
Fortunately we disconnect in either case, so the pool is secure. However, we may be able to give better information, or at least avoid misleading the user into thinking a different process caused the crash.
When we have 2 monitors active (one on the owner and one on the client) and we receive a :DOWN, we could use Process.alive? to see if the other process is still alive and use this in the message. If the other process is not alive, we can block to receive its :DOWN and log both. If the other process is alive, it might be helpful to add that information to the message (e.g. "(with client #PID<..>)"). We may also want to special case the situation where the owner is the client.
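A rough sketch of that idea, to make the three cases concrete. Everything here is hypothetical: `DownReasonSketch`, `describe/3`, the shape of the `monitors` map, and the 100 ms timeout are illustration only and are not taken from the Ownership code. It only builds a message string; actual logging and disconnecting are left out. Note that `Process.alive?/1` only accepts local pids, and that the check is itself racy: the other process can still die right after it returns true.

```elixir
defmodule DownReasonSketch do
  @moduledoc false

  # Hypothetical helper, not part of DBConnection.
  # `monitors` is assumed to look like:
  #   %{owner: {owner_pid, owner_ref}, client: {client_pid, client_ref}}
  def describe({:DOWN, ref, :process, pid, reason}, monitors, timeout \\ 100) do
    {role, other_role} = roles(ref, monitors)
    {other_pid, other_ref} = Map.fetch!(monitors, other_role)
    first = "#{role} #{inspect(pid)} exited: #{inspect(reason)}"

    cond do
      other_pid == pid ->
        # Special case: the owner is also the client, so there is no second process.
        first

      Process.alive?(other_pid) ->
        # The other process is still running; mention it for context.
        first <> " (with #{other_role} #{inspect(other_pid)})"

      true ->
        # The other process is already dead, so its :DOWN should be queued or in flight.
        # Block briefly for it so both reasons can be reported together.
        receive do
          {:DOWN, ^other_ref, :process, ^other_pid, other_reason} ->
            first <>
              "; #{other_role} #{inspect(other_pid)} also exited: #{inspect(other_reason)}"
        after
          timeout ->
            first <> "; #{other_role} #{inspect(other_pid)} is no longer alive"
        end
    end
  end

  # Work out which monitor fired by matching on the monitor reference.
  defp roles(ref, %{owner: {_pid, ref}}), do: {:owner, :client}
  defp roles(ref, %{client: {_pid, ref}}), do: {:client, :owner}
end
```

This still cannot say which process triggered the exit, since the order in which the two :DOWN messages arrive is not guaranteed, but reporting both exits avoids blaming only the one that happened to be seen first.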