Right now, if the signer crashes while processing reconnect dance step 1 or 2, it reconnects with a new client ID, and the broker gets stuck in an infinite loop trying to send that step to the previous client ID.
The broker will also not do the reconnect dance with any clients that reconnect afterwards, because the `pub_and_wait` function below never returns. As a result, it never processes any subsequent messages sent on the channel.
This is because when sending to a specific client ID (the case for the reconnect dance steps), `pub_and_wait` keeps sending the same message to the same client ID forever, until some message is returned.
If that message causes a crash on the signer side, the signer reconnects with a different client ID, and hence the broker never gets a response.
https://github.com/stakwork/sphinx-key/blob/9d7e8b751f8ae49a12516588ae0add0ca640e75f/broker/src/mqtt.rs#L64-L68
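
One possible direction, as a rough sketch only and not the actual broker code: bound the retry loop so that it gives up after a fixed number of attempts and returns control to the caller, which can then go back to processing the channel. Everything here is hypothetical (`pub_and_wait_bounded`, `PubOutcome`, the `publish` closure, and the use of a std `mpsc` channel for replies); the real `pub_and_wait` may have a different signature and transport.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical result type, not part of sphinx-key.
#[derive(Debug, PartialEq)]
pub enum PubOutcome {
    /// The targeted client answered with this payload.
    Reply(Vec<u8>),
    /// No answer after all attempts (e.g. the client crashed and came back
    /// under a new client ID), or the reply channel was closed.
    GaveUp,
}

/// Hypothetical bounded variant of `pub_and_wait`.
/// `publish` stands in for "send the message to the target client ID".
/// Each attempt publishes once, then waits up to `timeout` for a reply.
pub fn pub_and_wait_bounded<F>(
    mut publish: F,
    replies: &Receiver<Vec<u8>>,
    timeout: Duration,
    max_attempts: u32,
) -> PubOutcome
where
    F: FnMut(),
{
    for _ in 0..max_attempts {
        publish();
        match replies.recv_timeout(timeout) {
            Ok(reply) => return PubOutcome::Reply(reply),
            // No reply yet: retry, but only up to `max_attempts` times.
            Err(RecvTimeoutError::Timeout) => continue,
            // Nobody can ever reply on this channel: stop immediately.
            Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    PubOutcome::GaveUp
}
```

With something like this, a `GaveUp` result during dance step 1 or 2 would let the broker drop the stale client ID and restart the reconnect dance with whichever client connects next, instead of blocking forever.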