hyperledger / indy-plenum

Plenum Byzantine Fault Tolerant Protocol
https://wiki.hyperledger.org/display/indy
Apache License 2.0

We had a node that could not connect to indy network due to zmq.Again errors. #1643

Open KimEbert42 opened 12 months ago

KimEbert42 commented 12 months ago

We had a node that could not connect to the indy network. We saw the following message in the log file of the node it was trying to connect to:

aaaaaa could not transmit message to bbbbbb

This message kept repeating.

This appears to be due to a zmq.Again error.

This indicates that zmq PAIR connections may not recover from networking errors.

https://stackoverflow.com/questions/49222514/recovering-from-zmq-error-again-on-a-zmq-pair-socket

The code only logs a message when this error occurs, which does not recover the TCP connection:

https://github.com/hyperledger/indy-plenum/blob/main/stp_zmq/zstack.py#L865

A workaround is to restart the node that displays the message. After the restart, the node attempting to connect is able to connect.
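
Since restarting the node clears the condition, one possible direction is to recreate the socket after repeated send failures instead of only logging. The following is a minimal, library-agnostic sketch of that idea, not the plenum implementation. `ReconnectingSender`, `make_socket`, `retryable` and `max_failures` are hypothetical names introduced here for illustration, and the socket object is only assumed to have `send` and `close` methods.

```python
from typing import Any, Callable, Type


class ReconnectingSender:
    """Hypothetical helper: recreate a socket after repeated send failures."""

    def __init__(self,
                 make_socket: Callable[[], Any],
                 retryable: Type[BaseException],
                 max_failures: int = 3) -> None:
        self._make_socket = make_socket    # builds and connects a fresh socket
        self._retryable = retryable        # exception type treated as "try again"
        self._max_failures = max_failures  # consecutive failures before reconnecting
        self._failures = 0
        self.reconnects = 0                # how many times the socket was rebuilt
        self._sock = make_socket()

    def send(self, msg: bytes) -> bool:
        """Try to send once; return True on success, False on a retryable failure."""
        try:
            self._sock.send(msg)
        except self._retryable:
            self._failures += 1
            if self._failures >= self._max_failures:
                self._reconnect()
            return False
        self._failures = 0
        return True

    def _reconnect(self) -> None:
        # Drop the stuck socket and build a new one instead of only logging.
        self._sock.close()
        self._sock = self._make_socket()
        self._failures = 0
        self.reconnects += 1
```

With pyzmq, `retryable` would presumably be `zmq.Again`, and `make_socket` would create the socket and call `connect` on it. This assumes the send is non-blocking (for example with the `zmq.NOBLOCK` flag), since that is when `zmq.Again` is raised. Whether recreating the socket is safe inside zstack.py has not been verified here.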