Open aj062 opened 7 years ago
It looks like crossbar closes the connection in case of a protocol error. I would say this is a crossbar/autobahn issue. It is very difficult to recover properly from a lost connection, as there are many race conditions: any number of messages can be lost during the reconnection. This is why buildbot just shuts down the master on disconnection. It is not an impossible task, though. Any help is appreciated, as well as proper testing.
I think that, to solve your issue quickly, you can just change your builder names to be identifiers: change "MyBuilder Release Builder" to "MyBuilder_Release_Builder".
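If you have many builders, a small helper can do the renaming for you. This is only a rough sketch, not Buildbot code: `to_identifier` is a hypothetical name, and it assumes that an identifier means letters, digits, and underscores, not starting with a digit. Check the Buildbot documentation for the exact rules before relying on it.

```python
import re


def to_identifier(name: str) -> str:
    """Turn a free-form builder name into an identifier-like string.

    Hypothetical helper. Assumption: an identifier contains only
    letters, digits, and underscores, and does not start with a digit.
    """
    # Collapse every run of disallowed characters into one underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_]+", "_", name.strip())
    # Prefix an underscore if the result is empty or starts with a digit.
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


print(to_identifier("MyBuilder Release Builder"))
```

You would apply it to each builder name in your master.cfg, and use the same converted names in any API requests.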
I am trying a simple multi-master configuration with two masters: one running the web server, and the other handling the rest.
The Buildbot web server stopped abruptly (with the message below in the logs). It seemed to happen just after an API request with an invalid builder name.
http.log in buildbot directory:
Note: Multiple similar API requests with invalid builder names are present in the logs, but those didn't cause Buildbot to crash.
twistd.log:
Crossbar also seems to have logged the error below (in /var/log/messages) at the exact same time, but crossbar seems to have continued working.
/var/log/messages:
Buildbot should be more reliable and shouldn't crash.