zephyr-im / zephyr

An institutional/enterprise-scale distributed real-time messaging and notification system

Fix realm boot/shutdown handling #109

Open andersk-auto opened 10 years ago

andersk-auto commented 10 years ago

Ages ago, someone put #ifdef REALM_MGMT around two important pieces of code related to conveying information between realms about when a server is starting up or shutting down. Unfortunately, nothing is or ever has been built with REALM_MGMT defined, which means this communication never happens. Sort of...

realm_shutdown() sends notices to each foreign realm that this server is going away, along with a suggestion of another server to switch to. The sending of these "realm deathgrams" is conditional on REALM_MGMT, but they are always processed when received. Incorrectly-copied comments notwithstanding, a realm deathgram is not the same as an HM deathgram, and does not cause any subscriptions to be removed. Rather, the only processing that occurs is to ACK the message and potentially switch to a new server.

At the other end of the universe, realm_wakeup() (or rlm_wakeup_cb()) is used to notify foreign realms that a server has come up. This allows other realms to switch servers, if necessary, and to resend any of their subscriptions to triples in the realm that just booted. These messages have the reverse problem -- REALM_BOOT notices are always sent, but part of their processing is conditional on REALM_MGMT. In particular, servers receiving a REALM_BOOT always consider switching servers, but if REALM_MGMT is not defined, they do not resend subscriptions or even ACK the REALM_BOOT notice (or forward it to other servers). This, of course, means that REALM_BOOT notices end up being retransmitted repeatedly until all timeouts are exhausted.
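To make the cost of the missing ACK concrete, here is a toy model of the sender side. It is only a sketch: `transmissions_until_done` and `max_retransmits` are hypothetical names for illustration, not anything in the Zephyr source, and the real retransmit schedule may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: count how many times a sender transmits one notice.
 * max_retransmits is an illustrative parameter, not a Zephyr constant. */
static int transmissions_until_done(bool receiver_acks, int max_retransmits)
{
    assert(max_retransmits >= 0);

    int sent = 0;
    for (int attempt = 0; attempt <= max_retransmits; attempt++) {
        sent++;              /* the initial send, then one per expired timeout */
        if (receiver_acks)
            break;           /* an ACK arrives, so the sender stops here */
    }
    return sent;             /* without an ACK, every retry is used up */
}
```

In this model a receiver that ACKs costs one transmission, while a receiver built without REALM_MGMT costs the initial send plus every allowed retransmit.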

At a minimum, we need to ACK and forward REALM_BOOT notices. We should also start sending realm deathgrams when appropriate. I think these are both fairly straightforward changes with easily-understood consequences. I'm somewhat less sure about resending other-realm subscriptions, whose utility is obvious but whose safety is less so.
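Roughly, the receive side of the proposal could be shaped like the sketch below. This is a standalone model, not a patch: `struct boot_actions`, `handle_realm_boot`, and the `resend_subs_enabled` flag are all hypothetical names standing in for the real server code, and the flag stands in for whatever condition ends up gating resubscription.

```c
#include <stdbool.h>

/* Hypothetical record of what the handler did; not a Zephyr type. */
struct boot_actions {
    bool considered_switch;
    bool acked;
    bool forwarded;
    bool resent_subs;
};

/* Sketch of REALM_BOOT processing under the proposal: the ACK and the
 * forward are unconditional, and only resubscription stays gated. */
static struct boot_actions handle_realm_boot(bool resend_subs_enabled)
{
    struct boot_actions done = { false, false, false, false };

    done.considered_switch = true;  /* already happens today */
    done.acked = true;              /* proposed: always ACK */
    done.forwarded = true;          /* proposed: always forward to other servers */

    if (resend_subs_enabled)        /* the part whose safety is less clear */
        done.resent_subs = true;

    return done;
}
```

Keeping resubscription behind its own switch would let the ACK and forwarding fix land on its own, with the riskier behavior enabled later.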

andersk-auto commented 10 years ago

Imported from trac issue 109. Created by jhutz@CS.CMU.EDU on 2013-03-18T20:22:47, last modified: 2013-10-06T21:18:49