prdn / pigato

PIGATO - an high-performance Node.js microservices framework
http://prdn.github.io/pigato/
MIT License
301 stars 31 forks source link

Should Disconnect and hearbeat be symmetrical ? #31

Closed maxired closed 9 years ago

maxired commented 9 years ago

Hi,

I am wondering whether we should change the behavior of heartbeat and disconnect to be symmetrical.

Let me explain :

The goal I am trying to pursue here, is to be able to build a callback system on top of connection/deconnection event, so I never have to setup timeout as a user of the library.

Please let me know your opinion about all of theses.

prdn commented 9 years ago

Hi,

I agree on the Client/Broker heartbeating. My last commit fixes that. Also the client now emits 'connect' (on start or reconnect) and 'disconnect' (on stop or disconnect) events. Not sure if the Broker needs to know if a client is disconnected. I don't think it should care about that and ZeroMQ should discard messages after some time (ZMQ_LINGER socket option).

When a Worker calls 'close' a 'disconnect' message is issued to the Broker. So the Broker knows that the Worker is not alive and the Worker knows that himself is no more connected to the Broker. Am I missing your point here?

maxired commented 9 years ago

Hi,

thanks for you answer and commits. My point was to say that, since we can't be sure that a message has been correctly sended and received, maybe we should send application level acknowledges.

For example, I was thinking that the 'connect' event on the client might be emitted only after the first hearbeat answer received from the broker.

When a Worker calls 'close' a 'disconnect' message is issued to the Broker. So the Broker knows that the Worker is not alive and the Worker knows that himself is no more connected to the Broker.

In fact, in the current implentation, https://github.com/prdn/pigato/blob/44cbd29fd2deeac44ae1089406b144b5ab33e706/lib/Worker.js#L40L43 , the worker is supposed to send a 'disconnect' message, but the next instruction is to close the socket. This means that the message is never issued to the Broker. If we wait for an acknowledge of the W_DISCONNECT, this will not happen.

prdn commented 9 years ago

Hi,

zeromq leave to the developer the logic to handle a disconnection.

One problem with a worker that waits the ack from a broker for its disconnection is that we should not close worker socket until we receive that. Is this something that we want?

The broker implement a requeue solution. When it finds that a disconnected worker has in flight requests it requeues them. That in my opinion is transparent and fault tolerant.

Then I do not want to add too much logic on the client / worker side.

We can move the connection events after the first heartbeat has received. That doesn't prevent to send messages even before that because the linger option takes care of the send queue.

maxired commented 9 years ago

Hi

One problem with a worker that waits the ack from a broker for its disconnection is that we should not close worker socket until we receive that. Is this something that we want?

I don't see any problem with that. We should also probably add a timeout in case the broker doesn't answer.

I otherwise agree with all of your message. Again, in the current implementation, the disconnect message is not sended. The requeue solution works fine but I think it add latency. I'll try to do a unit test to demonstrate that.

prdn commented 9 years ago

Think we have closed with b60ee98ab962bd8d7d5efcdff2e0a5ea8d29262e