nats-io / stan.js

Node.js client for NATS Streaming
Apache License 2.0

How to ensure a published message's state after reconnect? #126

Closed wallride closed 5 years ago

wallride commented 5 years ago

What happens if the connection breaks just after a message is published but before its ACK is received?

Here's an example:

const guid = connection.publish('foo', 'bar', (err, ackGuid)=>{ /*....*/});
// ... wait a bit to let the message reach the server and be stored
connection.close(); // At this point the ACK handler callback has not been called yet.
// The ACK handler callback is then expected to be called with an error ("stan: Connection closed").

In such cases the message might already have been delivered to subscribers before the publisher finds out that the connection is closed and no ACK was received.

The only obvious way to determine the actual state of the published message is to query the server for it by GUID. Is that possible? What other ways to solve the problem can you suggest?

aricart commented 5 years ago

If the client exits before the ack is received, but the message reached the server, the message is on the server. However, the client won't ever get a confirmation. This means that if you get an ack you can be certain that the server got the message. Otherwise, your application must assume the message failed and should resend.
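The "assume it failed and resend" rule can be sketched roughly as follows. This is only an illustration: `publishUntilAcked` and `maxAttempts` are hypothetical names, and `conn` is assumed to be any object whose `publish(subject, data, callback)` calls back with `(err, guid)`, as in the snippet in the question. A real client would also wait for the connection to be re-established and back off between attempts.

```javascript
// Resend a message until the server acks it or the attempt budget runs out.
// Hypothetical helper, not part of stan.js.
function publishUntilAcked(conn, subject, data, maxAttempts = 3) {
  return new Promise((resolve, reject) => {
    let attempt = 0;
    const tryOnce = () => {
      attempt++;
      conn.publish(subject, data, (err, guid) => {
        // An ack without error means the server has the message.
        if (!err) return resolve({ guid, attempts: attempt });
        // No ack: the message may or may not be stored, so resend.
        if (attempt >= maxAttempts) return reject(err);
        tryOnce();
      });
    };
    tryOnce();
  });
}
```

Because a failed attempt may still have reached the server, this gives at-least-once delivery, so subscribers can see the same payload more than once.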

The GUID returned on a publish operation is useful if you are debugging the stan server with --SDV and looking at the logs.

With the above said, this means your messages should be idempotent, and your client should store the payload safely until an ack is received. If duplicate messages are a problem, you might consider calculating a hash on the client and then having subscribers ignore messages that have matching hashes that have already been processed.
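A minimal sketch of the hash-based duplicate filter, assuming an in-memory set is acceptable for illustration. `messageKey` and `Deduplicator` are hypothetical names, not part of stan.js; a real subscriber would keep the seen keys in durable storage and expire old entries.

```javascript
const crypto = require('crypto');

// Derive a stable key from the subject and payload (hypothetical helper).
function messageKey(subject, payload) {
  return crypto
    .createHash('sha256')
    .update(subject)
    .update('\0') // separator so ('ab', 'c') and ('a', 'bc') hash differently
    .update(payload)
    .digest('hex');
}

// Subscriber side: run the handler once per distinct key and skip repeats.
class Deduplicator {
  constructor() {
    this.seen = new Set();
  }

  // Returns true if the handler ran, false if the message was a duplicate.
  handle(subject, payload, handler) {
    const key = messageKey(subject, payload);
    if (this.seen.has(key)) return false;
    handler(payload);
    // Record the key only after the handler returns without throwing.
    this.seen.add(key);
    return true;
  }
}
```

Hashing the payload alone treats two intentionally identical messages as duplicates, so in practice the publisher would likely embed a unique ID in the payload before hashing.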

In the case where your application is actually closing the connection, you would need to implement some sort of drain protocol, which means that you shouldn't be closing the connection (or subscription) until after any acks you are expecting back are received.
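One possible shape for such a drain step is sketched below. `DrainingPublisher` and `drainAndClose` are hypothetical names; the wrapper only assumes that `conn` exposes `publish(subject, data, callback)` and `close()` as used in the question. It counts publishes whose ack callback has not fired yet and closes the connection only once that count reaches zero.

```javascript
// Wraps a connection and delays close() until all expected acks have arrived.
// Hypothetical wrapper, not part of stan.js.
class DrainingPublisher {
  constructor(conn) {
    this.conn = conn;
    this.inFlight = 0;      // publishes still waiting for an ack callback
    this.idleWaiters = [];  // resolvers to call when inFlight drops to zero
    this.draining = false;
  }

  publish(subject, data) {
    if (this.draining) {
      return Promise.reject(new Error('publisher is draining'));
    }
    this.inFlight++;
    return new Promise((resolve, reject) => {
      this.conn.publish(subject, data, (err, guid) => {
        this.inFlight--;
        if (err) reject(err);
        else resolve(guid);
        if (this.inFlight === 0) {
          this.idleWaiters.splice(0).forEach((wake) => wake());
        }
      });
    });
  }

  // Stop accepting new publishes, wait for outstanding acks, then close.
  async drainAndClose() {
    this.draining = true;
    if (this.inFlight > 0) {
      await new Promise((resolve) => this.idleWaiters.push(resolve));
    }
    this.conn.close();
  }
}
```

This sketch waits indefinitely; a production version would probably add a timeout and treat any publishes still unacked at that point as failed.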

wallride commented 5 years ago

@aricart, many thanks for your explanation. Sad news, though. Being unable to determine a message's state after a reconnect is the reason we broke up with RabbitMQ and rejected Kafka. The microservice platform we are building MUST NOT duplicate messages on server restarts or connection breaks, and a publish must be repeated only when the message has not been durably accepted by the broker, because the consuming client might not belong to us.

So it seems there is no way to find out the actual message state other than asking the server for it.

May we hope that this kind of feature will appear in the API in the future?