nats-io / nats.rs

Rust client for NATS, the cloud native messaging system.
Apache License 2.0

Race condition with multiple connections #1274

Closed nazar-pc closed 2 weeks ago

nazar-pc commented 3 weeks ago

Observed behavior

2 applications establish 2 connections each.

Application A subscribes to a subject through one connection and sends a message to another app (potentially through a second connection).

Application B receives the message and publishes 2 messages to the subject that application A just subscribed to.

What I observe is that sometimes the first message that B sends to A is lost, even though the subscription on A was definitely established successfully before A sent its message to B, and B only publishes to A's subscription after receiving that message.

I'm not sure whether this is expected behavior, or whether it is a bug in the NATS server rather than the Rust client, but I wanted to report it either way.

Expected behavior

Regardless of which connection the subscription was created on, messages that are sent after the subscription was created should be delivered successfully.

Server and client version

nats-server: v2.10.14
async-nats: 0.35.0

Host environment

Docker CE 5:26.1.3-1~ubuntu.24.04~noble on Ubuntu 24.04 with official NATS container image from Docker Hub

Steps to reproduce

No response

Jarema commented 3 weeks ago

Hey. The NATS protocol is asynchronous by nature. In the context of subscription it means that the successful call to subscribe does not mean it's a registered subscription on the server. It means that the call was successfully dispatched.

That means that if you have two separate applications/connections, a subscribe on one followed closely by a publish on the other may not work under tight timing.

The same pattern always works over a single connection, as NATS commands (pub, sub) are queued and sent in order, so if you subscribe and publish immediately afterwards, that should work just fine.

What worries me a bit, however, is that you first subscribe and then send the message from A to B, which is mostly how the request/reply pattern works in NATS. However, we never had issues with that. Can you maybe share a short reproduction snippet so I can investigate?
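To illustrate the ordering argument above, here is a minimal sketch of a toy server in plain Rust. It is not the NATS server or the async-nats client; `Cmd` and `delivered` are hypothetical names made up for this example, and the model assumes only that the server applies commands in the order they arrive. Commands from one connection always arrive in send order, while commands from two connections can interleave either way.

```rust
use std::collections::HashSet;

/// A protocol command as the toy server sees it (payloads omitted).
/// Hypothetical type for illustration only.
#[derive(Clone, Copy)]
enum Cmd {
    Sub(&'static str),
    Pub(&'static str),
}

/// Applies commands in arrival order and returns how many published
/// messages found a subscription already registered on their subject.
fn delivered(arrival_order: &[Cmd]) -> usize {
    let mut subs: HashSet<&str> = HashSet::new();
    let mut count = 0;
    for cmd in arrival_order {
        match *cmd {
            // Registering a subscription affects only later publishes.
            Cmd::Sub(subject) => {
                subs.insert(subject);
            }
            // A publish with no registered subscriber is silently dropped.
            Cmd::Pub(subject) => {
                if subs.contains(subject) {
                    count += 1;
                }
            }
        }
    }
    count
}
```

With a single connection the order is always `[Sub, Pub]`, so `delivered` returns 1. With two connections the server may see `[Pub, Sub]`, in which case the message is dropped and `delivered` returns 0.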

nazar-pc commented 3 weeks ago

In the context of subscription it means that the successful call to subscribe does not mean it's a registered subscription on the server. It means that the call was successfully dispatched.

I don't think I understand what the difference between successful dispatch and successful subscription means in this context. My expectation was that if I get a subscription stream from the library, then the subscription already exists on the server.

I do not have a short snippet and it only happens sometimes, so as you said, it is very time-sensitive. According to the user, a single connection indeed avoids this issue, so it only happens with messages sent over independent connections to the server.

Jarema commented 3 weeks ago

As I said, NATS is asynchronous by nature. Core NATS is NOT a request-reply pattern.

When a client subscribes, it just sends a SUB call to the server. If everything goes well, that's all that happens. If the server cannot register the subscription, it will asynchronously send an error that you can see in the event_callback on the connection.

We're working right now on having those errors localised so that you can get them on a callback (or stream, TBD) defined on the subscription struct, but that does not change the async nature of the protocol.

It's worth noting that this allows NATS to be as performant as it is.
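As a rough illustration of what "just sends a SUB call" means on the wire, the sketch below builds SUB and PUB frames as plain strings, following the text framing of the NATS client protocol (`SUB <subject> <sid>` and `PUB <subject> <#bytes>`, each terminated by CRLF). The helper names `sub_frame` and `pub_frame` are hypothetical and are not part of async-nats; queue groups and reply subjects are left out for brevity.

```rust
/// Builds a SUB frame: `SUB <subject> <sid>\r\n`.
/// In the default (non-verbose) mode the server sends nothing back
/// when the subscription is registered successfully.
fn sub_frame(subject: &str, sid: u64) -> String {
    format!("SUB {} {}\r\n", subject, sid)
}

/// Builds a PUB frame: `PUB <subject> <#bytes>\r\n<payload>\r\n`.
/// The size is the payload length in bytes, not in characters.
fn pub_frame(subject: &str, payload: &str) -> String {
    format!("PUB {} {}\r\n{}\r\n", subject, payload.len(), payload)
}
```

For example, `sub_frame("a.reply", 1)` produces `SUB a.reply 1\r\n`. Once those bytes are written to the socket the client has nothing further to wait for, which is why the subscribe call can return before the server has processed the frame.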

nazar-pc commented 3 weeks ago

My expectation would be that the server would send either an ack or an error back to the client indicating whether the subscription was successful, and only then would the .subscribe() future resolve on the Rust client side.

derekcollison commented 3 weeks ago

The low-level protocol has an option to send acks for all protocol messages, but that is never used in clients. The error will be delivered async (if there is one).

nazar-pc commented 3 weeks ago

I understand that is the case right now, but from a user-experience perspective this does not quite match the expectations I had. Feel free to close if this is "wontfix", but I believe it would be an improvement to wait for subscription creation acknowledgement before resolving the future.

Jarema commented 2 weeks ago

@nazar-pc there is no subscription acknowledgement sent by the server unless verbose mode is used. None of the modern NATS clients use verbose mode, as it has a substantial performance impact.

Thanks for creating the issue and discussing it. I'm closing it; however, if you'd like to discuss it further, feel free to reopen it or continue the discussion.
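For reference, in verbose mode (the `verbose` field of the CONNECT options in the NATS client protocol) the server acknowledges each protocol message with `+OK`, and reports failures as `-ERR '<message>'`. The sketch below shows how a client could classify such reply lines. `ServerReply` and `parse_reply` are hypothetical names for this example; this is not how async-nats is implemented, and as noted above it does not use verbose mode.

```rust
/// Classification of a single server reply line (hypothetical type).
#[derive(Debug, PartialEq)]
enum ServerReply {
    /// `+OK`: the previous protocol message was accepted (verbose mode only).
    Ok,
    /// `-ERR '<message>'`: the server reported an error.
    Err(String),
    /// Any other line (MSG, PING, INFO, ...), left unparsed here.
    Other(String),
}

/// Parses one line received from the server, with or without trailing CRLF.
fn parse_reply(line: &str) -> ServerReply {
    let line = line.trim_end();
    if line == "+OK" {
        ServerReply::Ok
    } else if let Some(rest) = line.strip_prefix("-ERR") {
        // Drop the surrounding single quotes around the error message.
        ServerReply::Err(rest.trim().trim_matches('\'').to_string())
    } else {
        ServerReply::Other(line.to_string())
    }
}
```

A client that waited for `ServerReply::Ok` after every SUB would get the acknowledgement asked for in this thread, at the cost of an extra server reply per protocol message, which is the performance impact mentioned above.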