nsqio / nsq

A realtime distributed messaging platform
https://nsq.io
MIT License

Allow only a single client per channel per topic #1461

Closed Misiu closed 11 months ago

Misiu commented 1 year ago

I have a distributed system that has one server (nsqd instance) and multiple client applications. I use NSQ to send messages (job definitions) to clients, and then clients process jobs and report back (on different topic).

Sadly I had a situation where the same client (application with the same configuration) was started on two machines and I got an unwanted behavior, where one job was done by one client and another was sent to the second client (which didn't have required permissions and reported errors). I tracked the error and finally shut down the second app and now everything works as expected.

I started wondering if there is a built-in way to limit the number of clients per channel. After searching the docs I noticed there is a config option -max-channel-consumers with the following description:

maximum channel consumer connection count per nsqd instance (default 0, i.e., unlimited)

Can I ask for clarification?

If I have two topics, topic-a and topic-b, and each topic has a channel with the same name, messages:

Will setting this option to 1 allow only a single client per channel, so that only one client can connect and read from topic-a/messages? What about producers? Will setting this option limit producers? What about network errors and reconnects? If I limit the number of clients to 1 per channel, will this affect restarts/reconnects?

If anyone has used this option, I'd be grateful for any hints.

sravzpublic commented 11 months ago

Hi @Misiu, I have not used this option in production, but from the code: https://github.com/nsqio/nsq/blob/9a8c304460e4bdf7361326e21c5a9fe58892c39f/nsqd/protocol_v2.go#L619 setting 1 client per channel will result in the second subscriber receiving an error on the SUB command, so the second subscriber will not be able to subscribe.
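
To make the described behavior concrete, here is a minimal Go sketch of a per-channel consumer limit. It is a simplified model, not nsqd's actual code: the `channel` type, `subscribe`, `unsubscribe`, and the error text are all hypothetical names chosen for illustration. It assumes the limit is checked separately for each topic/channel pair, so topic-a/messages and topic-b/messages each get their own slot.

```go
package main

import (
	"errors"
	"fmt"
)

// errChannelFull stands in for the error a second subscriber would get on SUB.
// The text is hypothetical; check your nsqd version for the real error.
var errChannelFull = errors.New("channel consumer limit reached")

// channel is a simplified model of one topic/channel pair on one nsqd.
type channel struct {
	maxConsumers int             // 0 means unlimited, like the flag's default
	clients      map[string]bool // IDs of currently subscribed consumers
}

func newChannel(maxConsumers int) *channel {
	return &channel{maxConsumers: maxConsumers, clients: make(map[string]bool)}
}

// subscribe admits a consumer unless the channel is already at its limit.
func (c *channel) subscribe(clientID string) error {
	if c.clients[clientID] {
		return nil // already subscribed
	}
	if c.maxConsumers != 0 && len(c.clients) >= c.maxConsumers {
		return errChannelFull
	}
	c.clients[clientID] = true
	return nil
}

// unsubscribe frees the slot, for example when the connection closes.
func (c *channel) unsubscribe(clientID string) {
	delete(c.clients, clientID)
}

func main() {
	a := newChannel(1) // models topic-a/messages
	b := newChannel(1) // models topic-b/messages

	fmt.Println(a.subscribe("app-1")) // first consumer is admitted
	fmt.Println(a.subscribe("app-2")) // second consumer is rejected
	fmt.Println(b.subscribe("app-2")) // the other topic's channel is unaffected
	a.unsubscribe("app-1")
	fmt.Println(a.subscribe("app-2")) // the slot is free again after a disconnect
}
```

Producers never go through `subscribe` in this model, which matches the point that PUB is not affected by a consumer limit.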

What about producers? Will setting this option limit producers? - Producers should not be impacted as they would be calling PUB command only.

What about network errors and reconnects? If I limit the number of clients to 1 per channel will this affect restarts/reconnects? - You mean: if the only client disconnects due to an error, will that client be able to reconnect? By default, the client times out after 30s https://github.com/nsqio/nsq/blob/9a8c304460e4bdf7361326e21c5a9fe58892c39f/nsqd/client_v2.go#L218C55-L218C55 so you could probably retry the connection with exponential backoff.
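
A rough Go sketch of the exponential backoff retry suggested here, using only the standard library. The `connect` callback is a hypothetical stand-in for whatever the client does to connect and SUB; the delays and attempt count are illustrative values, not recommendations from NSQ.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// backoffDelay returns base doubled once per attempt, capped at max.
func backoffDelay(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

// retry calls connect until it succeeds or maxAttempts is used up,
// waiting with exponential backoff between failed attempts.
func retry(connect func() error, maxAttempts int, base, max time.Duration) error {
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = connect(); err == nil {
			return nil
		}
		if attempt < maxAttempts-1 {
			time.Sleep(backoffDelay(attempt, base, max))
		}
	}
	return fmt.Errorf("giving up after %d attempts: %v", maxAttempts, err)
}

func main() {
	calls := 0
	// Simulated connect: fails twice, as if the old connection still held the slot.
	connect := func() error {
		calls++
		if calls < 3 {
			return errors.New("slot still held by the previous connection")
		}
		return nil
	}
	err := retry(connect, 5, 100*time.Millisecond, 2*time.Second)
	fmt.Println("attempts:", calls, "error:", err)
}
```

With a limit of one consumer, the total retry window would presumably need to be longer than the server-side client timeout, so that a restarted client eventually gets the slot back.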

In our production system we have a similar case where a message should be processed only once (idempotency); we use MongoDB to check whether a message has already been processed. Limiting the number of consumers to 1 will create a single point of failure in the system.
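
The "check whether it was already processed" idea can be sketched in Go as follows. This is an illustration only: `processedStore` is a hypothetical in-memory stand-in for a shared store such as a MongoDB collection, and `handle` and the message IDs are invented for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// processedStore remembers which message IDs were already handled.
// In-memory stand-in; a real deployment would use a store shared by all consumers.
type processedStore struct {
	mu   sync.Mutex
	seen map[string]bool
}

func newProcessedStore() *processedStore {
	return &processedStore{seen: make(map[string]bool)}
}

// markIfNew records the ID and reports true only the first time it is seen.
func (s *processedStore) markIfNew(id string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.seen[id] {
		return false
	}
	s.seen[id] = true
	return true
}

// handle runs work at most once per message ID and reports whether it ran.
func handle(s *processedStore, id string, work func()) bool {
	if !s.markIfNew(id) {
		return false // duplicate delivery, skip
	}
	work()
	return true
}

func main() {
	store := newProcessedStore()
	runs := 0
	for _, id := range []string{"job-1", "job-1", "job-2"} {
		ran := handle(store, id, func() { runs++ })
		fmt.Println(id, "processed:", ran)
	}
	fmt.Println("total runs:", runs)
}
```

Note that this sketch marks a message before running the work, so a job that fails midway would not be retried; whether to mark before or after the work is a design choice that depends on the job.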

mreiferson commented 11 months ago

Thanks for your question. In general, NSQ isn't appropriate as a distributed lock, which is what you're actually asking for. So we wouldn't build this "single client" feature.

Misiu commented 11 months ago

@sravzpublic, @mreiferson thank you for the reply. I'm not looking for a distributed lock feature. My use case is straightforward. Below is the simplified architecture: I have a 1:n topology: a single NSQ server, a single C# REST API server, and multiple client applications. The end user can use REST to send a command to a specific client to run a long-running task. Each client is connected to a database and runs a SQL query.

I had a situation where the same client application (application with the same configuration) was started on two different machines, and because of that 50% of tasks failed (because the second client app was unable to connect to the SQL server). This was the end user's mistake: he migrated the application to a new server and didn't uninstall the old one.

Because of that, I want to avoid this kind of problem in the future.

I'm using -auth-http-address, so the app can connect to a specific channel, but I want to disallow two instances of the same client application connecting to the same channel. -max-channel-consumers seems ideal, because it will (in theory) allow the first client app to connect. When the second app tries to connect, I hope to get a specific error, so I'll be able to show the end user a proper message about the likely cause. I must look up the client timeout in the C# client and add a retry, in case a network error or other error occurs when a single client app is running. I don't need idempotency; I need to be sure that a specific message is sent to a specific client and that there is only a single client app connected to a specific channel.
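
The client-side handling described above could look roughly like this, sketched in Go rather than C# for illustration. Everything here is an assumption: `decide`, the `action` values, and the `CONSUMER_LIMIT` marker are hypothetical, and the real error text returned on SUB has to be taken from the nsqd version and client library in use.

```go
package main

import (
	"fmt"
	"strings"
)

type action int

const (
	retryWithBackoff action = iota // transient failure: try again later
	reportDuplicate                // the channel already has its one consumer
)

// decide maps the text of a failed SUB to a next step. limitMarker is the
// substring that identifies the consumer-limit error (hypothetical here).
func decide(errText, limitMarker string) action {
	if limitMarker != "" && strings.Contains(errText, limitMarker) {
		return reportDuplicate
	}
	return retryWithBackoff
}

func main() {
	const marker = "CONSUMER_LIMIT" // placeholder, not a real nsqd error code
	errs := []string{
		"SUB failed: CONSUMER_LIMIT",
		"dial tcp: connection refused",
	}
	for _, e := range errs {
		if decide(e, marker) == reportDuplicate {
			fmt.Println("another instance with this configuration may already be running")
		} else {
			fmt.Println("temporary error, will retry")
		}
	}
}
```

One caveat with this approach: after a crash, a restarted client might briefly see the limit error too, until the server drops the old connection, so it may be safer to report a duplicate instance only if the error persists across a few retries.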

I hope that now my use case is better described.

If there are better solutions or any potential problems with my idea please let me know.