Limit number of filter subjects on consumers

ripienaar commented 3 weeks ago

Proposed change

Create limits for multi subject filter consumers:

Default maximum of 100 subjects
A server configuration option to override that limit of 100
A hard maximum of 1000 that configuration cannot exceed
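
As an illustration of how the three limits above could interact, here is a minimal sketch in Go. The names `effectiveFilterLimit` and `checkFilterSubjects` are hypothetical and not part of nats-server, and clamping an oversized override to the hard maximum (instead of rejecting the configuration) is an assumption; the proposal does not say which behavior is intended.

```go
package main

import "fmt"

// Values taken from the proposal above; the constant names are hypothetical.
const (
	defaultMaxFilterSubjects = 100  // limit when nothing is configured
	hardMaxFilterSubjects    = 1000 // ceiling that configuration cannot exceed
)

// effectiveFilterLimit returns the limit to enforce for a given configured
// override. A value <= 0 is treated as "not configured" (an assumption).
func effectiveFilterLimit(configured int) int {
	if configured <= 0 {
		return defaultMaxFilterSubjects
	}
	if configured > hardMaxFilterSubjects {
		// Assumption: clamp rather than fail on an override above the hard maximum.
		return hardMaxFilterSubjects
	}
	return configured
}

// checkFilterSubjects rejects a consumer whose filter subject list is longer
// than the effective limit.
func checkFilterSubjects(filters []string, configured int) error {
	limit := effectiveFilterLimit(configured)
	if len(filters) > limit {
		return fmt.Errorf("consumer has %d filter subjects, limit is %d", len(filters), limit)
	}
	return nil
}
```

With no override, a consumer with 101 filter subjects would be rejected; with an override of 5000, the enforced limit in this sketch would still be 1000.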

Use case

The multi subject filtering feature should not be used to list 100s or 1000s of subjects; a better subject design should be used instead.

Limits will communicate our intent with this feature

Contribution

No response

ripienaar commented 3 weeks ago

@derekcollison as discussed last week, I opened this. We didn't discuss the idea of a hard maximum around 1k subjects, but it seems like a good idea to me. What do you think?

nickchomey commented 3 weeks ago

I've been planning to use NATS for an SSE mechanism, where browser clients would create an HTTP SSE connection and then a single NATS filtersubjects consumer (or KV multi watcher, when that's available in all of the SDKs) would subscribe to all connected users' specific subjects. When an update comes in for a subject, the SSE connection will send the message out to that client.

It seems to me that this would be much more efficient than creating a new consumer/watcher per connected client.

If this hard limit were to be implemented, I wouldn't be able to use this for more than 1000 connected users. I assume you have good performance reasons for doing so though.

Do you have a suggestion of how I could model and handle a scenario like this?

ripienaar commented 3 weeks ago

@nickchomey the reason for these limits is that it's just not working at all with big sets of subjects, but also, as stated, a better design should be made. Even at a thousand I think you'll have significant issues.

You could perhaps use an ephemeral consumer to get historical data for a specific SSE client, and from there use the stream repub feature to handle new messages without the need for consumers. Or use the new batch direct get work in 2.11 to do the initial catch-up, or a similar idea.

nickchomey commented 3 weeks ago

Thanks for the response! Are you suggesting creating an ephemeral consumer for each connected browser client? Wouldn't that require more resources than just one consumer that subscribes to one subject per connected user?

I just read the Repub doc, but it isn't clear to me what it does, let alone how it might be helpful in this context. Would you mind elaborating a bit more?

Direct get (is this the right place to read about it?) seems like it would require a sort of polling architecture, rather than just having changes pushed out to browser clients automatically.

I can't really think of a better subject design - each user will have their own subject with sub-subjects. Rather than subscribing to all users and receiving an immense amount of irrelevant messages (if there were say 20000, let alone millions, of users), it would just receive those which are currently necessary. But perhaps I just don't have enough imagination on how to better model something like this?

Or perhaps the unnecessary bandwidth usage and cpu load of filtering out all of the unnecessary messages in the nats client is less than the cpu load of filtering within the nats server?

Or is this just a use case that NATS can't really handle?

ripienaar commented 3 weeks ago

Do your clients need historical data when they start? Use an ephemeral consumer / direct get just to get the history and then remove the ephemeral. So it's a short-lived consumer to get historical data. From that point the repub feature will deliver any current updates to you.

Republish runs on the stream: as it stores a message, it also publishes it to a subject on core NATS where a normal subscribe can receive it. Together with the data from the message you also get some sequences to help you detect any gaps, which can be filled with direct get if you need. If you are mainly doing a tail -f style follow on a stream, this is a way to efficiently (no consumers) follow along to just the subjects you need, without wasting resources on the overhead consumers carry when your use case doesn't even need them. This will scale to many, many clients following your design. Just be careful to handle restarts or reconnect storms gracefully, with a cap on maximum concurrency.

nickchomey commented 3 weeks ago

Ah, ok, I think I understand the root of the issue now! The problem is with the weight of JetStream, not NATS as a whole.

Repub creates a normal/core nats pubsub subject rather than jetstream subject. That wasn't evident to me from the Repub doc https://docs.nats.io/nats-concepts/jetstream/streams#republish. Perhaps it could be improved for clarity, as well as have a link to the core nats subject documentation?

Also, perhaps this could be included in a performance/scalability-specific doc? Some general stats on core vs jetstream performance/scalability would be particularly helpful - I suspect many are inclined to use Jetstream by default, when maybe they should be more judicious with it.

Thanks very much! Direct get + Repub should work well for this use case!

nats-io / nats-server