nats-io / nats-server

High-Performance server for NATS.io, the cloud and edge native messaging system.
https://nats.io
Apache License 2.0

Provide a more granular API for stream management (resolve concurrent updates) #2746

Open OpenGuidou opened 2 years ago

OpenGuidou commented 2 years ago

Feature Request

Use Case:

With the current API, dynamically updating a stream at application runtime requires sending the full stream configuration. If several application instances update the same stream at the same time, the last update received overwrites the earlier ones.

My real-life use case is an application that needs to dynamically add and remove subjects from a stream. If several instances of this application run in parallel at a high message rate, their stream updates can conflict.
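The lost update described above can be reproduced with a minimal in-memory sketch. `FakeStreamServer` is a hypothetical stand-in, not the NATS API; it only assumes that an update replaces the whole configuration, as the issue states.

```python
import copy


class FakeStreamServer:
    """Hypothetical in-memory server that only accepts full-config updates."""

    def __init__(self, config):
        self._config = copy.deepcopy(config)

    def get_config(self):
        return copy.deepcopy(self._config)

    def update(self, full_config):
        # The whole configuration is replaced; nothing is merged.
        self._config = copy.deepcopy(full_config)


server = FakeStreamServer({"name": "ORDERS", "subjects": ["orders.a"]})

# Two application instances read the same starting configuration...
cfg_1 = server.get_config()
cfg_2 = server.get_config()

# ...and each adds its own subject locally.
cfg_1["subjects"].append("orders.b")
cfg_2["subjects"].append("orders.c")

server.update(cfg_1)
server.update(cfg_2)  # last writer wins: "orders.b" is silently dropped

final_subjects = server.get_config()["subjects"]
print(final_subjects)
```

Neither instance sees an error, which is what makes this kind of conflict hard to notice in production.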

Proposed Change:

Provide a more granular stream change API. For example: add a subject to a stream, remove a subject from a stream, or change any single part of the stream configuration without providing the full configuration.
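To illustrate what such an API might look like, here is a sketch of the idea only. `GranularStreamServer`, `add_subject`, and `remove_subject` are hypothetical names and do not exist in NATS; the point is that each operation is applied atomically on the server side, so concurrent callers cannot overwrite each other.

```python
import threading


class GranularStreamServer:
    """Hypothetical server exposing per-subject operations (not a NATS API)."""

    def __init__(self, subjects):
        self._subjects = set(subjects)
        self._lock = threading.Lock()

    def add_subject(self, subject):
        # Each change is applied atomically to the current state.
        with self._lock:
            self._subjects.add(subject)

    def remove_subject(self, subject):
        with self._lock:
            self._subjects.discard(subject)

    def subjects(self):
        with self._lock:
            return sorted(self._subjects)


granular = GranularStreamServer(["orders.a"])

# Three instances add subjects concurrently; none of the additions is lost.
threads = [
    threading.Thread(target=granular.add_subject, args=(f"orders.{suffix}",))
    for suffix in "bcd"
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

granular.remove_subject("orders.a")
result = granular.subjects()
print(result)
```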

Who Benefits From The Change(s)?

JetStream API users

Alternative Approaches

An alternative approach (more of a workaround) would be to route all stream updates through a dedicated service on the application side. But this would create a potential bottleneck.

gedw99 commented 2 years ago

This would be the answer to my prayers!!!

The ability to modify the system at runtime and to get an event back describing the change. Over NATS, ironically, essentially like a control plane.

Is this synergistic with your proposal? I sort of extended and embellished on top of it.

ripienaar commented 2 years ago

You can update it at runtime and change the configuration, but doing it super frequently can cause concurrent change issues.

So the question is why you would need to update it that frequently. You say that's what your application needs, but why? Why would, for example, a wildcard subject space not also solve the problem?

OpenGuidou commented 2 years ago

The use case would be an extended "replyTo" over a microservices architecture.

The goal would be to reconcile a REST endpoint with a galaxy of microservices using JetStream. In this picture the galaxy is simplified to a single "App 1": (image: Untitled Diagram drawio)

The REST server adds a dynamic subject to the stream to wait for the message it wants for the current request, and somehow provides that subject in the message on subject 1 so that it is propagated all the way to the application that will return the final answer.

I understand this could probably also be done by putting a UUID in some part of the "reply" subject and creating a subscription with a matching filter.

I took this as an opportunity to propose this feature, as it would also allow applications to add and remove subjects on a stream without first having to retrieve the entire current configuration from the server each time just to include the other required fields.

ripienaar commented 2 years ago

Very frequent stream configuration updates are definitely an anti-pattern, so I would rather find a different solution tbh. I would much rather make the stream in question listen on, let's say, replies.app.> and set a header in the messages telling it where replies should go.

However, in this use case, there's no need for JetStream. JetStream is there to provide temporal decoupling; here you have a temporally coupled service. Just use core NATS for that.
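As a rough illustration of why a wildcard subject removes the need for configuration updates: in NATS subjects, `*` matches exactly one token and `>` matches one or more trailing tokens. The matcher below is a simplified sketch of those rules, not the server's implementation, and the `X-Reply-To` header name is a made-up example.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Simplified NATS-style subject matching for '*' and '>' wildcards."""
    pattern_tokens = pattern.split(".")
    subject_tokens = subject.split(".")
    for index, token in enumerate(pattern_tokens):
        if token == ">":
            # '>' is only valid as the last token and needs at least
            # one remaining subject token to match.
            return index == len(pattern_tokens) - 1 and len(subject_tokens) > index
        if index >= len(subject_tokens):
            return False
        if token != "*" and token != subject_tokens[index]:
            return False
    return len(pattern_tokens) == len(subject_tokens)


# One fixed stream subject covers every per-request reply subject.
stream_subject = "replies.app.>"

# Hypothetical message: the header name is an assumption for illustration.
message = {
    "subject": "replies.app.req42",
    "headers": {"X-Reply-To": "rest.instance7"},
    "data": b"final answer",
}

print(subject_matches(stream_subject, message["subject"]))
```

With this layout the stream configuration is written once, and each request only chooses a new final token under `replies.app.`.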

OpenGuidou commented 2 years ago

If I understand correctly JetStream also provides resiliency with replication in case of disaster (one server instance goes down). That's the only feature that made me consider JetStream for this use case, but I'm also considering both alternatives you're proposing (nats with dynamic subjects or single subject with header in message and filter in subscription).

If massive stream updates is a bad idea, I'll just cross off this solution :smile:

gedw99 commented 2 years ago

Thanks. Learned a lot from that.

I have thousands of workers, with new ones starting up every second. They each need to be part of a data flow that changes every few seconds. So lots of flux.

So worker 02 gets inputs from worker 01. Worker 02 outputs to worker 03. And so on. Which worker is connected to which changes every few seconds.

A group of workers can use the same stream and just use wildcards within that stream, because that grouping is their own security context.

Between groups of workers, and hence between streams, I need to late bind and so need JetStream. The same semantics that hold between workers hold between worker groups too. The security between worker groups needs to be asserted.

I then need to create new streams for each worker group thousands of times a second as the system reconfigures itself. Like a control plane. The engine deciding the changes is a recurrent AI that learns from feedback on the results. It then reconfigures the worker mapping and the worker group mapping as it pleases.

Any advice on this?

ripienaar commented 2 years ago

> If I understand correctly JetStream also provides resiliency with replication in case of disaster (one server instance goes down). That's the only feature that made me consider JetStream for this use case, but I'm also considering both alternatives you're proposing (nats with dynamic subjects or single subject with header in message and filter in subscription).

If a server goes down, connected clients have to disconnect, reconnect elsewhere, and then continue along their way - this does not change with JetStream. At worst, if the left-hand side does not get a reply within a timeout, it just sends the request again - and if your App 1 is scaled horizontally in a queue group, that gives you reliability as well.

What you're describing is literally the sweet spot for NATS without JetStream; it's what it does best. Because that problem is already extremely well solved, JetStream provides additional, non-request-reply features. That is why you are finding it a bit awkward to use for something like this.
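The "send the request again after a timeout" approach can be sketched as follows. `request_with_retry` and `flaky_request` are hypothetical helpers; `send_request` stands in for whatever client call performs a request with a timeout and raises `TimeoutError` when no reply arrives.

```python
def request_with_retry(send_request, payload, attempts=3):
    """Call send_request(payload), retrying when it times out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return send_request(payload)
        except TimeoutError as error:
            # No reply in time, for example during a server failover.
            last_error = error
    raise last_error


# Stand-in responder: the first call times out, the second succeeds.
calls = {"count": 0}


def flaky_request(payload):
    calls["count"] += 1
    if calls["count"] == 1:
        raise TimeoutError("no reply in time")
    return b"reply to " + payload


reply = request_with_retry(flaky_request, b"order-1")
print(reply, calls["count"])
```

Because a retried request may reach a responder more than once, this pattern works best when the handling on the responder side is idempotent.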

> If massive stream updates is a bad idea, I'll just cross off this solution 😄

However, having said all of the above, simply using wildcards will solve the problem completely for you and avoid all the updating of subjects etc.; it's entirely unnecessary. Still, using request-reply as per core NATS is much, much more efficient for this use case.

OpenGuidou commented 2 years ago

I meant resiliency in terms of no messages being lost during a server failure and the reconnect to another server.

For core NATS request/reply, the only catch is that if we see App 1 not as a single application but as a galaxy of microservices talking to each other, we need to forward the initial "replyTo" through all of them so that the one that eventually replies knows where to send its answer.
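A minimal in-memory sketch of forwarding the initial "replyTo" through a chain of services. The `Bus` class, the subject names, and the `X-Reply-To` header are assumptions for illustration and are not part of any NATS client.

```python
from collections import defaultdict


class Bus:
    """Hypothetical synchronous in-memory publish/subscribe bus."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, subject, handler):
        self._handlers[subject].append(handler)

    def publish(self, subject, data, headers=None):
        for handler in list(self._handlers[subject]):
            handler(data, dict(headers or {}))


bus = Bus()
REPLY_HEADER = "X-Reply-To"  # assumed header name


def service_a(data, headers):
    # Intermediate service: does its work and forwards the headers untouched.
    bus.publish("step.b", data + ">a", headers)


def service_b(data, headers):
    # Final service: answers on the subject chosen by the original requester.
    bus.publish(headers[REPLY_HEADER], data + ">b")


bus.subscribe("step.a", service_a)
bus.subscribe("step.b", service_b)

received = []
reply_subject = "replies.rest.req42"
bus.subscribe(reply_subject, lambda data, headers: received.append(data))

bus.publish("step.a", "req", {REPLY_HEADER: reply_subject})
print(received)
```

Only the first and last participants need to understand the header; every service in between just has to pass it along.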

derekcollison commented 2 years ago

Distributed Queue Groups may help here with core NATS.

https://docs.nats.io/nats-concepts/core-nats/queue#queue-groups
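As a rough picture of queue group semantics: each message published to the subject is delivered to only one member of the group, chosen at random. The `QueueGroup` class below is a simplified in-memory simulation, not a NATS client, and the worker names are made up.

```python
import random
from collections import Counter


class QueueGroup:
    """Simplified simulation: each message goes to one random group member."""

    def __init__(self, members, rng=None):
        self._members = list(members)
        self._rng = rng or random.Random()

    def deliver(self, message):
        member = self._rng.choice(self._members)
        member(message)


handled = Counter()


def make_worker(name):
    def worker(message):
        handled[name] += 1

    return worker


# Three horizontally scaled instances of "App 1" share one queue group.
group = QueueGroup(
    [make_worker(f"app1-{index}") for index in range(3)],
    rng=random.Random(0),
)

for number in range(100):
    group.deliver(f"request-{number}")

total = sum(handled.values())
print(total, dict(handled))
```

Every request is handled exactly once across the group, so instances can be added or removed without any change to the subject configuration.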