Open robinbraemer opened 4 years ago
Wondering why the client couldn't subscribe to the correct subject to receive the data it needed. Typically subject-based applications don't replicate data into different subjects. An example of why this would be useful would be helpful.
Say I have a common subject structure, like: notif.
If I want to send a message to a particular client, I publish to notif.001. If I want to publish to all clients, I can design the clients to subscribe to, say, a "broadcast" subject. But if I want to publish to notif.001 and notif.003 but not to notif.002, I have to publish twice.
Alternative: ask client 001 and client 003 to subscribe to a subject that is relevant only to them, such as notif001003. But this grouping needs to be known beforehand, rather than the publisher deciding dynamically at runtime.
That's the exact use case I face. @aricart
How big are your messages? I very often publish to 50k or more subjects and it’s fine. Takes very little time or resources. My pattern is very similar to yours.
What also sometimes makes sense is to send messages to all nodes and let them decide to ignore the message. I don’t know your use case, but for me this works very well, as my system lets me attach a filter.
My messages are very small, about 1k, and in my system the nodes also ignore messages whose target is not this node.
So unless you have a high-latency link or hundreds of thousands of messages, it will be fine. Are you experiencing issues or just suggesting an improvement? In my system I connect a publisher connection and receivers; publishing 50,000 messages takes less than a second.
I agree it’s a good feature to add but realistically we won’t be changing the protocol for quite a while to enable this
I just suggested an improvement/feature. :)
Is the membership for notifications totally dynamic (it could be any N from a set of M), or is there a pattern?
We need a similar feature for our cloud: we need some way of grouping the subjects to which a message will be sent. Something like:
Via the NATS API, we want to define the subjects to which messages will be distributed (add/remove destinations via the NATS client). By default, the destination is the same as the published subject.
@derekcollison What do you think?
We have account subject mappings which allow traffic shaping that could be adjusted to allow sending to multiple subjects.
mappings = {
foo: [ { dest: bar, weight: 40% }, { destination: baz, weight: 20% } ]
}
So in the above we send to foo, which will be sent to bar 40% and baz 20%.
Currently these must not add up to more than 100%. The total can be less, to introduce loss for chaos-monkey-style testing.
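To make the weighted-mapping semantics concrete, here is a small Go sketch of the behavior as described in this comment. It is not the server's implementation: `Dest`, `Validate`, and `Pick` are hypothetical names, and the roll is passed in explicitly (a caller could supply something like `rand.Intn(100)`) so the logic stays deterministic.

```go
package main

import "errors"

// Dest is one weighted destination; Weight is a percentage from 0 to 100.
type Dest struct {
	Subject string
	Weight  int
}

// Validate enforces the rule stated above: the weights for one source
// subject must not add up to more than 100.
func Validate(dests []Dest) error {
	total := 0
	for _, d := range dests {
		if d.Weight < 0 || d.Weight > 100 {
			return errors.New("weight must be between 0 and 100")
		}
		total += d.Weight
	}
	if total > 100 {
		return errors.New("weights add up to more than 100%")
	}
	return nil
}

// Pick maps a roll in [0, 100) to a destination. A roll beyond the summed
// weights selects nothing, which models the intentional message loss.
func Pick(dests []Dest, roll int) (string, bool) {
	upper := 0
	for _, d := range dests {
		upper += d.Weight
		if roll < upper {
			return d.Subject, true
		}
	}
	return "", false
}
```

For the 40%/20% example, rolls 0 to 39 select bar, rolls 40 to 59 select baz, and rolls 60 to 99 drop the message. Each message goes to at most one destination, so under this rule a mapping cannot fan one message out to several subjects.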
We could allow something like this.
mappings = {
foo: [ { dest: bar, weight: 100% }, { destination: baz, weight: 100% } ]
}
or more simply..
mappings = {
foo: [ bar, baz]
}
Mappings are changeable with a server reload or JWT updates, but we probably do not expect a lot of changes. Would this possibly work?
Hello @derekcollison
I would like to explain a bit more about what we are doing with @jkralik and what challenges we're currently facing. We are building https://github.com/plgd-dev/cloud, an open-source IoT system, using NATS as the messaging system. Our subjects are organized as follows:
events.devices.{deviceID}.resource-links.{eventType}
events.devices.{deviceID}.metadata.{eventType}
events.devices.{deviceID}.resources.{resourceID}.{eventType}
Let's assume we have a deployment with a few users, each of whom has many devices (10000). If a user is interested in websocket/grpc stream notifications from all of their devices, we check in our authorization service which devices belong to this user and start 10000 subscriptions (as they are organized per deviceID). Do you see an issue with this approach, such as wasted resources? Is it worth improving?
We think it would make sense to also organize subjects per user ID. But there is another overhead linked to doing the n+1 publish. If one device is shared with 1000 other users, the service would have to publish data to events.devices.{deviceID} as well as to events.users.{userId-1..1000}.devices.{deviceId}. Do you agree? Is it even good to organize subjects in such a way?
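The n+1 publish overhead can be seen by listing the subjects a single event would need under this layout. The Go sketch below is only an illustration; `SubjectsFor` is a hypothetical helper, and the subject templates are the ones given in the comment.

```go
package main

// SubjectsFor returns every subject one event would have to be published to
// under the per-user layout discussed above: the device subject plus one
// subject per user the device is shared with (n+1 subjects for n users).
func SubjectsFor(deviceID string, userIDs []string) []string {
	subjects := make([]string, 0, len(userIDs)+1)
	subjects = append(subjects, "events.devices."+deviceID)
	for _, userID := range userIDs {
		subjects = append(subjects, "events.users."+userID+".devices."+deviceID)
	}
	return subjects
}
```

For a device shared with 1000 users this yields 1001 subjects, and therefore 1001 publishes of the same event.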
What could solve this n+1 publish overhead is automated mapping of subjects in the NATS server based on the subscription. That is, the subscriber has its JWT token, which we have from the grpc/websocket connection (northbound client subscription). Based on a value from the token (e.g. the sub claim), the NATS server could create a mapping from events.devices.{deviceId-1..10000}.> to events.users.{userId}.>. If the subscriber stops the subscription, the mapping could be removed.
But still, the question is whether this optimization makes sense and whether our thinking is going in the right direction.
Thank you
(We need to keep the events.{deviceID} subjects for southbound systems doing data mining, which are not aware of users.)
There is a bit of an art form to designing subject ontologies. You should, imo, design the system to publish events once, but I would need to dig in a bit more to offer any guidance.
FYI NATS supports Websockets directly, so no need for websocket/grpc, just use NATS ;)
You can control permissions to which users can access which subjects etc, you could possibly link these to a cross list of subjects that a user has access to.
Meaning I would focus on the interest graph that is represented by each user when they log in, the set of subscriptions as you mention above.
@derekcollison sure, NATS provides nice features, but the internal messaging design (contract and subject organization) shouldn’t in any case affect public API evolution driven by the business use cases. So for me this tight coupling between northbound and southbound interfaces is not very acceptable.
Agree that the system should be designed so that each event exists only once in the system, in one subject. That follows the approach of modeling events around the entity they represent. Is this the right approach? Or should you model subjects around their subscribers?
How expensive is a subscription, @derekcollison? If you have one client, how much does it matter whether we subscribe to 1 or 1000 subjects?
You can control permissions to which users can access which subjects etc, you could possibly link these to a cross list of subjects that a user has access to.
But which user has access to which device is dynamic. This can change during a subscription. So we need either to inform our publishing service to start publishing the data (the same event) to another user topic, or to reconfigure the NATS mapping dynamically.
For example, I am subscribed to notifications from all of my devices, but another user has just shared his own device with me. This information is published to another topic, on which our publisher would be listening, and it would start to publish data from this new device to that subject. Or NATS would be reconfigured to route it to a different subject as well, dynamically.
But imho, it shouldn’t be organized around users at all.
Subscriptions are lightweight, so not an issue from that standpoint.
I would need quite a bit more information to form a complete opinion and suggestion on how to architect.
@derekcollison as discussed on slack, I'd just like to add my use-case here. You proposed a solution above on Sep 29, 2021 that would be very useful by allowing mapping to multiple subjects. In this case static server config would be all that is needed.
The use-case is that messages from IoT devices are grouped and users subscribe to the group instead of individual devices. Users don't know ahead of time what devices are in their group. Groups and memberships are managed by an administrator outside of NATS and converted into NATS configuration.
When IoT devices publish their events, these events are mapped to a group subject based on the groups they are a member of. For example, 3 devices and 2 groups:
Event mapping:
Users are allowed to subscribe to the group if they have group permissions. This should also work with streams where a stream would be for a group subject. Stream group1 would have subject "group1.>".
Alternatively, an even better approach would be to allow streams to overlap subscription. I don't quite understand why that is currently not allowed but it would be handy to define streams that can contain the same subject. In this approach a stream would be the group and the stream subscriptions are the devices that are a member of the group.
You are referring to cumulative weighted mappings above 100%, correct?
Yes
ok will see if we can do something for 2.10. We still have a bit of work to do for it already for existing customers and we are behind schedule as well.
I very much appreciate it!
You can already do something like this in 2.10 by using the new stream sourcing features that it has. You can for example have one stream that gets all the messages from the devices and then you create a stream per group (that sources from that initial stream). You can then easily control access to each of those group streams.
@jnmoyne that is very interesting and sounds like it should work. I'll give it a try.
@jnmoyne When creating a stream with multiple StreamSource on the same source stream and different FilterSubjects, the second StreamSource is ignored.
When testing this configuration:
Then only events.device1.temperature is received. If I swap the order then only device2 events are received. Looks like the source stream ('events') cannot overlap. This is with nats-server-2.9.18 and nats-go-1.27. You did mention 2.10 however, does that mean this only works for nats-server-2.10?
ps: I feel like I'm hijacking this thread. Should this be a separate issue?
edit: I tried with the latest 'main' branch but the same behavior occurs. No error is reported though. It looks like it is not possible to have a stream source multiple subjects from the same stream, or have a stream subscribe to overlapping subjects, or have a single subject map to multiple different subjects (yet).
Yes you need 2.10 for this to work (ie try with the current top of the dev branch). You also need to use the 2.10 branch of the nats.go client library.
@jnmoyne @derekcollison It works. Totally awesome! Thank you both so much for your help. I'll do my dev on these branches until 2.10 is released.
Feature Requests
I think it would improve usability, efficiency and latency if we could publish bytes of data to multiple subjects in one request and have the NATS server spread the message out to all targeted subjects, instead of uploading the data to each single subject.
Use Case:
obvious
Proposed Change:
No changes but additions.
Who Benefits From The Change(s)?
Alternative Approaches
Publish data to all subjects in a loop (more traffic, more latency, more error-prone)