dotnet / orleans

Cloud Native application framework for .NET
https://docs.microsoft.com/dotnet/orleans
MIT License

Streaming questions #4198

Open attilah opened 6 years ago

attilah commented 6 years ago

This is not an issue, but a set of questions. The answers could be useful for a streaming FAQ:

galvesribeiro commented 6 years ago

The streaming implementation does not fan out to all subscribers; instead it calls and awaits them one by one (unless fire and forget is specified). To me, fire and forget is a misleading name, but the documentation states what it does. Perhaps collecting all the Tasks and then calling Task.WhenAll would be better. I could not see the reasoning behind the current behavior; is there one?

IIRC that only happens with SMS.

Am I right that the serialization of the OnNext payload happens for every subscriber and not just once? For multiple subscribers it could be a great overhead.

I think it is true. Someone complained about it a while ago. I don't remember if on an issue or on Gitter tbh...

xiazen commented 6 years ago

The streaming implementation does not fan out to all subscribers; instead it calls and awaits them one by one (unless fire and forget is specified). To me, fire and forget is a misleading name, but the documentation states what it does. Perhaps collecting all the Tasks and then calling Task.WhenAll would be better. I could not see the reasoning behind the current behavior; is there one?

I believe SMS fans out to all subscribers through a Task.WhenAll if fireAndForget is turned off, but persistent streams do not. For persistent streams there are complications, because they support per-subscription customization, such as stream filters and subscribing to older items through a sequence token. So for persistent streams we cannot do a simple fan-out, because some items are not supposed to be sent to some consumers but should be sent to others. I'm not saying optimization cannot be done here; I'm just providing some context for your suggestion. Are you experiencing performance issues with this behavior?
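To make the difference concrete, here is a minimal sketch in Python's asyncio rather than Orleans itself. asyncio.gather stands in for Task.WhenAll, and every name in it (Subscriber, Tracker, deliver, the publish_* functions) is hypothetical and not part of any Orleans API. It only illustrates the three shapes discussed above: awaiting subscribers one by one, a plain concurrent fan-out, and a fan-out that first applies a per-subscription filter.

```python
import asyncio


class Subscriber:
    """Hypothetical consumer with an optional per-subscription filter."""

    def __init__(self, name, accepts=lambda item: True):
        self.name = name
        self.accepts = accepts
        self.received = []


class Tracker:
    """Records how many deliveries were in flight at the same time."""

    def __init__(self):
        self.in_flight = 0
        self.peak = 0


async def deliver(sub, item, tracker):
    tracker.in_flight += 1
    tracker.peak = max(tracker.peak, tracker.in_flight)
    await asyncio.sleep(0)  # yield to the event loop; stands in for a remote call
    sub.received.append(item)
    tracker.in_flight -= 1


async def publish_sequential(subs, item, tracker):
    # Call and await each subscriber before moving on to the next one.
    for sub in subs:
        await deliver(sub, item, tracker)


async def publish_fan_out(subs, item, tracker):
    # Start all deliveries, then wait for all of them (the Task.WhenAll shape).
    await asyncio.gather(*(deliver(sub, item, tracker) for sub in subs))


async def publish_filtered(subs, item, tracker):
    # Per-subscription customization: only some subscribers get this item.
    targets = [sub for sub in subs if sub.accepts(item)]
    await asyncio.gather(*(deliver(sub, item, tracker) for sub in targets))
```

In this sketch the sequential version never has more than one delivery in flight, while the fan-out versions have one per targeted subscriber. The filtered version shows why the set of recipients may have to be computed per item before any fan-out can happen.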

xiazen commented 6 years ago

For ImplicitStreamSubscriptions there is a great amount of housekeeping that the developer must do to subscribe to its own stream, unsubscribe, resume and such. Can it be enhanced so that this is done by the runtime and not by the developer? It is easy to miss, and there is no sample code on how to do streaming (right).

Can you be more specific about what housekeeping is bothering you? A good example can be seen in #3917, which brought up the annoying process of getting a stream reference, and asked why the grain needs to know the stream provider name to get a stream reference; it would be good to be able to get the stream reference by just its ID. More concrete pain points will help us understand the problem.

xiazen commented 6 years ago

Am I right that the serialization of the OnNext payload happens for every subscriber and not just once? For multiple subscribers it could be a great overhead.

I think the payload is serialized just once, before the items are sent to the queue as a batch. What makes you think they are serialized per subscriber? Do you mean deserialization?
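The two behaviors being contrasted can be sketched as follows. This is a small Python illustration using json as a stand-in serializer, not Orleans code; CountingSerializer and both publish functions are hypothetical names, and the sketch makes no claim about which path a given stream provider actually takes.

```python
import json


class CountingSerializer:
    """Hypothetical serializer that counts how often it is invoked."""

    def __init__(self):
        self.calls = 0

    def serialize(self, payload):
        self.calls += 1
        return json.dumps(payload).encode("utf-8")


def publish_per_subscriber(serializer, payload, inboxes):
    # Serializes the same payload again for every subscriber.
    for inbox in inboxes:
        inbox.append(serializer.serialize(payload))


def publish_serialize_once(serializer, payload, inboxes):
    # Serializes once and hands the same bytes to every subscriber.
    data = serializer.serialize(payload)
    for inbox in inboxes:
        inbox.append(data)
```

With three subscribers the first function calls the serializer three times and the second calls it once. Each consumer would still deserialize its own copy on receipt, which may be the per-subscriber cost that is actually being observed.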

darthkurak commented 5 years ago

I wonder if I can join this discussion, as I also have a few questions about streaming.

  1. ImplicitSubscription works by activating grains that have an implicit subscription to the namespace, using the IDs of the streams in that namespace. That means the producer has to know the identities of the consumers, so it seems more like Notify than Subscribe. That way we cannot do a full pub-sub scenario, where the producer just pushes events to the stream and all grains that want to read from it subscribe implicitly. I assume this derives from the Virtual Actor concept: Orleans doesn't know how many instances (unique IDs) of a grain exist, so it can't activate all of them for that namespace. I wonder how we can work around this?

  2. The second thing is the Predicate in ImplicitSubscription. It means that the grain will be activated for all namespaces that meet the Predicate condition. But when the grain activates, how can it know which namespace triggered the activation, so that it can explicitly subscribe a handler (as has to be done) in the grain code?

@sergeybykov Could you help here? :)

sergeybykov commented 5 years ago

I wonder if I can join this discussion

Everyone is always welcome to join. 😊

That means the producer has to know the identities of the consumers, so it seems more like Notify than Subscribe. That way we cannot do a full pub-sub scenario, where the producer just pushes events to the stream and all grains that want to read from it subscribe implicitly.

I'd suggest looking at implicit subscriptions the following way. Consider stream ID (a tuple of a namespace and a GUID) a topic. Any producer can produce events to the topic. Zero or more consumer grain classes can implicitly subscribe to all topics with the same namespace (with topic GUIDs mapping to instances of each grain class). Producers don't need and cannot know directly how many consumers are subscribed. So it's a semi-static pubsub where new subscribers (grain classes) can only be added by deploying a new version of the app.
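The topic model described above can be sketched as a toy simulation. This is plain Python, not the Orleans runtime: StreamRuntime, ConsumerGrain, InventoryGrain and AuditGrain are hypothetical names, and a string key stands in for the stream GUID. It assumes only what the paragraph states: grain classes subscribe to a namespace, and the key of each stream selects the instance of each subscribed class.

```python
class ConsumerGrain:
    """Hypothetical consumer; one instance exists per (class, key) pair."""

    def __init__(self, key):
        self.key = key
        self.events = []

    def on_next(self, namespace, event):
        self.events.append((namespace, event))


class InventoryGrain(ConsumerGrain):
    pass


class AuditGrain(ConsumerGrain):
    pass


class StreamRuntime:
    """Toy runtime: namespace -> subscribed grain classes, fixed at 'deploy' time."""

    def __init__(self):
        self.implicit = {}      # namespace -> list of grain classes
        self.activations = {}   # (class name, key) -> grain instance

    def implicit_subscription(self, namespace, grain_class):
        self.implicit.setdefault(namespace, []).append(grain_class)

    def publish(self, namespace, key, event):
        # The producer only names the topic; it never sees the consumers.
        for grain_class in self.implicit.get(namespace, []):
            ident = (grain_class.__name__, key)
            grain = self.activations.get(ident)
            if grain is None:
                # Activate the instance whose key equals the stream key.
                grain = self.activations[ident] = grain_class(key)
            grain.on_next(namespace, event)
```

Publishing to namespace "ITEM" with key "ITEM_1" activates one instance of every subscribed class, each keyed "ITEM_1". In this sketch the consumer's identity always comes from the stream key, never from the consumer's own naming scheme.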

The second thing is the Predicate in ImplicitSubscription. It means that the grain will be activated for all namespaces that meet the Predicate condition. But when the grain activates, how can it know which namespace triggered the activation, so that it can explicitly subscribe a handler (as has to be done) in the grain code?

The grain class can implement IStreamSubscriptionObserver. Then it will be called for every new stream it is about to receive events for. So it can figure out exactly which stream has activated it by looking at the argument passed to the OnSubscribed() method. I just realized this feature isn't documented anywhere. 😊

darthkurak commented 5 years ago

Thanks for the tip about IStreamSubscriptionObserver; I will check it. But going back to the first question, let's take an example, regardless of whether it is a realistic case. Assume we have an ITEM grain. Each item publishes an event about its changed quantity to the stream with namespace "ITEM" and stream ID "ITEM ID". Then we have an INVENTORY grain, and assume our system has two or more instances of INVENTORY (for some reason). If we implicitly subscribe to the ITEM namespace, we will activate INVENTORY grains with IDs ITEM_1, ITEM_2, etc. But our INVENTORY grains have IDs INVENTORY_1, INVENTORY_2, and we would like to activate those and subscribe them to all streams in that namespace (ITEM). With the current behavior we cannot do that (implicitly).

That's why I'm writing that the producer has to know the consumers, so that it can create stream IDs equal to the consumers' IDs and notify them correctly.

Even if I think of similar behavior using explicit subscriptions, I can't figure out how to do that. For example, I could create the subscription explicitly when creating an inventory, but even then, what will trigger the inventories once they have been deactivated?

The best would be for [ImplicitSubscription(namespace)] to activate all underlying "instances" of that grain for that namespace. I put "instances" in quotes because I know that Orleans doesn't know anything about instances. And that's the problem I can't completely figure out how to work around.

I hope this better illustrates my concerns. :)