dart-archive / vm_service_client

A Darty client for the VM service protocol
https://pub.dev/packages/vm_service_client
BSD 3-Clause "New" or "Revised" License
12 stars 19 forks source link

StreamManager drops "streamListen" response on the floor resulting in lost events #17

Open yjbanov opened 7 years ago

yjbanov commented 7 years ago

StreamManager drops "streamListen" response on the floor. This creates a race between any action resulting in an event and the enabling of the stream. If the action completes before the stream is enabled, the event is not reported on the stream.

This affects FlutterDriver, which resumes the isolate and waits for an onExtensionAdded event. The act of subscribing to the onExtensionAdded stream kicks off an invisible background "streamListen" call, which races isolate.resume(). When resume wins the race onExtensionAdded is notified, otherwise we're never notified and time out.

This is not urgent as we have a workaround, which is to send "streamListen" on a separate channel prior to the resume/waitForExtension handshake.

nex3 commented 7 years ago

I'm not sure how to address this. The Stream API doesn't really provide any way of dealing with a possible delay between the call to Stream.listen() and the actual production of events. I suppose we could block all other requests when a streamListen is in flight...

@turnidge Is there some way we can make the protocol itself resilient to this issue? For example, streamListen could have an optional startTime parameter, which would cause the service to replay and events that happened after that time.

yjbanov commented 7 years ago

Provided there is no delay between the call to Stream.listen() and the call to onListen, one possible solution is to provide an alternative set of getters that return Future<Stream<Map>> instead of Stream<Map>. The Future wrapper would only resolve once the call to streamListen is finished and therefore it is safe to call Stream.listen() and trigger the event production immediately after.

nex3 commented 7 years ago

I don't think Future<Stream> works—if we send streamListen before Stream.listen() is called, we put ourselves at high risk of leaking listens since we don't have a clear signal of when to cancel the listen.

yjbanov commented 7 years ago

What about providing explicit methods for enabling/disabling streams? In a sense registering a listener for receiving events is orthogonal to emitting the events. I may want to keep a subscription alive, while starting and stopping event stream multiple times during the lifecycle of the subscription.

nex3 commented 7 years ago

That goes against the broader semantics of the Stream class... listen()/cancel() are explicitly designated as the signals for allocating and releasing the resources necessary for emitting events on the stream.

Another way to express this would be to add Future get StreamSuscription.ready, which would signal when a subscription is actively emitting events. @lrhn what do you think?

lrhn commented 7 years ago

I'm not sure I understand this particular problem (since I don't know StreamManager or what it's trying to do), but in general ...

A stream has a very simple state - it's either unstarted, active, paused or cancelled, with clear transitions between them:

The corresponding controller callback methods report when such a change has happened, right after it has happened. The callbacks are called immediately after changing the state, so the onListen callback is called synchronously by the listen call after changing the stream's internal state from unstarted to active, there is no delay between them.

Obscure detail: If you change the stream's state during a state callback, you will get another callback immediately after the first one. If you change the state more than once, you only get a callback if the state actually changed - so if you do pause followed by resume inside the onListen callback, you won't get another callback, because the state is still the same (active) afterwards. In general, just don't do that.

Also, a stream should be completely inactive until it receives a listen call, and a sync stream shouldn't send events immediately in a state change callback, because that can be in the middle of any other code, and most likely is in the middle of code that actually cares about the stream.

It sounds like one of the problems is that you have a complex state that is updated asynchronously. That means that nobody actually knows the current state. That's why you get a race condition when two different parts of the program both try to make an update, based on the same state, because neither of them has seen the other. I don't think allowing them to inspect each other as well as the remaining state is a good solution in the long run. It doesn't scale. I'd prefer to make a single synchronous state management that everybody has to go through, so state-check and state-update always happen on the same state. Having an async delay between check and update will never work.

As for the ready getter, I'm not sure what it's supposed to do. A stream is actively emitting events when it emits events. Should it report that the stream is no longer paused? Then making it async seems like a very unsafe approach that would definitely cause race conditions. Remember that an async event can be delivered after an arbitrary amount of computation because the microtask queue is a queue. (I still think it should have been a stack, but that's another story 😄) We already have an isPaused getter that tells you whether the subscription is currently paused. It changes immediately when the stream is paused (if you read it inside onPause it's always true, and it's always false inside onResume").

In general, we assume that whoever has a reference to a StreamSubscription is in control of it. They should know whether they have called pause or cancel on the subscription. If you want to share the same stream subscription between different parts of your code, and want them to communicate about their shared use, then I'd wrap the subscription in a helper object and share that.

We could add an isCancelled getter and perhaps even an isDone getter on the StreamSubscription, but they would be synchronous. We definitely won't add an onCancel or onPause callback to the stream subscription in general - that would just be overhead on all the stream subscriptions that don't need them. That's an easy job for helper class that contains the subscription and shares its state in some way, not something every stream subscription needs

(I put a stream of stream state changes in your stream because I heard you like streams, so now you can listen while you listen!)

Likewise, I don't know what it means to directly enable/disable a stream. Streams are passive, stream subscriptions are where the action is. Stream subscriptions have pause/resume exactly so that they can temporarily stop the production of events during the life-time of a subscription, that sounds exactly what you are asking for. The problem with pause/resume is that it doesn't work well with broadcast streams (for good reasons) and broadcast streams are widely overused. I think the thing we really need is a multi-subscription stream that isn't broadcast, one where each subscription gets its own controller.

nex3 commented 7 years ago

The issue we have here is that the transition between unstarted and active takes time. The stream is being fed by an RPC protocol, so there's inherently a delay between when the client says "I want to receive these events" (by calling listen()) and when the server will actually send events to the client. If an event happens during that delay, it doesn't reach the client, even though it technically happened after the client called listen().

The ready getter would help address this by indicating when the stream is ready to fire events. Specifically:

I don't think this is something unique to this use-case. I suspect a similar race condition exists in any stream where work outside the Dart Isolate needs to be done to start it—such as FileSystemEntity.watch().

A stream is actively emitting events when it emits events.

This isn't what Yegor needs to know, though. He needs to know when a change in the underlying state will cause the stream to emit the corresponding event.

lrhn commented 7 years ago

From the stream's perspective, there is no time between unstarted and active. The state, and the behavioral promises, of a stream is all about that stream and the events that it emits. There is no problem, for the stream, in that it takes time between you calling listen and the first event arriving. There is no stream state to differentiate between "listened but not ready to send events" and "listened and ready to send events", because there is no distinguishable behavior - it just doesn't send events until it sends events.

The problem, if I understand it correctly, is that you have an asynchronous (remote) system feeding the stream with events. You need to send a remote request for events, and that request takes time. Likewise, a pause and resume cannot immediately affect the event source because it requires asynchronous communication to forward the pause/resume requests.

It sound to me like what you want is for listen, pause and resume to be able to be asynchronous. And that you want to recognize the transition states between the call and its completion.

The suggested ready getter is really reporting that an asynchronous listen operation (via the onListen callback) has completed, and it would be more consistent (with cancel) if the listen call just returned the future directly - for pause and resume that would just work, but listen already returns a value, so it would change to FutureOr<StreamSubscription>, which would be massively disruptive to current users.

Doing that would mean that state transitions would take time. There would be states "unstarted", "starting", "active", "pausing", "paused", "resuming", "cancelling" and "cancelled". There would need to be rules for what happens if you call pause while "starting" or while "resuming". Would events get emitted during "pausing"? During "cancelling"? It would really complicate the stream model.

It's not really a surprising development that someone wants a function parameter (here the onListen) to be able to be asynchronous. Asynchrony is contagious, and sooner or later, every function parameter will need to be asynchronous for someone. However, allowing it everywhere will make everything possible, and nothing easy, so it's not good for usability. We have to make sure it's needed often enough that it's worth the overhead that it introduces when it's not needed. The future returned by cancel is worth it, IMO, because it has no alternative, and because you can ignore it because it all happens in the "cancelled" state where no further state changes can happen anyway.

Now, for listen and resume, there is an obvious solution: Report their completion as a stream event. There is no need to introduce further futures in order to deliver an asynchronous event when you are already a stream. When the stream is listened to, its first event is an "I'm ready" event. When it's paused and then resumed, the first event after the resume is an "I'm ready" event. The only problem is the typing. If the report can be introduced into the event type of the stream, then it should be good. If not, then perhaps you can use null events as markers. If you really want a ready getter, you can wrap this stream in one that captures the "ready"-events and reflects them into a future.

I don't think it's something that every stream would want to support. The overhead, including the mental overhead for the user, is just too great.

nex3 commented 7 years ago

You're making this out to be a much further-reaching change than it needs to be. No one's calling for pause() and resume() to be asynchronous. No one wants the stream to behave in any way differently in the "starting" state than it does in the "active" state. I just want onListen to be able to return a Future that's exposed to the user somehow—the rest of the Stream API can totally ignore that Future for all I care.

We have to make sure it's needed often enough that it's worth the overhead that it introduces when it's not needed.

We already have two examples of APIs used by key customers that inherently involve race conditions without this—FileSystemEntity.watch() and vm_service_client. I expect Isolate.errors does as well. Any stream that represents events (as opposed to data) over an asynchronous medium will likely have the same issue. Is your position that a naive stream implementation is incorrect for those use-cases? If so, how did naive stream implementations end up in the core libraries?

Now, for listen and resume, there is an obvious solution: Report their completion as a stream event.

This is a seriously backwards-incompatible change. All code that interacts with the existing streams will have to be rewritten. If in the future stream authors fail to notice the possibility of this race condition—which seems likely given that you yourself wrote Isolate.errors without having it emit null—they will also have to make a backwards-incompatible change.

What's more, this sucks from a UI perspective. Every single user has to handle null in every single listener they write, whether or not they actually need to wait on it to modify the underlying state. And if they do need to wait, the code necessary to handle the first event on a stream different from all the others is gnarly and error-prone.

We have a clear customer need for an improvement in the platform, one that will make it safer to use a number of existing APIs. We should add Stream.ready.

lrhn commented 7 years ago

I'm perhaps making it further reaching than this particular case needs, but that's because I try to ensure that the APIs solve problems in a way that generalizes. A finely tuned solution for a particular problem is likely to be unusable for other problems. A kludge on the side of a class might solve the current problem, but it won't necessarily solve all problems, and we'll have to keep paying for it forever. Or, in other words, either this is a general problem, and then we should solve it properly and generally, or it's not, and then we shouldn't complicate our core classes with it, but perhaps make specialized classes for that particular problem.

In this case, it is indeed a case of onListen being asynchronous, so what is the most general, and most consistent, solution for allowing that? And is it something that happens often enough that all streams must support it?

I can see the problem - if you ask for a file-watcher, you might want to wait until it's ready before you begin doing stuff. You can't see that until you get an event, and you won't get an event until you do stuff. That sounds very much like an API that should return a Future, perhaps with a "file watcher" object that contains a live broadcast stream, not one where the return value is a plain Stream. We can't fix that API (well, not right now), but that doesn't mean that we should make all streams pay for a non-generalizing or inconsistent design in one place.

Putting a field on StreamSubscription for onListen is different from what we do for onCancel, and it doesn't generalize to pause/resume. That said, pause/resume are simple because their signatures could be changed to return a future (the only problem will then be for people who get lint warnings about unhandled futures), the listen call already returns a StreamSubscription.

I don't like the ready field. It's distinct from the listen call, so discoverability is low. It's very easy to miss it. On the other hand, it's also very easy to think you should care about it, even if 98% of all streams won't use it. It's not a good general API to have on StreamSubscription, and I don't think we should add it.

I think the most consistent and general change would be to change listen to return a FutureOr<StreamSubscription>, but it's also a horrible API that I wouldn't subject anyone to (using FutureOr in a covariant position in a user-facing API is bad, it means the user needs to check it every time). So, that's a no-go, even if all existing code would keep working, because nobody actually returns a future yet.

Maybe we could add a callback to listen: stream.listen(..., onReady: () { print("READY!"); }). It's a callback, not a future, which is annoying - futures are superior to callbacks in most ways, except for the fact that you have to return them somehow, and we only have single-value returns.

Then we could add an Future<StreamSubscription> asyncListen(...normal listen params ...) alternative to listen, but why would you know to use it for those particular streams where it makes sense, and not for any other stream. That actually goes for every solution mentioned so far - most streams won't need them, so how do you, as a user, figure out when you should use them?

That leaves ... no good general solution. Which might be the correct answer - it is not a general problem, so maybe the solution shouldn't apply to all streams.

We could create an AsyncInitStreamSubscription that has the ready future, and make those particular streams return one. We could even make an AsyncInitStream interface where listen returns AsyncInitStreamSubscription, and watch could return an AsyncInitStream, not a plain Stream. Or the AsyncInitStream could have the Future<StreamSubscription> asyncListen() method instead. And there would be an AsyncInitStreamController which allows onListen to return a future, or something similar (probably a factory constructor like AsyncInitStream.create(Future onListen(StreamSubscription))).

Either way, it's proper subclassing, everything keeps working if you ignore the ready future, and you either know that you have one, or you don't care. It also avoids overhead for all the other streams

I don't particularly like adding subclasses of Stream, because users need to worry about them and understand the differences, but this is a pretty innocuous difference. You only need to care about it if you actually get one returned, and then only if you want the extra feature.

One question to answer is whether an error in the onListen future should be reflected as a stream error or an error in the ready future. The most consistent will likely be to put it into the ready future, but that will make it an uncaught error if that future is ignored (unless we make it not-uncaught, which we can if we want to). We also have to figure out whether the stream should close itself again (a done event) after an error in the onListen future, whether the error is reported in the future or the stream.

@floitschG Ideas?