Open dmitryelagin opened 4 years ago
cc @lrhn
This is a thorny area of the async behavior.
Futures completing doesn't necessarily use the microtask at all. One future completing can easily cascade into a number of other futures completing without going back to the microtask queue. As such, intercepting the microtask queue isn't guranteed to prevent futures from completing. If you don't look, the code works as if it was completing the futures in separate microtasks (they are non-interleaved and happening only when another microtask could happen), but if you inspect the microtask queue, then you can see that it's not being used for all of the completion events. There is a "future propagation loop" running, usually inside a single microtask.
It's also a hard to figure out which zone a microtask is scheduled in, and which it should be scheduled in.
If a future completed in zone A, and it has a listener which expects to run in zone B, should the future schedule itself in zone A or zone B (or the root zone, for good measure)?
Picking the listener is tricky, because if the future has two listeners from different zones, B and C, which one should it then schedule itself in? It only needs one schedule because it can, and does, complete both listeners in the same go. SO the choice between listeners is guaranteed to be arbitrary. That actually suggests to use the future's own (A) zone. However, there is no reason for a future to retain its zone after it has completed with a value, and I'd like to avoid that, so maybe using the A zone isn't the best choice anyway. I'm willing to use the root zone here, which definitely means that you won't be able to prevent the microtask.
Currently I believe we always use the A zone.
I am willing to look at this (read: I have been looking at it), but I haven't found a clearly superior approach yet.
tl;dr: Don't expect intercepting the microtask queue to allow you to control futures. It may or it might, and which one it is will depend on the accidental ordering of events.
@lrhn, thank you for your response! Still, I have some additional thoughts on this topic.
Although I understood the reasons for scheduling once and in the Future's own zone, it is still a very confusing behavior for users. The simplest found way to fix an issue is to wrap resolved Future with Future.value(resolvedFuture);
:
import 'dart:async';
import 'package:test/test.dart';
void noop([Object _]) {}
void main() {
group('Future should', () {
int schedulesCount;
Future<int> resolvedExternalFuture;
final externalZone = Zone.current;
final handledZone = Zone.current.fork(
specification: ZoneSpecification(
scheduleMicrotask: (source, parent, zone, f) {
schedulesCount += 1;
parent.scheduleMicrotask(zone, noop);
f();
},
),
);
setUp(() {
schedulesCount = 0;
externalZone.run(() {
resolvedExternalFuture = Future.sync(() => null);
});
});
test('schedule microtasks in handled zone', () {
handledZone.run(() {
// This will be registered in handled zone but will be scheduled
// in external zone which is confusing and will not increment counter
resolvedExternalFuture.whenComplete(() {
print('external: scheduled');
});
// This is a workaround to force the callback to be scheduled
// within handled zone
Future.value(resolvedExternalFuture).whenComplete(() {
print('workaround: scheduled');
});
});
expect(schedulesCount, 1);
});
});
}
The response will be:
workaround: scheduled
external: scheduled
✓ Future should schedule microtasks in handled zone
Exited
However, this can't be used, when the problem begins and happens in external code and before I can somehow influence it. This leads me to think that the root of the problem is the Future static constants. In the case of the first test of thread - Future._nullFuture
.
The nullFuture
is made before any code has a chance to make Future or Stream, and that's why it is bound to root zone out of the box. But it means that I can get Future with root zone even when all of my Futures or Streams were made and processed in my single custom zone. This fact is pretty confusing and leads to behavior from the first test.
IMO, something like _Future<Null>.value(null);
instead of nullFuture
(in places like this) can completely resolve the whole special case of this thread.
Not using a reusable nullFuture
will definitely change some things. Maybe not to what you expect (you never know what zone the plumbing code that propagates future results run in).
It would make optimizations like https://github.com/dart-lang/sdk/blob/master/sdk/lib/async/stream_impl.dart#L211 impossible (or harder), though.
@lrhn
It's also a hard to figure out which zone a microtask is scheduled in, and which it should be scheduled in.
It is true that the Zone
API is very flexible and you can write code where it is indeed hard to figure out what runs in which zone. However, without any discipline zones become quite useless. Currently, most major frameworks - Flutter, AngularDart, Rx, Quiver - rely on the following two properties:
The first property seems to hold in Stream.listen(onDone)
. However, it violates the second property.
This is very surprising given that Future.then
, Future.delayed
, Time.run
, scheduleMicrotask
, Stream.listen
, Window.onDrawFrame
(or any Window.on*
callback in Flutter), and many more all uphold these properties. This allows them to be combined and chained in arbitrary ways without leaving the zone.
We use this to implement a number of useful features:
WidgetTester
creates a fake timeline (using Quiver's FakeAsync
) allowing you to run tests that span many seconds in logical time in milliseconds of real time. You can even write a widget test that animates for 10 years and run that test in less than a second.NgZone
will result in UI to stop updating.It is therefore very surprising to see Stream.listen(onDone)
to kick you out into the root zone. I think this is a serious bug.
Here are some bug reports associated with this issue:
https://github.com/google/quiver-dart/issues/583 https://github.com/flutter/flutter/issues/40074 https://github.com/flutter/flutter/issues/17738 https://github.com/ReactiveX/rxdart/issues/365
I don't think we've ever promised that "The receiver of an async callback schedules microtasks and timers in the current zone."
In fact, I don't even know what it means. The zone that is current when?
I would read it as saying that Future.then(something)
should schedule microtasks related to something
in the zone that is current when then(something)
is called. It currently doesn't, and again, if there is more than one listener, then it won't use both. And if there is no listener, then the option does not exist, but we'll still schedule something to complete the future asynchronously (at least if it is an error).
For callbacks of Future
and StreamSubscription
we ensure that the callbacks remember the zone they were added in, and run in that zone.
It's not given which zone the underlying controller/future runs scheduleMicrotask
in.
For Future
it will likely be the zone where the future was "created" (which is not a specified concept, but when you call Future.then
, you create the result future in the current zone of that call). The future actually only holds on to that zone in order to use its scheduleMicrotask
, and I'd like to avoid that memory tax. And again, not all future completions use scheduleMicrotask
, some just piggyback on other futures being completed, so we can complete multiple futures in a single microtask.
A stream will "run" in the zone where the listen
was called. We just call the onListen
in the zone where listen
was called, and let the controller take it from there. That usually ensures that the events will come from that zone, but the stream controller actually mostly ignores zones, it's the stream subscription which switches zones when it calls its callbacks. So, schedule microtask will happen in the zone that the controller is invoked in. (For pauses and resumes, it's probably the zone that the calls to the stream subscription happens in).
The issue here seems to be that the onCancel
call returns a future, and somehow that future's zone is used to schedule a microtask (because it's a completed future). There is nothing we can really do to prevent this without changing the behavior of completed futures (to, say, use the listener's zone).
We can try that, but it is very likely that a lot of other unstable code will break then.
@lrhn
I don't think we've ever promised that "The receiver of an async callback schedules microtasks and timers in the current zone."
It doesn't have to be a promise, but the default. If nothing in the documentation explicitly states that an API is shifting zones then the expectation is that it won't move my code to another zone. Any other behavior leads to surprises. After all, the Zone
API is all about being able to intercept timers, microtasks, and other operations. In order to succeed zone behavior must be predictable. If some API doesn't openly scream "I'm messing with zones" (such as FakeAsync
does in both the class name and in the docs) but then goes ahead and messes with zones, it makes the zone API too unpredictable and confusing to be useful in practice.
So far the vast majority of core libraries has been well-behaved. This instance is one of the rare exceptions, which makes it so surprising.
In fact, I don't even know what it means. The zone that is current when?
All code runs in some zone, if anything, the root zone. Consider a block of Dart code (a series of statements). Whatever zone the first statement ends up running in (the "current zone"), then, by default, all other statements should run in the same zone, unless there's something that very clearly shifts the zone. In the example above the statements inside onDone
unexpectedly end up in a different zone:
print(Zone.current.hashCode);
Stream<Object> // zone 1
.periodic(const Duration(hours: 1)) // zone 1
.take(1) // zone 1
.listen( // zone 1
(_) {
print(Zone.current.hashCode);
print('onData'); // zone 1
testCounter += 1; // zone 1
},
onDone: () {
print(Zone.current.hashCode);
print('onDone'); // zone 2!!!
testCounter += 1; // zone 2!!!
},
);
// zone 1
This is unexpected because there's nothing about producing a periodic stream of events that implies zone changes. It's a pure in-memory computation that relies on microtasks and timers, all of which are interceptable by zones.
Edit: added Zone.current
printouts that shows zone changes.
We definitely do promise that the callbacks to Future
's then
and catchError
methods, and to Stream
's listen
method will run in the zone where that call was made. If they don't, that's a bug.
The onListen
etc. callbacks on a stream controller happens in the zone where the user called listen
, pause
, resume
or cancel
, not where the callbacks were set. (That too is a promise, and it's what guarantees that a stream listen operation actually happens in the zone where the listen
method was called).
I honestly don't understand the connection between a Future
and a Completer
and whatever zones are involved. The Future
has a zone before it's completed. That never made sense to me.
Everything else is less specified. We do not guarantee where the implementation runs, which does affect any microtasks it might schedule.
We definitely do promise that the callbacks to
Future
'sthen
andcatchError
methods, and toStream
'slisten
method will run in the zone where that call was made. If they don't, that's a bug.The
onListen
etc. callbacks on a stream controller happens in the zone where the user calledlisten
,pause
,resume
orcancel
, not where the callbacks were set. (That too is a promise, and it's what guarantees that a stream listen operation actually happens in the zone where thelisten
method was called).
This is both surprising and undocumented. Surprising, because what would the author of the onListen
callback, which is conceptually "on the other end of listen
", know about the behavior of the zone where listen
runs. Imagine two libraries each running in its own zone, ending up at two ends of a stream, one at the listener side, and another at the stream controller side. These two libraries should be able to have control over zones their code runs in. The stream should establish a safe connection between the two zones without entangling them.
I honestly don't understand the connection between a
Future
and aCompleter
and whatever zones are involved. TheFuture
has a zone before it's completed. That never made sense to me.
I don't think Future
needs to have a zone in a way that leaks to the call site (i.e. code that creates and waits for the future).
Everything else is less specified. We do not guarantee where the implementation runs, which does affect any microtasks it might schedule.
It is actually fine for the implementation to run in a different zone, as long as it has a good reason to do so and has good docs. Stream's onDone
doesn't have either.
Microtasks and timers are special, because ZoneSpecification
has explicit handling for them. Hence the expectation that if an implementation schedules microtasks/timers, the zone should be able to intercept them.
I admit that the zone interaction with streams and futures is almost entirely undocumented.
The one thing that you can depend on is that only callbacks which are registered in a zone will also be called in a specific zone. That's the then
/catchError
/whenComplete
callbacks of Future
and the onData
/onError
/onDone
callbacks of StreamSubscription
. These are the callbacks which correspond to the asynchronous computation.
All other callbacks are not zone aware, and will be called in whatever zone is current at the time they are triggered. These are control callbacks for streams (onListen
/ onPause
/ onResume
/ onCancel
) and all callbacks called from stream builder functions like Stream.where
and Stream.map
. The former is called in the zone where the user called stream.listen
/subscription.pause
, the latter only gets attached when the modified stream is listened to, so it's in the same zone as the listen
call.
Making it possible to intercept microtasks/timers/etc. in a somewhat predictable way is actually why the stream controller's onListen
runs in the zone of the stream's listen
call. Everything below the listen
call is considered implementation, and this way the implementation that creates stream events on a stream subscription, and schedules them, will run in the zone where the stream was listened to.
Stream objects themselves are passive, they have no computation behind them, and therefore they have no zone. The computation belongs to the stream subscription which is created when you listen
to the stream. That's why the even creation code runs in that zone, so you do get to control how microtasks are used.
Zones are still a hot mess in many other ways, but I think that particular choice is reasonable.
It does mean that if you move a stream subscription into a different zone and add listeners there, the code providing events and the code consuming events will no longer be in the same zone, because setting a new listener with subscription.onData(handler)
will remember the current zone and run the handler
callback in that zone.
One big confuser here is that a Future
contains the zone of its "computation" even when all you did was create a completer. The future belongs to that zone after the completer was created, and completing it from a different zone can get weird (well, maybe, I can't say what happens without checking the code).
The other issue is that sometimes the future implementation needs to schedule a microtask. It has so far consistently used the zone that the future was created in. I'd be fine changing that to the zone of the (well, a) listener when there is a listener. If there isn't, it's an error future with no current listener, and we don't generally share those, so the current zone is likely fine.
It is a change, and since people keep depending on the behavior we have, whether promised or not, it could be a breaking change.
(I've wanted to change Future
for a while, so that the zone of the future depends on the place the completer is completed, not where it's created, which would also avoid any need to store a zone in a non-error future. I also want to remove the concept of error zones where errors cannot propagate across boundaries, because the concept is so ill-defined that I can't tell you what it means, and all it ends up doing is having some futures surprisingly never complete.)
Hi is there any updates? This indeed makes fake_async quite useless...
No new plans. The fake_async
package has some usability issues that make it very easy to have deadlocks or just no progress, but that's not entirely because of this issue. It's inherently tricky to use correctly.
The constant null
future is a problem, and there is a fix for it, but that fix actually breaks some tests that rely on the current behavior of running some code in the root zone. I gave up on landing that change once. I'm willing to look at it again, but don't expect a different outcome.
StreamSubscription.cancel
currently returns the Future._nullFuture
. I would expect that when calling cancel
in a particular zone, this returns a future that is created in that zone and therefore uses the zone's scheduleMicrotask
. Because this is not the case, the fake async zone cannot control this future and calling flushMicrotasks
will not execute any registered callbacks. This behavior often causes tests with fake async not to complete, which is very difficult to debug.
I don't know how important the use of the Future._nullFuture
is for the overall performance, but I would expect that public api's don't return this unless called from the root zone.
The _nullFuture
doesn't exist for performance reasons, but for timing. It exists because the cancel
method could return null
prior to null safety, which was interpreted to not introduce an async delay.
The recognizable _nullFuture
is recognized in the same places to not introduce a delay, keeping the same timing as before.
It does that in the root zone, which was actually not always true. It used to be created lazily in whichever zone first accessed it. Using any one zone is a problem, because it is returned from code running on other zones than the root zone.
Changing that to, fx, a null future per zone now breaks bad code that relies on that future being in the root zone. If we can find and fix that bad code, somehow, maybe we can fix it.
Can you show me an example of such bad code? I'm interested to learn more.
Can you show me an example of such bad code?
We have not tracked it down to a minimal reproduction case.
I think I remember one example using fake_async
which depended on one particular future being in the root zone, and therefore running even when the fake-async zone was not being driven forward.
Code triggered by that future would then call elapse
on the fake-async context.
If the future changed to be inside the fake-async zone, then nothing happened.
In some cases, the control over the Dart zone can be limited when we use Future or Stream because some inner constants are tied to the root zone.
For example (inspired by FakeAsync from Quiver):
It is expected that the test should pass, but it fails:
We found this on Dart SDK 2.4.0, it also can be reproduced with Dart SDK 2.7.0.
Before it fails,
.take(1)
closes stream which cancels the subscription. In the current example, cancellation returns Future which is resolved withnull
in the root zone. It is constant. https://github.com/dart-lang/sdk/blob/master/sdk/lib/async/future.dart#L151 TheonDone
callback will be called after cancellation Future resolving, but it is already resolved. So, it will be called withscheduleMicrotask
from the zone of the resolved Future, which is the root zone. And here we lost control overonDone
and why the test failed.Unfortunately, this is a very popular fail case in tests, where Quiver, RxDart and other Stream-heavy libraries are used and when we try to test something synchronously.