dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.08k stars 1.56k forks source link

Future.value: consider completing the future immediately, without scheduling an extra microtask #49381

Open alexmarkov opened 2 years ago

alexmarkov commented 2 years ago

Documentation for Future.value says:

https://github.com/dart-lang/sdk/blob/99919c69ba1f374a8d793bcfe43e2a1702a1b82b/sdk/lib/async/future.dart#L328-L329

However, the implementation calls _Future<T>.immediate

https://github.com/dart-lang/sdk/blob/99919c69ba1f374a8d793bcfe43e2a1702a1b82b/sdk/lib/async/future.dart#L348-L349

It schedules a microtask to complete the future:

https://github.com/dart-lang/sdk/blob/99919c69ba1f374a8d793bcfe43e2a1702a1b82b/sdk/lib/async/future_impl.dart#L269-L270

https://github.com/dart-lang/sdk/blob/99919c69ba1f374a8d793bcfe43e2a1702a1b82b/sdk/lib/async/future_impl.dart#L577-L599

https://github.com/dart-lang/sdk/blob/99919c69ba1f374a8d793bcfe43e2a1702a1b82b/sdk/lib/async/future_impl.dart#L636-L641

Scheduling a microtask is slower than just completing future synchronously. It also doesn't fully follow the documented behavior (the created Future is not fully completed until the microtask runs).

A good example of users struggling with this behavior is a SynchronousFuture from Flutter:

/// A [Future] whose [then] implementation calls the callback immediately.
///
/// This is similar to [Future.value], except that the value is available in
/// the same event-loop iteration.
///
/// ⚠ This class is useful in cases where you want to expose a single API, where
/// you normally want to have everything execute synchronously, but where on
/// rare occasions you want the ability to switch to an asynchronous model. **In
/// general use of this class should be avoided as it is very difficult to debug
/// such bimodal behavior.**
class SynchronousFuture<T> implements Future<T> {

https://github.com/flutter/flutter/blob/1e14993c56f668afcd2175c7ae847599a8ec11ee/packages/flutter/lib/src/foundation/synchronous_future.dart#L7-L17

It looks like SynchronousFuture would not be needed if Future.value would complete the Future synchronously (and maybe Future.then would also avoid scheduling an extra microtask for the completed future).

In addition to Future.value, it would be nice to revise other places where the current Future implementation schedules extra unnecessary microtasks when value is already available / future is already completed.

@lrhn @mkustermann @mraleph @Hixie

lrhn commented 2 years ago

There should be no visible difference between a Future completed with a value immediately, and in a later microtask. The only way to access the value is by adding a listener, which will get notified in a later microtask anyway.

If anything, the listener is notified sooner if it happens to be added between the scheduling of the microtask in _Future.immediate an the future being completed, because then the listener is notified during that microtask, instead of in a microtask scheduled at the time the listener is added.

It is a strong promise that a future will not call the callback during the then call. It will be called only after the then call has completed (and the synchronous code leading to that call has completed as well, since there is no way to interrupt it).

The SynchronousFuture is not a valid future. Its use breaks code that assumes well-behaved futures. It must not be used in code that you intend to be production quality (or only used by code you wrote yourself, which is aware of its non-standard behavior). If you want immediate callback, don't use a Future, it's not built to work like that. 'Nuff said.

What is possible is optimizing await on a completed _Future. If we recognize that the future is a _Future and the future is already completed, then nothing in the specification prevents just continuing synchronously with the value. The specification only says that the code is suspended at the await (which usually means going back to the microtask/event loop), and some time after the future has completed, code will resume with the value of the future. That "some time after" can be immediately after, in the "next microtask" (which we just start then and there without needing to trampoline back to the microtask loop). Will probably still break code which assumes that the await 0 goes on the end of the microtask queue. We don't promise that, but we usually do it.

In short, those microtasks are often not unnecessary, and very often something people depend on. Even the tiniest change is a heck to clean up.

alexmarkov commented 2 years ago

It's not obvious why Future.then or await should necessarily postpone continuation to a separate microtask. I understand that this is the current behavior, and changing it may break certain code, but maybe such code is rare and it's okay to break it in future versions of Dart in order to make behavior more efficient and more predictable for the majority of users who just use async/await and Future.value.

Writing the code

  var future = ...;
  future.then((x) {
    foo(); // depends on bar()
  });
  bar();

and assuming that bar() runs before foo() is somewhat counter-intuitive as textually foo() precedes bar(). Rewriting this code to make ordering more explicit makes it more readable:

  var future = ...;
  bar();
  future.then((x) {
    foo(); // depends on bar()
  });

Maybe we should think about a completed Future as merely a box over a value and make sure all operations with a completed Future are always synchronous.

lrhn commented 2 years ago

The reason Future.then postpones callbacks to a later microtask is that it allows you time to prepare for the result.

If the callback can happen during the then call, then you must prepare everything before you call .then. For errors, that is essential. If you don't handle the error on the returned future, it takes down your program. You can't do that until the future has been returned. Even for values, feedback during early development of Future and Stream were that it was far too hard to write code safely if callbacks could happen immediately. Not impossible. Just improbable in practice.

The same for streams. The stream API allows you to do var sub = stream.listen(null); sub.onData((data) { ... sub.cancel(); ...});. With null safety, that way of writing an event handler that has access to the subscription became the best way to write it. Sending events during the listen call will lose events for code using that recommended approach.

The await operator is different in that it promises to do nothing until the future completes, and then it handles both values and errors. If all you ever use is await, then there is no risk in continuing immediately. However, there are lots of async primitives that cannot be written using await, precisely because it is so eager and blocking that it forces the program to be single-threaded. Not all primitives can be single-threaded (Future.wait is the canonical example, the way it's currently written actually doesn't work with SynchronousFuture. It can be made to work, it just gets more complicated.)

Maybe we should think about a completed Future as merely a box over a value and make sure all operations with a completed Future are always synchronous.

That's a very different thing from a Dart future. A Dart future deliberately tries to hide whether it's already completed, because experience tells us that if we don't, people will start writing two code paths, one for the synchronous case and one for the asynchronous case, whether they actually need the speed or not (because who doesn't want speed?) Allowing that is a non-goal, or even an anti-goal. If computation is asynchronous, it's asynchronous. Not "sometimes synchronous and sometimes not".

I'm completely fine with optimizing await to continue synchronously if its value is a completed _Future. That's safe because of the restrictions on async functions and await. I'm not OK with calling the callbacks of then synchronously in otherwise synchronous code. That's too error-prone.

alexmarkov commented 2 years ago

Okay, let's leave Future.then aside for a moment.

@lrhn Can we make Future.value synchronous? That would cut one microtask, which is a win already. Along with making await synchronous for the completed built-in futures, as you suggested, that will make the common case faster and will probably make SynchronousFuture unnecessary as it would be possible to achieve the same with just Future.value and await.

Hixie commented 2 years ago

I agree with @lrhn that the default Future shouldn't act like SynchronousFuture. The warning at the end of the API docs quoted above is not just a boilerplate disclaimer. It's REALLY CONFUSING to debug issues involving SynchronousFuture and we only use it in very specific circumstances where the performance benefits vastly outweigh the cost. (Actually it's not even the performance benefits that we want, it's specifically the synchronicity. The whole point is to be able to get data out of a potentially asynchronous operation like an image load without interrupting the synchronous build process, when the data has already been obtained and cached.)

lrhn commented 2 years ago

We can definitely make Future.value create a pre-completed future when it's given a plain value (it accepts a FutureOr, so it's not always a plain value). I think my current rewrite of _Future does that.

It won't necessarily make anything faster.

This is an optimization tradeoff. Which behavior is most common?

It probably is "one immediate listener" or "no listener", which means that making the change is going to be a tiny delay for the one listener (but also quite likely no delay at all, since it's entirely possible that no other microtasks would be scheduled between creating the future and calling then) and a saving for no listeners at all.

So, yes, we could change the behavior. (Which is why I already did that in the refactoring.) Let's try: https://dart-review.googlesource.com/c/sdk/+/250387

About making await completedNativeFuture synchronous - it's spec valid, but it's going to break a lot of tests. Every test which does await null; or await 0; to "wait for a microtask" will then not do so. It might be better, but it's different, and different timing makes tests break. Not just fragile tests (those too), but also tests that simply rely on await 0 being the same as await Future.microtask(() => 0), which it is currently specified as being. (There is a reason my Future refactoring hasn't landed yet.)

Tienisto commented 9 months ago

Regarding await:

Having an extra microtask between the return statement of the async function and the result of await is really confusing and error prone since it is normally synchronous but this is not the case for completed futures.

Let's checkout this example:

class AppState {
  final int a;
  final int b;

  AppState({
    required this.a,
    required this.b,
  });

  AppState copyWith({
    int? a,
    int? b,
  }) {
    return AppState(
      a: a ?? this.a,
      b: b ?? this.b,
    );
  }
}

void main() {
  test('Should not swallow an action', () async {
    AppState state = AppState(a: 0, b: 0);

    Future<AppState> completedFuture() async {
      // await Future.microtask(() {}); <-- UNCOMMENT THIS TO PASS THE TEST
      return state.copyWith(a: state.a + 1);
    }

    Future<AppState> regularFuture() async {
      await Future.microtask(() {});
      return state.copyWith(b: state.b + 1);
    }

    scheduleMicrotask(() async {
      // simulate race condition
      // schedule later so that reducer1 and reducer2
      // are returning in the same micro task.
      state = await completedFuture();
    });

    state = await regularFuture();

    await Future.delayed(Duration.zero);

    // This fails because there is a microtask between the return statement
    // of reducer1 and the assignment to the state.
    expect((state.a, state.b), (1, 1));
  });
}

Is there any way to get this behaviour of getting the result immediately?

I needed to add a warning in a library for developers since this issue is unsolvable right now:

https://github.com/refena/refena/commit/dd4d97a4f91f723f9ea7a99fdafda6bf5ad132bd

lrhn commented 9 months ago

@Tienisto First of all, please do not make code that is this timing sensitive. If you need to wait for something to be done, make it complete a future when it's done, don't try to count microtasks. The reason this issue is probably never going to be addressed, is that there is existing code which is so timing dependent that a one microtask less delay breaks tests. I really wish we had randomized the order of handling completed futures, at least in development mode, so you wouldn't be able to predict, and depend one, the order that unrelated futures complete in.

If you avoid writing code with specific timing dependencies, then you won't have any of these issues. Problem solved 😉 (Also much easier to test!)

That said, this is not really about returns with a future value being "synchronous" in some cases, they're never synchronous. Either the future in the return statement has not completed yet, and the async function's returned future is set to wait for it to complete and respond with its result, which will happen later, or it has completed, and the returned future is ... set to wait for it to respond with its result, which will also happen later, because getting the result of a future is never synchronous.

(And what this issue is about is whether Future.value should complete with a value immediately, and schedule a microtask when listened on, or schedule a microtask to complete later, which can then immediately, notifiy any listeners added in the meantime, which is the most common way to use futures. If they are used.)

Tienisto commented 9 months ago

Thanks for the quick response!

In my example, I only counted micro tasks to force a race condition. In a real project the user might dispatch several HTTP redux actions and one action that is a completed action (meaning there is no await). There is a small chance, that 2 functions finish in the same frame but one schedules a micro task for complete but the other does not schedule a micro task.

As you have noticed, it is about completing immediately or completing in the next micro task. Here, we have 2 different scenarios:

The second case is most common (that's why you use async functions).

There is a similar issue in async_redux documented in this comment: https://github.com/marcglasberg/async_redux/blob/e7257b24f2990f96db2efdaba19672f875fdd1ae/lib/src/store.dart#L595

Well, in async_redux, the author uses then but in my case, I have used await. Both issues are very similar (or the same?)

lrhn commented 9 months ago

Likely the same underlying issue, of code depending on a particular scheduling of microtasks.

There is no way to get a future callback immediately when a computation completes, not unless that computation uses a sync completer.

That's what an async function does, after the first await - after the await, the code is running as an async event, in response to the event that was await'ed. If the function body completes without doing any await, then the code is still running synchronously as part of the initial call, and the future hasn't been returned yet. The not-yet-returned future cannot be allowed to be completed synchronously (definitely not with an error, and whether it happens with a value or not doesn't really matter except for when a listener is invoked, which is similar in spirit to this issue.)

But, and I can't say that strongly enough, there is no promise that the future will be completed synchronously when an async function returns a value after an await. That's an implementation detail, and optimization to avoid waiting more than absolutely necessary (no need to introduce an intermediate stop on something which is just pushing a result along without doing any further computation). We'll probably not change it without a reason, because there are people who have written code depending on the current timing. (Which they shouldn't have.)

Asynchronous code does not make any guarantees other than that something will eventually happen, and some ordering promises. There is absolutely nowhere in the specification where anything is specified to happen in a specific microtask. The most precise phrase is "in a later microtask", which is slightly more precise than "at some later time". Which is to say, an async function cannot know how much time is spent during an await or during a return. The current implementation behaves in some way. That's an implementation detail. Code should not assume that nothing can get between an async function's return value; and the await of the function's future, because any number of asynchronous events are allowed to happen there. (And since the microtask scheduling can be overridden using zones, anything can happen, if someone wants it to.)

All that said, I can see how "anything can happen" is not very useful. Maybe we should make some concrete promises, for example:

It says "start" because a future may have more than one listener, and we only promise that the first one is notified immediately. Then that can trigger a whole slew of work, including completing other async functions, which triggers their listeners, before coming back to the second listener. Most futures only have one listener, and then it is actually useful.

We can actually promise that it will be later. We may even promise that it schedules that microtask using scheduleMicrotask in the zone that the async function was called in. Which it probably will. (That's again what this issue is about, whether it should eagerly schedule a microtask in the value case, or just complete the future with a value, and schedule a microtaks when the first listener is added instead. But in either case, it's a "later microtask".)

(It's not actually implemented that way, the future is not awaited inside the function, but outside, so an error ends up being the result of the function body, even if the return was inside a try/catch. But successful programs don't have errors, right?)

We could say that this implementation choice is now a promise. Which means it's something you can depend on, but also something that we cannot optimize further in the future. And if we start using JavaScript promises for our futures, we may have to do extra work to make it work the same. (That's why we don't like making promises, they can get in the way of future optimizations.)

linzj commented 8 months ago

In Alibaba's Flutter branch, we have made modifications to the DartVM. By altering the generation of the Await Stub, we first check whether the incoming parameter is a Future, then determine whether that Future has already been completed. If it has been completed, we synchronously return the value to the caller.

This modification significantly reduces the number of scheduled microtasks, thereby improving performance. Importantly, it is transparent to developers. Many plugins use this pattern: they first check whether the cached value is null, and if it is, they call another asynchronous function to fetch the value. Otherwise, they directly return the cached value.