stlab / libraries

ASL libraries will be migrated here in the stlab namespace, new libraries will be created here.
https://stlab.cc
Boost Software License 1.0
660 stars 65 forks source link

Executor of packaged_task not used in reduction #502

Open aaronalbers opened 1 year ago

aaronalbers commented 1 year ago

I am not sure what the intended functionality is here but it was my understanding that the executor passed to stlab::package() would be used when executing a continuation. Does a future reduction not count as a continuation? I think it did in 1.6.2 since I now have code that doesn't work when I updated to 1.7.1. Should I update my usage or was this change unintended?

BOOST_AUTO_TEST_CASE(continuation_with_custom_executor) {
    BOOST_TEST_MESSAGE("running continuation through custom executor");

    std::atomic_int check{0};
    auto func = [&check]() mutable {
        ++check;
    };

    auto pf = package<void(void)>([&](auto f){
      func(); // Should run once during the continuation
      default_executor(std::move(f));
    }, []{});
    auto f = pf.second.then([]{});
    pf.first();

    static_cast<void>(stlab::await(f));

    BOOST_REQUIRE_EQUAL(1, check); // This passes
}

BOOST_AUTO_TEST_CASE(continuation_with_custom_executor_future_reduction) {
    BOOST_TEST_MESSAGE("running continuation through custom executor with future reduction");

    std::atomic_int check{0};
    auto func = [&check]() mutable {
        ++check;
    };

    auto pf = package<void(void)>([&](auto f){
      func(); // Should run once during the continuation
      default_executor(std::move(f));
    }, []{});
    auto f = stlab::async(stlab::default_executor, []{}).then([f = std::move(pf.second)](){ return f; });
    pf.first();

    static_cast<void>(stlab::await(f));

    BOOST_REQUIRE_EQUAL(1, check); // This fails but I think it used to pass (or at least used the executor)
}

BOOST_AUTO_TEST_CASE(continuation_with_custom_executor_future_reduction_and_continuation) {
    BOOST_TEST_MESSAGE("running continuation through custom executor with future reduction and continuation");

    std::atomic_int check{0};
    auto func = [&check]() mutable {
        ++check;
    };

    auto pf = package<void(void)>([&](auto f){
      func(); // Should run once during the continuation
      default_executor(std::move(f));
    }, []{});
    auto f = stlab::async(stlab::default_executor, []{}).then([f = std::move(pf.second)](){ return f.then([]{}); });
    pf.first();

    static_cast<void>(stlab::await(f));

    BOOST_REQUIRE_EQUAL(2, check); // This fails but I think it used to pass (or at least used the executor more than once)
}
sean-parent commented 1 year ago

In the second example, I wouldn't expect your check to trigger your custom executor because that would only apply as the default executor for subsequent continuations (except it doesn't... more on that in a moment). There was a change in the implementation of reducing where it was picking up the prior default executor for the reduce continuation - it now uses the immediate executor. But the result is that the default executor of f will be the immediate executor.

When you say f = g.then(executor, lambda) that both sets the executor for lambda and the default executor for anything after lambda. I now believe the "default" should always be the immediate executor. The intent was to control grain size with a bounce to the same context as the prior task - but this is expensive and the grain size rarely justifies it. I believe it is better just to execute the task immediately, and should really be done with a trampoline so that we don't nest the stack on every immediate call.

1) I can restore (modulo all the bugs in reduce I fixed) the prior behavior of preserving a default exector on reduce (it is somewhat ambiguous which one should be preserved by I'd try to restore the prior behavior). This requires adding a then() and recover() (at least internally) that takes two executors, one for the current and one for the default. This isn't a lot of code and fairly mechanical but it would fix this unintentional breaking change.

2) Alternatively - I could just go ahead and pull the default executor and use the immediate executor - with a trampoline. But that would be a breaking behavioral change (though it would likely be an improvement for most code). To keep this from being an API-breaking change would deprecate the forms of package that take an executor - the executor would just be ignored.

Do you have a preference for 1 or 2?

aaronalbers commented 1 year ago

If I am understanding correctly then I think 2 would be the better option. It would allow for a very simple and consistent rule on executors.

When dealing with futures, executors would only be used to dispatch new tasks via stlab::async() and the thread of execution would have no possibility of changing until a .then(executor, lambda) is encountered. Did I get that right?

Would that affect cancellation in any way?

If you wanted to force the future of your packaged_task to run on a particular executor you could simply wrap stlab::package() and then chain the future with a .then(executor,[](auto x){ return x; }) or .then(executor,[]{}) (for void futures).

I agree that the immediate executor is probably the right thing to do most of the time. There is also a chance that this could simplify a lot of code. It would make executor changes rare and explicit.

sean-parent commented 1 year ago

When dealing with futures, executors would only be used to dispatch new tasks via stlab::async() and the thread of execution would have no possibility of changing until a .then(executor, lambda) is encountered. Did I get that right?

~Unfortunately, this is not the case. If the future is ready then a .then(lambda) will execute on the current thread instead of the prior task thread. The uncertainty is why I added the future default executor.~ I've been playing with a design closer to sender/receivers - and I think there may be a deterministic solution buried in there (sender/receivers are also not deterministic) if the sender is in-flight.

aaronalbers commented 1 year ago

When dealing with futures, executors would only be used to dispatch new tasks via stlab::async() and the thread of execution would have no possibility of changing until a .then(executor, lambda) is encountered. Did I get that right?

Unfortunately, this is not the case. If the future is ready then a .then(lambda) will execute on the current thread instead of the prior task thread. The uncertainty is why I added the future default executor. I've been playing with a design closer to sender/receivers - and I think there may be a deterministic solution buried in there (sender/receivers are also not deterministic) if the sender is in-flight.

I think most of the time having calls to .then(lambda) of a ready future execute in the calling thread is fine. However it would be unexpected if your future was using a serial_queue as the executor and your intent was to "queue" the action and not execute it in the calling context. Perhaps what you are queuing would take longer to execute than the calling thread is designed for.

This would mean that you would have to call .then(ex, lambda) to force the use of the desired executor. Perhaps if the new .then() will default to using the immediate_executor then we should also add a tag for dispatch that says to utilize the executor provided when the future was created. Maybe something like inherited_executor could be used so that the caller doesn't need to know what executor was used to create the future.

sean-parent commented 1 year ago

Unfortunately, this is not the case. If the future is ready then a .then(lambda) will execute on the current thread instead of the prior task thread.

I got that wrong - the continuation will always go through the executor held by the future (or the supplied executor) even if the future is ready. I ran a quick test to verify.

    auto f = stlab::make_ready_future(5, [](auto task){ cout << "invoked: "; task(); });
    auto f1 = f | [](int x){ cout << x << endl; };

Outputs:

invoked: 5
aaronalbers commented 10 months ago
auto f = stlab::make_ready_future(5, [](auto task){ cout << "invoked: "; task(); });
auto f1 = f | [](int x){ cout << x << endl; };

The continuation will go through the executor held by the future iff there is no reduction.

auto f = stlab::make_ready_future(5, [](auto task){ cout << "original_executor: "; task(); });
cout << "back on calling context" << endl;
auto f1 = f | [](int x){ cout << x << endl; };

Outputs:

back on calling context
original_executor: 5

That is expected. But if you add a reduction:

auto f = stlab::make_ready_future(5, [](auto task){ cout << "original_executor: "; task(); }).then([](int x){ return stlab::make_ready_future(x, [](auto task){ cout << "reduced_executor: "; task(); }); });;
cout << "back on calling context" << endl;
auto f1 = f | [](int x){ cout << x << endl; };

Outputs:

original_executor: back on calling context
5

I would at least expect it to use original_executor: if not reduced_executor: but it uses neither.

sean-parent commented 10 months ago

From 2.0.0a1 release I get the following output:

original_executor: original_executor: reduced_executor: back on calling context
original_executor: 5

This is basically correct, but I think should be made more efficient - here is an explanation with notes. To explain this:

stlab::make_ready_future(5, [](auto task){ cout << "original_executor: "; task(); })

Returns a future with the "original executor" as the default for continuations.

.then([](int x){ return stlab::make_ready_future(x, [](auto task){ cout << "reduced_executor: "; task(); }); });;

The continuation is executed on the original executor and prints:

original_executor: 

This is correct behavior.

The internal result of the .then() is a future<future<int>> where the outer one has the original executor and the inner is the "reduced executor".

The reduction attaches a continuation to the outer future. This executes and prints another original_executor: - I believe this is an unnecessary hop and this could be done with an immediate executor. Then a continuation is attached to the inner future and prints. reduced_executor:. This is also unnessary and could be done with an immediate executor.

Then we print back on calling context, and attach another continuation which is invoked printing original_executor: again with the result 5. This part is all correct and as expected.

I'm going to make the changes to eliminate the unnecessary executors (replace with immediate) in the above and write a unit test. I'm leaving this issue open until that lands in main.