brycelelbach / wg21_p2300_execution

`std::execution`, the proposed C++ framework for asynchronous and parallel programming.

Create new section with receiver contract and lifetimes of sender/receiver #27

Open ericniebler opened 1 year ago

ericniebler commented 1 year ago

Discuss the complete life cycle of an async operation, including the required lifetimes of the scheduler, sender, receiver, operation state, and queries in relation to each other.

ericniebler commented 1 year ago

Also, rename "receiver-protocol" to "async-function-protocol" or something else indicative of the fact that it relates to the interrelated responsibilities of conforming schedulers/senders/receivers/operation states.

villevoutilainen commented 1 year ago

> The validity of a sender returned by schedule(sched) does not depend on the validity of the scheduler.

Well.. in the sense that the scheduler might be just a handle to something longer-lived, like it is in the Qt adaptation of QThread, sure, the scheduler doesn't need to be kept around, since it's just a separate thin non-owning wrapper. If the work-runner just models scheduler, you likely want to keep the scheduler around. :)

> An operation state shall not be moved or copied once start() has begun executing.

I'd rephrase this - the operation state cannot be moved or copied, operation states prevent this by not being copyable or movable.

ericniebler commented 1 year ago

> If the work-runner just models scheduler, you likely want to keep the scheduler around.

I disagree. I don't want generic code to have to worry about keeping the scheduler around after it has requested a scheduler-sender. If the scheduler owns resources, the scheduler-sender would have to share that resource by, e.g. reference counting it. EDIT: The idiomatic way of doing this, however, is to put the resources into a context and make the scheduler be a handle type.

> operation states prevent this by not being copyable or movable.

Nowhere is a movable operation state prohibited. One could move one as much as one liked but not after start() is called.

The library-provided ones generally have internal pointers, and so yes, they are immovable.

villevoutilainen commented 1 year ago

> > If the work-runner just models scheduler, you likely want to keep the scheduler around.
>
> I disagree. I don't want generic code to have to worry about keeping the scheduler around after it has requested a scheduler-sender. If the scheduler owns resources, the scheduler-sender would have to share that resource by, e.g. reference counting it. EDIT: The idiomatic way of doing this, however, is to put the resources into a context and make the scheduler be a handle type.

Pardon me, but that's some hot nonsense. Generic code wouldn't have to worry about keeping a scheduler around, because how and from where to get access to a scheduler is non-generic to begin with. But if you want to tell people "you don't need to keep any scheduler around in any of your code once you've gotten a sender from it via 'schedule()'", you're going to tell them how to crash their programs. I find it extremely icky to suggest that the scheduler-sender should share resources. If you have a thread pool that models scheduler, of course you're going to keep the thread pool around while it runs work, and you're damn straight not going to share its queues and other resources in the senders it dishes out. That would also then mean that those queues are going to be shared with the operation states that the senders eventually spit out. Such amount of sharing is just barking mad, when to avoid that, all you need to do is keep a thread pool around while it's running work.

In other words, I don't think this model has a guarantee that any object modeling scheduler can just be destroyed after it has given you a sender from schedule(). And that will bother zero users, because many of them want to schedule() again to do more work, they want to transfer() other senders to that scheduler, they want to on() work onto that scheduler, they want to repeat_effect_until() on a scheduler. There are boatloads of use scenarios where you will want to keep a scheduler around anyway.

> > operation states prevent this by not being copyable or movable.
>
> Nowhere is a movable operation state prohibited. One could move one as much as one liked but not after start() is called.
>
> The library-provided ones generally have internal pointers, and so yes, they are immovable.

Right. So generic algorithms that deal with operation states have to accommodate non-movable operation states. No generic code can assume that an opstate is movable. That makes it an algorithmic requirement, or dare I say it.. ..a part of the concept.
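To make the movability point concrete, here is a minimal sketch, assuming a hypothetical `self_referential_op` type that is not taken from P2300 or libunifex. It shows why an operation state holding an internal pointer is typically made immovable, and how generic code can cope by constructing it in place.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical operation state (illustration only, not a P2300 type).
// It keeps a pointer into its own storage, so relocating the object
// would leave that pointer referring to the old location.
struct self_referential_op {
    int storage = 0;
    int* cursor = &storage;  // internal pointer into this very object

    self_referential_op() = default;
    // Deleting the move operations also suppresses the implicit copy operations.
    self_referential_op(self_referential_op&&) = delete;
    self_referential_op& operator=(self_referential_op&&) = delete;

    void start() noexcept { *cursor = 42; }
};

static_assert(!std::is_move_constructible_v<self_referential_op>);
static_assert(!std::is_copy_constructible_v<self_referential_op>);

// Generic code that cannot assume movability builds the operation state
// in its final location and starts it there.
inline int run_in_place() {
    self_referential_op op;
    op.start();
    return op.storage;
}
```

Calling `run_in_place()` never relocates `op`, which is the discipline generic algorithms would have to follow whether or not a particular operation state happens to be movable before start().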

villevoutilainen commented 1 year ago

Look at it this way: I could've made QThread a scheduler directly, instead of adding a factory function that returns a scheduler internally pointing to it. If you're going to tell me that the senders produced by such a scheduler need to share ownership of the QThread, I'm going to stop using this model right then and there, because all generic uses work just fine when the senders produced by the QThread-scheduler just point to the QThread, without any reference-counted shared ownership of the QThread in the senders. So please don't retroactively introduce such reference-counting madness, it's not necessary, and has a completely unnecessary cost in both efficiency and in complexity.

ericniebler commented 1 year ago

I don't think that's what I said. From what you just said, it sounds like you have a scheduler that points to a QThread. No reference counting is needed here.

If your scheduler owns a QThread such that destroying the scheduler invalidates any senders obtained from it, then that is a problem. The QThread should live in a context and the scheduler should refer to it.

villevoutilainen commented 1 year ago

Or to explain differently: you're concerned that some generic code would need to hold on to a scheduler unless there's a guarantee that the scheduler doesn't need to be kept around. I'm saying that no generic code needs to do that, and I'm saying that it's harmful to provide a guarantee that the scheduler can just be tossed to the bin. No generic code requires such a guarantee, no sender algorithm needs to destroy a scheduler, so the lifetime of the scheduler is outside the realm of the important bits of this model. But if we shoehorn that lifetime into the model, we suddenly make modeling a scheduler and its senders much harder, without any algorithmic need for it. In other words, your concern is unfounded, whereas the guarantee you suggest brings another concern for me that some code can be written that relies on that guarantee, which leads to unwarranted sharing in places where we don't need it.

To explain it yet differently, there are no algorithms that require that a scheduler can be destroyed once it has done a schedule(). Algorithms that would require it are harmful. This guarantee is unnecessary.

villevoutilainen commented 1 year ago

> I don't think that's what I said. From what you just said, it sounds like you have a scheduler that points to a QThread. No reference counting is needed here.
>
> If your scheduler owns a QThread such that destroying the scheduler invalidates any senders obtained from it, then that is a problem. The QThread should live in a context and the scheduler should refer to it.

I could write it so that QThread is the scheduler. If you destroy the QThread and you have senders pointing to it still running work, that's a user error, and the model shouldn't perform magical acrobatics to protect against that.

I don't see how the "context", however that is modeled, helps. You're just moving the cheese, saying that the context must outlive the senders. It's no more difficult to have the scheduler outlive the senders. And that bothers none of your generic code.

ericniebler commented 1 year ago

> It's no more difficult to have the scheduler outlive the senders. And that bothers none of your generic code.

There is a big difference. A scheduler is something I pass around by value and copy freely. All copies are equal, meaning they all schedule onto the same context. That's what copy_constructible in the scheduler concept means.

Your QThread is your context. You should have a function to return a scheduler from it.
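A minimal sketch of that arrangement, assuming a hypothetical `manual_context` standing in for the QThread and a hypothetical `context_scheduler` handle (neither name comes from P2300, libunifex, or Qt). The scheduler is a cheap, copyable, non-owning handle, and all copies compare equal because they refer to the same context.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical long-lived context (stand-in for a QThread or thread pool).
struct manual_context {
    std::deque<std::function<void()>> queue;

    // Runs everything queued so far and returns how many items ran.
    int drain() {
        int ran = 0;
        while (!queue.empty()) {
            std::function<void()> work = std::move(queue.front());
            queue.pop_front();
            work();
            ++ran;
        }
        return ran;
    }
};

// Hypothetical scheduler: a non-owning handle that is copied freely.
struct context_scheduler {
    manual_context* ctx;  // refers to the context, does not own it

    // Two schedulers are equal when they schedule onto the same context.
    friend bool operator==(const context_scheduler& a,
                           const context_scheduler& b) noexcept {
        return a.ctx == b.ctx;
    }

    void post(std::function<void()> work) const {
        ctx->queue.push_back(std::move(work));
    }
};

// The "function to return a scheduler from it".
inline context_scheduler get_scheduler(manual_context& ctx) noexcept {
    return context_scheduler{&ctx};
}

inline int copies_share_one_context() {
    manual_context ctx;
    int hits = 0;
    context_scheduler a = get_scheduler(ctx);
    context_scheduler b = a;  // plain copy of the handle
    assert(a == b);
    a.post([&hits] { ++hits; });
    b.post([&hits] { ++hits; });
    ctx.drain();  // both items land in the same queue
    return hits;
}
```

In this sketch only `manual_context` has a lifetime anyone needs to manage. Destroying either handle changes nothing about the queued work.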

villevoutilainen commented 1 year ago

> > It's no more difficult to have the scheduler outlive the senders. And that bothers none of your generic code.
>
> There is a big difference. A scheduler is something I pass around by value and copy freely. All copies are equal, meaning they all schedule onto the same context. That's what copy_constructible in the scheduler concept means.
>
> Your QThread is your context. You should have a function to return a scheduler from it.

You can pass schedulers around by value just fine. But that doesn't mean the model guarantees that all schedulers can be destroyed willy-nilly. There's no algorithm in libunifex or P2300 that requires that. That's what I'm objecting to, and we shouldn't provide such a guarantee when nothing needs it. You can simply omit this part in the lifetime description, and that does no particular harm. If you want, you could say that scheduler-senders don't (need to) own their scheduler, nor do they (need to) own their context.

villevoutilainen commented 1 year ago

Sigh, I mean, fine, we do want the guarantee that those copies can be destroyed. :) Fine, I guess it's saner the way you described it originally. I suppose it would help if we then explain that there should be a long-lived context that the schedulers and senders point to, without owning it, and we certainly should never talk about reference-counted shared ownership in any of this, that was certainly a serious red herring.

villevoutilainen commented 1 year ago

So, now that we have cleared my stupid confusion, how about we describe this thus, exact wording to be honed:

  1. There is a platform-specific or program-defined execution context (not specified by this section) that schedulers refer (point) to. Schedulers do not own that context, and the context outlives the schedulers.
  2. Senders produced by a schedule() operation refer (point) to the same context, and don't refer to the scheduler. Therefore the scheduler doesn't need to outlive said senders, and again, the context outlives the senders.
  3. Not all senders refer to a context. For example, a sender produced by just() doesn't refer to an execution context.

ericniebler commented 1 year ago

I could have a non-normative note that most schedulers point to a context owned elsewhere and that it must outlive its schedulers and the senders obtained from them. That isn't true for e.g. the inline_scheduler.
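The lifetime relationship in that note can be sketched as follows. All types here are hypothetical mock-ups, not the P2300 API: `submit` collapses connect() and start() into one call for brevity, and `inline_scheduler_sketch` only imitates the idea of an inline scheduler that has no context at all.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical context, repeated here so the block is self-contained.
struct manual_context {
    std::deque<std::function<void()>> queue;

    void drain() {
        while (!queue.empty()) {
            std::function<void()> work = std::move(queue.front());
            queue.pop_front();
            work();
        }
    }
};

// Hypothetical schedule-sender: it copies the context pointer out of the
// scheduler, so it refers to the context and not to the scheduler object.
struct schedule_sender {
    manual_context* ctx;

    // Assumption: connect() and start() are folded into one step.
    void submit(std::function<void()> on_value) const {
        ctx->queue.push_back(std::move(on_value));
    }
};

struct context_scheduler {
    manual_context* ctx;  // non-owning
    schedule_sender schedule() const noexcept { return schedule_sender{ctx}; }
};

// Hypothetical inline scheduler: completes immediately, refers to nothing.
struct inline_sender {
    void submit(std::function<void()> on_value) const { on_value(); }
};
struct inline_scheduler_sketch {
    inline_sender schedule() const noexcept { return {}; }
};

inline int sender_outlives_scheduler() {
    manual_context ctx;  // must outlive the schedulers and senders below
    int completions = 0;

    // The scheduler handle is destroyed when the lambda returns.
    schedule_sender snd = [&ctx] {
        context_scheduler sched{&ctx};
        return sched.schedule();
    }();

    snd.submit([&completions] { ++completions; });  // still fine: ctx is alive
    ctx.drain();

    // No context is involved here, so there is nothing to outlive.
    inline_scheduler_sketch{}.schedule().submit([&completions] { ++completions; });
    return completions;
}
```

The one object that must stay alive in `sender_outlives_scheduler()` is `ctx`. If it were destroyed before `snd.submit(...)`, the sender would dangle, which is the user error discussed earlier in the thread.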

lewissbaker commented 1 year ago

I think perhaps we could define the concept of "scheduler validity". A scheduler has handle/reference semantics similar to that of an iterator. A scheduler can become invalid if the lifetime of the execution resources that it uses ends.

We could then say that a sender produced by calling schedule(sched) and the corresponding operation state produced by calling connect() on that are also invalidated whenever the scheduler used to produce them would be invalidated.

Then we just need to say what things are valid to do with an invalidated scheduler/sender/operation-state. For example, do we only allow destruction? Or do we want to require that destruction of (some of) these things "happens before" the scheduler is invalidated?

lewissbaker commented 1 year ago

> The validity of a sender returned by schedule(sched) does not depend on the validity of the scheduler.

I think here we want to say something like: the validity of the sender does not depend on the lifetime of sched. But I think the returned sender would inherit the validity properties of the scheduler, i.e. it's as if the returned sender held a copy of the scheduler handle.

> The validity of an operation state returned by connect(snd, rcv) does not depend on the validity of snd or rcv.

Again, I think we want to say here that the validity of the operation state does not depend on the lifetime of snd and rcv. The operation-state is assumed to decay-copy the passed receiver, and similarly is assumed to decay-copy any state it needs from the sender into the returned operation-state.

The validity of the operation-state will then inherit the validity properties of the sender passed to connect. i.e. if the sender would have become invalid when some other resource was destroyed then the operation-state will also become invalid when that resource is destroyed.

Some senders will be unconditionally valid (i.e. their validity is not tied to the lifetime of some other resource), and some senders will be conditionally valid (e.g. a schedule-sender that holds a reference to a QThread-based context).
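As a rough illustration of the decay-copy and inherited-validity ideas, here is a sketch with hypothetical types (`just_int_sender`, `just_int_op`, `store_receiver`), none of which are the P2300 definitions. The operation state copies what it needs out of the sender and receiver, so it survives their destruction, but it remains conditionally valid on the resource the receiver points to. The sketch assumes C++17 guaranteed copy elision, since the operation state is immovable.

```cpp
#include <cassert>
#include <utility>

// Hypothetical receiver: writes the result through a pointer it does not own.
struct store_receiver {
    int* out;
    void set_value(int v) noexcept { *out = v; }
};

// Hypothetical operation state: holds its own copies of the sender's
// value and of the receiver, and is immovable once constructed.
template <class Receiver>
struct just_int_op {
    int value;
    Receiver rcv;

    just_int_op(int v, Receiver r) : value(v), rcv(std::move(r)) {}
    just_int_op(just_int_op&&) = delete;

    void start() noexcept { rcv.set_value(value); }
};

// Hypothetical sender, loosely modelled on the idea of just(7).
struct just_int_sender {
    int value;

    template <class Receiver>
    just_int_op<Receiver> connect(Receiver r) const {
        // Returned as a prvalue, so no move of the operation state occurs.
        return just_int_op<Receiver>{value, std::move(r)};
    }
};

inline int op_outlives_sender_and_receiver() {
    int result = 0;  // the resource the operation state's validity depends on

    auto op = [&result] {
        just_int_sender snd{7};
        store_receiver rcv{&result};
        return snd.connect(rcv);  // snd and rcv are destroyed on return
    }();

    op.start();  // valid: op holds copies, and result is still alive
    return result;
}
```

Here `just_int_sender` on its own would be unconditionally valid, while the connected operation state becomes conditionally valid because the receiver it copied refers to `result`.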