Clarifications on async actions

mlagally commented 1 year ago

Discussion on the Profile call on March 1st: support for asynchronous actions is necessary, but it needs further clarification:

Actions do not have a way to subscribe, i.e. there is no eventing mechanism that tells you about the action state. There could be an event mechanism.

Multiple invocations of actions need further clarification: does a new invocation cancel the previous one?

If invocations can queue up, this must be signaled (to the consumer). Also when the queue is full, there must be a way to communicate that.

The current implementation, where a new invocation cancels the previous invocation (i.e. no queue), works.

A queueing mechanism can consume resources that go beyond the space of a constrained device.

benfrancis commented 1 year ago

Actions do not have a way to subscribe, i.e. there is no eventing mechanism that tells you about the action state. There could be an event mechanism.

I agree this would be useful, but there is currently no operation type in the Thing Description specification that would cover this. I filed https://github.com/w3c/wot-thing-description/issues/1775 last week, but any new operations will not be added until Thing Description 2.0, which is after the target publication date for WoT Profile 1.0.

Multiple invocations of actions need further clarification: does a new invocation cancel the previous one?

No. There is an explicit cancelaction operation, and nowhere does it say that invoking an action must cancel another one or that multiple action instances cannot be in the running state at the same time.

If invocations can queue up, this must be signaled (to the consumer)

The pending value of the ActionStatus object indicates that an action invocation request has been received but the action has not yet started.

Note that the current asynchronous actions protocol binding is not explicitly a queue; there's nothing to say that actions cannot be performed in parallel. Whether more than one action can be performed at a time may depend on the individual affordance, e.g. it's not possible to move a robot arm in two directions at the same time. I don't think whether actions are queued or performed in parallel is something that can be fixed in the profile; it is implementation specific.
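To make the pending/running distinction concrete, here is a minimal sketch in Python. The four status values are the ones referred to in this thread; the shape of the `ActionStatus` class, the `href` member, the `start` method and the example URLs are assumptions for illustration only, not text from the profile. Nothing in it forces a queue: two invocations of the same action can be running at once.

```python
from dataclasses import dataclass
from enum import Enum


class Status(str, Enum):
    PENDING = "pending"      # request received, action not yet started
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class ActionStatus:
    href: str                          # hypothetical URL of this invocation
    status: Status = Status.PENDING    # every invocation starts out pending

    def start(self) -> None:
        """Move from pending to running; the Thing decides when this happens."""
        if self.status is not Status.PENDING:
            raise ValueError(f"cannot start an action that is {self.status.value}")
        self.status = Status.RUNNING

    def to_json(self) -> dict:
        return {"status": self.status.value, "href": self.href}


# Two invocations of the same action, both running in parallel.
first = ActionStatus(href="/things/lamp/actions/fade/1")
second = ActionStatus(href="/things/lamp/actions/fade/2")
first.start()
second.start()
```

Whether a Thing actually allows the second `start()` while the first is still running would be down to the individual affordance.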

Also when the queue is full, there must be a way to communicate that.

I would expect an error response to an invokeaction request (e.g. 503 Service Unavailable). This could be made explicit in an assertion.
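As a rough sketch of what that could look like on the Thing side, assuming a Thing that does keep a bounded queue: the `ActionQueue` class, its `capacity` parameter and the response bodies are hypothetical, and the 201 success code is likewise an assumption made for the example. Only the 503 for a full queue comes from the suggestion above.

```python
from collections import deque
from typing import Deque, Tuple


class ActionQueue:
    """Bounded queue of pending invocations for one action affordance."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.pending: Deque[dict] = deque()

    def invoke(self, action_input: dict) -> Tuple[int, dict]:
        """Return an (HTTP status code, response body) pair for an invokeaction request."""
        if len(self.pending) >= self.capacity:
            # Queue full: reject rather than silently dropping or cancelling anything.
            return 503, {"title": "Service Unavailable",
                         "detail": "pending action queue is full"}
        self.pending.append(action_input)
        return 201, {"status": "pending"}


queue = ActionQueue(capacity=2)
results = [queue.invoke({"job": n})[0] for n in range(3)]
```

With a capacity of two, the third request is the one that gets the error response, and the two queued invocations are left untouched.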

The current implementation, where a new invocation cancels the previous invocation (i.e. no queue), works.

I would consider this to be implementation specific (i.e. down to an individual Thing). That pattern might work well for a robotic arm, but not for a printer queue for example.

A queueing mechanism can consume resources that go beyond the space of a constrained device.

This is true of many operations, not just asynchronous actions. Again, I would expect an error response in a situation like this.

lu-zero commented 1 year ago

I'd like to see if we have consensus on either behavior (implicit cancellation vs explicit error):

My proposal for the assertions:

A Web Thing MAY have a bounded internal queue of pending Actions.
A Web Thing MAY have a bounded internal buffer to store the ActionStatus of past completed or failed actions.
A Web Thing MUST remove the oldest Action result ActionStatus from its buffer to make room for a newer one.
A Web Thing MUST return an error response if the pending queue is full.
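The second and third assertions could be sketched roughly as follows. This is only an illustration under assumptions: `StatusBuffer`, its `capacity` parameter and the string action ids are hypothetical names, and an `OrderedDict` is just one convenient way to keep insertion order.

```python
from collections import OrderedDict
from typing import Optional


class StatusBuffer:
    """Bounded store for the ActionStatus of completed or failed actions."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items: "OrderedDict[str, dict]" = OrderedDict()

    def store(self, action_id: str, status: dict) -> None:
        # When full, drop the oldest stored result to make room for the new one.
        if action_id not in self._items and len(self._items) >= self.capacity:
            self._items.popitem(last=False)
        self._items[action_id] = status

    def get(self, action_id: str) -> Optional[dict]:
        # None means the result was never stored or has already been evicted.
        return self._items.get(action_id)


history = StatusBuffer(capacity=2)
history.store("1", {"status": "completed"})
history.store("2", {"status": "failed"})
history.store("3", {"status": "completed"})  # evicts the entry for "1"
```

A consumer querying the evicted action afterwards would no longer find its ActionStatus.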

Alternatively the last assertion could be:

A Web Thing MUST cancel the action being executed to make room in the pending queue if the queue is full and a new action is invoked.

We could use a vocabulary term to describe either behavior, but I'd discuss that later. I'm fine with either clarification.

benfrancis commented 1 year ago

I'd like to see if we have consensus on either behavior (implicit cancellation vs explicit error):

I'm not sure I fully understand what you mean by this, but I don't think it's reasonable to assume that invoking an action will cancel any previous invocations of the action. That wouldn't make any sense for many use cases (e.g. a printer queue). Again, I don't think this can be a general assumption, it has to be implementation specific depending on the use case.

A Web Thing MAY have a bounded internal queue of pending Actions.

A Web Thing MAY have a bounded internal buffer to store the ActionStatus of past completed or failed actions.

A Web Thing MUST remove the oldest Action result ActionStatus from its buffer to make room for a newer one.

I don't disagree with any of these assertions.

However, I would argue these first three are really just implementation details, the first two of which are optional suggestions. Note that if there are not already two implementations (of both Producer and Consumer) which implement the actions protocol binding in this way, then the assertions are immediately at risk. The third one in particular could be problematic: because it is a mandatory assertion, any existing implementations of the profile that do not implement action queues in this way would no longer be conformant with the profile. This prescriptive behaviour also arguably contradicts the Note which says that the length of time to retain ActionStatus objects is implementation specific, and may depend on application-specific requirements or resource constraints.

For example, I can tell you that WebThings Gateway does not implement action queues this way, and given there would have to be approximately 7 million actions in memory simultaneously for this to be a problem (on a Raspberry Pi 3), it's not exactly a high priority to implement.

I can see however why this might be a consideration for a web thing implementation using HTTP on an ESP32 microcontroller using the WebThings webthing-arduino library.

In summary I think the first three assertions should probably not be added and these details should be left implementation specific as they currently are.

You could potentially water down the third assertion and say something like "In resource constrained environments, if the buffer becomes full then a Web Thing SHOULD remove the oldest Action result ActionStatus from its buffer to make room for a newer one".

A Web Thing MUST return an error response if the pending queue is full.

I agree this is reasonable, and would even go as far as to recommend a particular response type (e.g. 503 Service Unavailable). However, running out of resources is not unique to the invokeaction operation. If we're going to mandate error responses if a Thing is out of resources then this needs to apply to all interactions. For example, you could try to write a property with a very large value which won't fit in memory, or there could be a million subscriptions to an event which overwhelm a device.

Unless we are going to detail all of these potential scenarios, it might therefore make more sense to add the "503 Service Unavailable" response to the list of recommended responses in section 6, to cover a range of similar scenarios.

A Web Thing MUST cancel the action being executed to make room in the pending queue if the queue is full and a new action is invoked.

I think this is too prescriptive. I can imagine some scenarios where this could actually be dangerous (e.g. a safety critical action which is only half completed), and as such this needs to be implementation specific.

lu-zero commented 1 year ago

Let me bake a patch with this and see if we can land it soon. Later, can we consider adding a field with preemptible and batched values for the action behavior?

preemptible: the latest invocation preempts the previous one, e.g. a fade action.
batched: invocations are enqueued until the queue is full, then an error is returned, e.g. a print action.
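To show the difference between the two behaviors, here is a small sketch. Everything in it is hypothetical: the `Action` class, the `behavior` and `capacity` parameters, and the 201 and 503 status codes are assumptions for the example, and no such field exists in the Thing Description today.

```python
from collections import deque
from typing import Deque, List, Optional


class Action:
    def __init__(self, behavior: str, capacity: int = 1) -> None:
        if behavior not in ("preemptible", "batched"):
            raise ValueError(f"unknown behavior: {behavior}")
        self.behavior = behavior
        self.capacity = capacity              # only used by batched actions
        self.running: Optional[str] = None    # id of the invocation being executed
        self.pending: Deque[str] = deque()    # queued invocations (batched only)
        self.cancelled: List[str] = []        # preempted invocations (preemptible only)

    def invoke(self, request_id: str) -> int:
        """Return the HTTP status code the Thing would answer with."""
        if self.running is None:
            self.running = request_id
            return 201
        if self.behavior == "preemptible":
            # The newest invocation replaces the one currently running.
            self.cancelled.append(self.running)
            self.running = request_id
            return 201
        if len(self.pending) >= self.capacity:
            return 503                        # batched and full: error
        self.pending.append(request_id)
        return 201


fade = Action("preemptible")
fade_codes = [fade.invoke(r) for r in ("f1", "f2")]

print_job = Action("batched", capacity=1)
print_codes = [print_job.invoke(r) for r in ("p1", "p2", "p3")]
```

In this sketch the second fade invocation cancels the first, while the third print invocation is rejected because one job is running and one is already queued.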

benfrancis commented 1 year ago

later can we consider adding a field with preemptible and batched for the action behavior?

FYI, you would need to suggest those as new terms in the Thing Description 2.0 specification since we agreed that WoT Profile should not extend the TD information model.

lu-zero commented 1 year ago

Indeed, I'll link this on the wot-thing-description repo.

w3c / wot-profile

Clarifications on async actions #369