Closed mlagally closed 3 years ago
Discussion in arch call on 17.6.
Needs a hypermedia format for an action model.
See Stevens – Unix network programming
WebThings – Hypermedia format, consider for adoption? https://iot.mozilla.org/wot/#actions-resource
Ege – Robot arms – delays …
Actions can return a JSON object with multiple fields: an "id", "status" and "cancel" endpoints, and a notification endpoint to subscribe to for status change notifications.
Cancel and notification could be optional?
TD needs a way to communicate action capabilities, e.g. cancellable, notification support etc.
Output data schema of actions describe the capabilities, i.e. if they don't define a cancel endpoint an action is not cancellable.
Failure responses – protocol independent
Action status: Success, failed, ongoing, (not responding – on a gateway / proxy)
Link an action to a status object and an event endpoint? TD has no links.
An action can return a "status" object, which can be used to query (i.e. poll) whether the action has been completed and returns the result.
The caller has only "read-only" access on this object - there's no way to cancel an action using this object. If an action should be cancellable, a separate "cancel_" can be defined by the TD, which does the right thing.
As I understand it this would mean that all cancelable actions would require two separate interaction affordances in a Thing Description, e.g. fade and cancel_fade? What is the rationale for this?
WebThings – Hypermedia format, consider for adoption? https://iot.mozilla.org/wot/#actions-resource
To explain, the way that this works in the Web Thing REST API is that a POST on an Action resource to invoke an action responds with the URL of a dynamic ActionRequest resource. That ActionRequest resource can support a GET to query its status and a DELETE to cancel the action.
Invoke an action
POST https://mythingserver.com/things/lamp/actions/fade
Accept: application/json
{
"fade": {
"input": {
"level": 50,
"duration": 2000
}
}
}
201 Created
{
"fade": {
"input": {
"level": 50,
"duration": 2000
},
"href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
"status": "pending"
}
}
Query the status of an action
GET /things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655
Accept: application/json
200 OK
{
"fade": {
"input": {
"level": 50,
"duration": 2000
},
"href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
"timeRequested": "2017-01-25T15:01:35+00:00",
"status": "pending"
}
}
Cancel an action
DELETE /things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655
204 No Content
A GET on the top level Action resource returns a list (queue) of all the pending ActionRequest resources corresponding to that action.
List action requests
GET /things/lamp/actions/fade
Accept: application/json
200 OK
[
{
"fade": {
"input": {
"level": 50,
"duration": 2000
},
"href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
"timeRequested": "2017-01-25T15:01:35+00:00",
"status": "pending"
}
},
{
"fade": {
"input": {
"level": 100,
"duration": 2000
},
"href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
"timeRequested": "2017-01-24T11:02:45+00:00",
"timeCompleted": "2017-01-24T11:02:46+00:00",
"status": "completed"
}
}
]
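The resource lifecycle shown in the examples above can be sketched as a small in-memory model. This is only an illustration, not the WebThings implementation: the class name ActionQueue, its method names and the use of uuid4 for request ids are assumptions, and the action name wrapper is left out of the payload for brevity.

```python
import uuid


class ActionQueue:
    """In-memory model of one Action resource and its ActionRequest resources."""

    def __init__(self, base_href):
        self.base_href = base_href
        self.requests = {}  # request id -> ActionRequest payload

    def invoke(self, action_input):
        """Model of POST on the Action resource: create a pending ActionRequest."""
        request_id = str(uuid.uuid4())
        request = {
            "input": action_input,
            "href": f"{self.base_href}/{request_id}",
            "status": "pending",
        }
        self.requests[request_id] = request
        return request_id, request

    def query(self, request_id):
        """Model of GET on an ActionRequest resource."""
        return self.requests[request_id]

    def cancel(self, request_id):
        """Model of DELETE on an ActionRequest resource."""
        del self.requests[request_id]

    def list_requests(self):
        """Model of GET on the Action resource: every known ActionRequest."""
        return list(self.requests.values())
```

Each method corresponds to one HTTP verb on one resource, which is why no second affordance is needed for cancellation.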
This approach doesn't require two separate interaction affordances per action.
I would suggest something along these lines for async actions in the Core Profile. There is a proposal in https://github.com/w3c/wot-thing-description/issues/302#issuecomment-802159857 regarding how to represent some of these types of operations (queryaction, updateaction and cancelaction) in a Thing Description, but for the Core Profile this could be simplified via defaults.
The payload of the responses could be simplified from the Web Thing API by removing the object wrapper with the name of the action, since this is not strictly needed. (The reason it's there in the Web Thing API is that there's also a top level Actions resource which provides a queue of actions of all types, which uses the same payload format so needs to distinguish between action names).
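To illustrate the simplification, a minimal sketch of stripping the wrapper might look like the following. The function name unwrap_action_payload is hypothetical; it assumes a payload with exactly one top-level key, as in the Web Thing API examples above.

```python
def unwrap_action_payload(payload):
    """Split a Web Thing API style payload into (action_name, body)."""
    if len(payload) != 1:
        raise ValueError("expected exactly one action name as the top-level key")
    name, body = next(iter(payload.items()))
    return name, body
```

A per-action resource already knows its own name from the URL, so only the body would need to be sent.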
For synchronous actions I assume that there would be no dynamically created ActionRequest resource, so the WoT producer could just respond to the invokeaction POST request with a success/failure status of some kind directly. But what happens if the action hasn't completed by the time the HTTP response comes back? The HTTP request may time out but the action still continues and eventually completes regardless, and the consumer would have no way to know what happened. One option would be to only define asynchronous actions and require that all implementations support that.
So my thought here is that the affordances for cancel, etc. would not have to be in the TD. This is hard anyway for dynamic resources. The original idea of the hypermedia approach (first proposed something like three years ago, and note it is in our charter to better nail it down) was that an "Action Description" would be returned by an action invocation and it would have a set of links in it for (dynamic) interactions that could be done to follow up on an action invocation. At a minimum this would support checking status, requesting cancellation (optional, since cancellation may not be possible) and subscribing to notifications of status changes (also optional, in case the endpoint can't deal with events, although the alternative is polling the status, which is not efficient).
Anyway, the original proposal was to use a special case of a TD as an "Action Description" which would indeed allow a lot of flexibility, but would also be complicated. So my proposal is to keep things simple and just return a JSON object from an action invocation which would have a set of pre-defined entries. To make this concrete, when you invoke an action the "output" object (which, BTW, would be described in the TD's "output" data schema for the action) would look something like
{
"id": <a per-action-invocation unique value>,
"status": <a url to GET a status value, which would be one of a small number of states>,
"cancel": <a url to POST to to cancel an action; optional; if omitted, the action would not be cancellable>,
"notify": <a url to subscribe to notifications of status changes>
}
We would prescriptively define in the profile spec how each of these in turn would work (replacing a TD-like Action Description, basically, with normative specifications). For instance, for "status" we would indicate what values could be returned (one of a small set of strings, for instance) and how the protocol would work ("GET" on HTTP, for instance). Same for Notification. Note that you would be able to see from the TD whether or not an action is cancellable, etc. just by looking at the output data schema.
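As a rough sketch of the last point, a consumer could derive an action's capabilities from the output data schema alone. The function name action_capabilities is hypothetical, and the member names "cancel" and "notify" are taken from the example object above rather than from any published specification.

```python
def action_capabilities(action_affordance):
    """Infer optional capabilities from an action's output data schema."""
    output_schema = action_affordance.get("output", {})
    properties = output_schema.get("properties", {})
    return {
        # No "cancel" member in the schema means the action is not cancellable.
        "cancellable": "cancel" in properties,
        # No "notify" member means the consumer has to poll "status" instead.
        "notifications": "notify" in properties,
    }
```

For example, an affordance whose output schema declares only "id", "status" and "cancel" would be reported as cancellable but without notification support.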
We could write a Thing Model for Actions to define all this if we wanted to get fancy but would not require the Thing to return it.
HOWEVER, in the meeting we all agreed that we should definitely start with the low-hanging fruit here and start by at least defining synchronous actions. Then only once that is done should we look at how to deal with async actions (and that means we need some way to distinguish the two).
We also discussed a number of alternatives to the above, but cluttering the TD with a bunch of extra properties and events for each action does not really seem like a good idea. We also thought that maybe additional "ops" for actions like "notify" and "cancel" might go into the TD spec later, and wanted something consistent with that (possible) evolution of the TD. Taking that approach in the profile spec now though is not feasible.
@benfrancis BTW, I admit to typing up the above before reading all the details of your posts (I only had 5m between meetings). Skimming what you posted it seems we might be close to being on the same page. I will read your posts more carefully and post a followup soon.
Please also check https://github.com/w3c/wot-thing-description/issues/899
From my point of view:
Also regarding the very first comment: https://github.com/w3c/wot-thing-description/issues/890
We discussed a proposal during the vF2F, slides are here: https://github.com/w3c/wot/blob/main/PRESENTATIONS/2021-06-online-f2f/2021-06-30-WoT-F2F-Action%20Semantics.pdf
Below is a sketch of a proposal for how the action operations could work in the Protocol Binding section of the WoT Core Profile.
Note: I could personally live without the updateaction operation, since for many use cases simply sending a second follow-up action request could fulfil the same purpose.
This proposal includes support for both synchronous and asynchronous action status responses. My suggestion is that web things can choose which type of response to send. Consumers MUST accept both types of response to the initial invokeaction request, but support for the other operations could be made optional.
invokeaction
POST /things/lamp/actions/fade HTTP/1.1
Host: mythingserver.com
Content-Type: application/json
Accept: application/json
{
"level": 100,
"duration": 5
}
See https://github.com/w3c/wot-profile/issues/81#issuecomment-880619349 for a proposal of how this could work in the Core Profile.
A web thing can either respond to an action invocation request synchronously with a 200 OK response containing an ActionStatus object, or respond asynchronously with a 201 Created response with the URL of an ActionStatus resource in the Location header.
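A consumer that has to accept both kinds of response could branch on the status code, roughly as sketched below. The function name interpret_invoke_response and the shape of its return value are assumptions made for illustration; only the 200/201 distinction and the Location header come from the proposal.

```python
def interpret_invoke_response(status_code, headers, body=None):
    """Classify the response to an invokeaction request as sync or async."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    if status_code == 200:
        # Synchronous: the body is the ActionStatus object itself.
        return {"mode": "sync", "action_status": body}
    if status_code == 201:
        # Asynchronous: the Location header points at an ActionStatus resource.
        return {"mode": "async", "status_url": normalised["location"]}
    raise ValueError(f"unexpected status code {status_code}")
```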
ActionStatus object
An action status object contains a status, which is one of:
pending
running
completed
failed
HTTP/1.1 200 OK
Content-Type: application/json
{
"input": {
"level": 100,
"duration": 5
},
"status": "completed"
}
If there's an error carrying out the action, the server MUST return an error response (e.g. 400 for invalid parameters or 500 for a failed actuation). E.g.
HTTP/1.1 400 Bad Request
Content-Type: application/json
{
"input": {
"level": 101,
"duration": 5
},
"status": "failed",
"error": {
"type": "https://mythingserver.com/docs/errors/invalid-level",
"title": "Invalid value for level provided",
"invalid-params": [
{
"name": "level",
"reason": "Must be a valid number between 0 and 100"
}
]
}
}
HTTP/1.1 201 Created
Location: /things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655
queryaction
If a web thing responds with a link to an ActionStatus resource, a consumer can poll that resource to get the current state of the action.
GET /things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655 HTTP/1.1
Host: mythingserver.com
Accept: application/json
The web thing responds with an ActionStatus object.
HTTP/1.1 200 OK
Content-Type: application/json
{
"input": {
"level": 100,
"duration": 5
},
"status": "running"
}
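Polling with queryaction could be sketched as below. This is a minimal illustration, not part of the proposal: the function name poll_action, the injected fetch_status callable (standing in for the GET request), the polling interval and the attempt limit are all assumptions.

```python
import time

# Status values after which the action will not change state again.
TERMINAL_STATES = {"completed", "failed"}


def poll_action(fetch_status, interval_s=1.0, max_attempts=30, sleep=time.sleep):
    """Call fetch_status() until the action reaches a terminal state."""
    for attempt in range(max_attempts):
        action_status = fetch_status()
        if action_status["status"] in TERMINAL_STATES:
            return action_status
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError("action did not reach a terminal state")
```

Injecting fetch_status and sleep keeps the loop independent of any particular HTTP client and makes it easy to exercise without a network.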
updateaction
In order to update a pending or running action, a consumer can send a PUT request to its ActionStatus resource URL with new input data.
PUT /things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655 HTTP/1.1
Host: mythingserver.com
Content-Type: application/json
Accept: application/json
{
"level": 50,
"duration": 5
}
If the action request is successfully updated, the web thing responds with an updated ActionStatus resource.
HTTP/1.1 200 OK
Content-Type: application/json
{
"input": {
"level": 50,
"duration": 5
},
"status": "running"
}
Otherwise it may respond with an error code (e.g. if the action request can not be updated or has already completed).
cancelaction
In order to cancel an asynchronous action a consumer can send a DELETE request to its ActionStatus resource URL.
DELETE /things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655 HTTP/1.1
Host: mythingserver.com
If the action is successfully cancelled then the web thing responds with a 204 response.
HTTP/1.1 204 No Content
otherwise it may respond with an error (e.g. if the action can't be cancelled or has already completed).
Note that ActionStatus resources are not expected to persist forever so may be stored in volatile memory by a web thing and/or cleaned up on a regular interval.
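One possible cleanup policy, sketched under assumptions that are not in the proposal: each finished resource carries a hypothetical finished_at timestamp (in seconds), and resources are dropped once they have been finished for longer than max_age_s.

```python
def prune_finished(resources, now, max_age_s):
    """Return the resources that are unfinished or only recently finished."""
    kept = {}
    for url, resource in resources.items():
        finished = resource["status"] in ("completed", "failed")
        # "finished_at" is a hypothetical bookkeeping field, not part of ActionStatus.
        if finished and now - resource["finished_at"] > max_age_s:
            continue
        kept[url] = resource
    return kept
```

A consumer polling a pruned resource would then presumably receive a 404, so the retention period needs to be long enough for realistic polling intervals.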
Discussions around how to describe these types of action operations canonically in a Thing Description are continuing in https://github.com/w3c/wot-thing-description/issues/302. For the purposes of the Core Profile we don't necessarily have to wait for those features to be added to the Thing Description, we could just expect Thing Descriptions to provide a single URL for an action affordance and apply the above set of operations as defaults. If and when the Thing Description specification catches up, we can provide an informative example of a canonical Thing Description describing these operations.
Edit: One thing that's missing from this proposal which we could add (and is already supported in WebThings), is an additional operation to enumerate the list of action requests in an action queue using a GET request on the action URL.
I like @benfrancis's proposal. There are two points which I would like to discuss:
- Shall we echo the input parameters in the response message? As @mmccool mentioned in today's call, what about the situation where the input parameters are big? E.g., there is a convertPhoto action where you can submit JPEG files. Does it make sense to have the original JPEGs again in the response?
- Do we need a status element there for sync actions? Would HTTP status codes not be sufficient?
@sebastiankb wrote:
- Shall we echo the input parameters in the response message? As @mmccool mentioned in today's call, how about the situation having big input parameters? E.g., there is a convertPhoto action where you can submit JPEG files. Does it make sense to have the original JPEGs again in the response?
I agree this could be inefficient for large inputs, as with the writeproperty operation.
One argument for including the input data in the body of the dynamically created resource is that it can then neatly be updated with a PUT request in the updateaction operation. Actually that makes me realise a couple of things:
- The updateaction request should probably be wrapped in an object containing an "input" map, since we don't want to replace the whole resource with just the new input data
- The updateaction request should probably be a PATCH rather than a PUT since we are only updating the input member, not output, status or error.
E.g.
PATCH /things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655 HTTP/1.1
Host: mythingserver.com
Content-Type: application/json
Accept: application/json
{
"input": {
"level": 50,
"duration": 5
}
}
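On the producer side, the difference between PUT and PATCH here amounts to replacing only the input member and leaving the rest of the resource alone. A minimal sketch, assuming a hypothetical apply_update helper and an ActionStatus held as a plain dict:

```python
def apply_update(action_status, patch):
    """Apply an updateaction PATCH body that may only carry an "input" member."""
    if action_status["status"] not in ("pending", "running"):
        raise ValueError("only pending or running actions can be updated")
    if set(patch) != {"input"}:
        raise ValueError('the patch must contain exactly one member, "input"')
    # Copy the resource so that status, output and error are carried over untouched.
    updated = dict(action_status)
    updated["input"] = patch["input"]
    return updated
```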
If we decide we don't need the updateaction operation then that's less of an issue and we can just omit input altogether, but we may just be storing up problems for the future.
Is there some other way we can mitigate the issue of large inputs? @mmccool suggested just including a hash for example. Could we truncate large values? How do other hypermedia systems and APIs deal with that issue?
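The hash idea could look roughly like this. Everything specific in the sketch is an assumption made for illustration: the function name summarise_input, the 1024 byte threshold, the inputDigest member name and the choice of SHA-256 over a canonical JSON encoding. Binary inputs such as JPEG files would need a different encoding step.

```python
import hashlib
import json


def summarise_input(action_input, max_bytes=1024):
    """Echo small inputs verbatim; replace large ones with a SHA-256 digest."""
    # Sorting keys gives a stable encoding, so equal inputs produce equal digests.
    encoded = json.dumps(action_input, sort_keys=True).encode("utf-8")
    if len(encoded) <= max_bytes:
        return {"input": action_input}
    return {
        "inputDigest": {  # hypothetical member name, not part of any proposal
            "algorithm": "sha-256",
            "value": hashlib.sha256(encoded).hexdigest(),
            "size": len(encoded),
        }
    }
```

A consumer that kept its original input could recompute the digest to confirm which request a status object refers to, without the input being sent back.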
- Do we need a status element there for sync actions? Would HTTP status codes not be sufficient?
I wondered this too. I concluded that given there's no way to guarantee that all actions can be completed within an HTTP timeout period, it's still useful for the consumer to know if the invoked action is still pending or running when the HTTP response comes back, even if a dynamic resource is not created to track its status.
Is there some other way we can mitigate the issue of large inputs?
How about introducing a sub-resource where the input parameters of the invoked action can be queried. E.g.,
GET /things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655/input HTTP/1.1
Host: mythingserver.com
Accept: application/json
The response can look like:
HTTP/1.1 200 OK
Content-Type: application/json
{
"level": 50,
"duration": 5
}
The advantage is that the client can decide whether to check the input parameters, and the usual queryaction response will be more compact.
I wondered this too. I concluded that given there's no way to guarantee that all actions can be completed within an HTTP timeout period, it's still useful for the consumer to know if the invoked action is still pending or running when the HTTP response comes back, even if a dynamic resource is not created to track its status.
I had a quick look at XML-RPC. If everything is OK, the response message simply provides the return value without a status code. If something went wrong, the response message is different and carries a detailed error message. We could also do this by using the additionalResponse feature in the TD. What do you think?
Another alternative would be to make input optional. Consumers can even know this ahead of time by checking the output DataSchema. Considering that updateaction is a less common use case, I think it makes sense: in the end, a consumer that wants to update the resource would just do a check either before invoking the action (thanks to the DataSchema) or after (checking output.input !== undefined).
Do you see any downsides?
I suggest we implement the decision from the architecture call and create a PR with the sections of the current proposal that we agreed upon in the call, i.e. to include invoke, query and cancel into the draft.
@benfrancis - We can extend the branch/PR of https://github.com/w3c/wot-profile/pull/88 and evolve it, or do you prefer to create a separate PR?
This discussion about input parameters and whether it is optional in the response is very useful and should be continued in the next architecture/profile call. We can then incrementally refine and clarify these questions.
A JSON schema would be very helpful to have a proposal that we can agree on and can include into the spec. @relu91 do you think you could help?
Yes, sure! So what I had in mind was something like this:
{
// A TD action description
"newAction" :{
"title": "newAction",
"description": "",
"input": {
"type": "object",
"properties": {
"foo": {
"type": "string"
}
}
},
"output": {
// according to what is described above an ActionStatus can be described with this schema
"type": "object",
"properties": {
"input" : {
"type": "object",
"properties": {
"foo": {"type":"string"}
}
},
"output": {
"type": "string" // The actual output of the action. it can be anything
},
"status": {
"type": "string",
"enum": [
"pending",
"running",
"completed",
"failed"
]
},
"error": {
"type": "object",
"description": "An error object according to RFC 7807",
"properties": {
"type": { "type": "string"},
"title": { "type": "string"},
"status": { "type": "integer"},
"detail": { "type": "string"},
"instance": { "type": "string"}
}
}
},
"required": [ "status", "input" ] // here I know that the response will have the input field
},
"forms": []
}
}
As you can see, using the required array I can state that the input will always be returned in the response to the invokeaction operation. We can also express mixed situations where the input field might be there if needed by removing it from the required array (e.g. "required": [ "status"]). Or we can definitively say that it won't be there by removing it from the properties object:
{
"output": {
// according to what is described above an ActionStatus can be described with this schema
"type": "object",
"properties": {
// no more input defined
"output": {
"type": "string" // The actual output of the action. it can be anything
},
"status": {
"type": "string",
"enum": [
"pending",
"running",
"completed",
"failed"
]
},
"error": {
"type": "object",
"description": "An error object according to RFC 7807",
"properties": {
"type": { "type": "string"},
"title": { "type": "string"},
"status": { "type": "integer"},
"detail": { "type": "string"},
"instance": { "type": "string"}
}
}
},
"required": [ "status" ]
}
}
Note: the JSON schema might not be accurate to the spec defined by @benfrancis; it is just meant to explain my previous comment. We can discuss it further during the call and maybe refine it inside a PR.
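The check a consumer would do before invoking the action can be sketched as follows. The function name input_echo_policy and its three return values are made up for this illustration; the logic simply reads the properties and required members of the output schema as described above.

```python
def input_echo_policy(output_schema):
    """Tell whether the action's response will echo the input back."""
    if "input" not in output_schema.get("properties", {}):
        return "never"  # input is not declared in the output schema at all
    if "input" in output_schema.get("required", []):
        return "always"  # input is declared and required
    return "optional"  # input is declared, but the thing may omit it
```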
Please see #89 for a first draft of specification text to describe invokeaction, queryaction and cancelaction.
@relu91 wrote:
So what I had in mind was something like this ...
It's probably worth noting at this stage that I'd ideally like to get to a point where a Web Thing conformant with the Core Profile could provide a very simple Thing Description like...
{
"@context": "https://www.w3.org/2019/wot/td/v1",
"id": "urn:ex:thing",
"title": "My lamp",
"profile": "https://www.w3.org/2021/wot/profile/core",
"security": { ... },
"actions": {
"fade": {
"input": {
"type": "number",
"description": "duration in ms"
},
"forms": [ { "href": "/fade" } ]
}
}
}
...then a conformant Consumer would see that the Web Thing supports the Core Profile and by applying all the defaults defined in the profile specification would arrive at a much more comprehensive canonical Thing Description much like the one you have provided above, or the one in https://github.com/w3c/wot-thing-description/issues/302#issuecomment-802159857 with the full set of operations defined. This would mean that Web Things which support the Core Profile don't have to worry about all the complexities of dealing with multiple forms and declarative protocol bindings for dynamic resources and can just provide a single HTTP endpoint for an action affordance which is then expanded out into the full set of operations for free. I see this as an extension of the current set of defaults in the Thing Description specification.
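As a sketch of what applying defaults means on the consumer side, the function below fills in the defaults that already exist for action forms today (op invokeaction, content type application/json, HTTP method POST). The function name and the table name are hypothetical, and the additional queryaction and cancelaction defaults that a profile would add are deliberately left out, since how to describe them is still under discussion.

```python
import copy

# Existing defaults for action forms; a profile could extend this table.
ACTION_FORM_DEFAULTS = {
    "op": "invokeaction",
    "contentType": "application/json",
    "htv:methodName": "POST",
}


def expand_action_forms(td):
    """Return a copy of the TD with default values made explicit in action forms."""
    expanded = copy.deepcopy(td)
    for action in expanded.get("actions", {}).values():
        for form in action.get("forms", []):
            for key, value in ACTION_FORM_DEFAULTS.items():
                # setdefault keeps any value the TD author stated explicitly.
                form.setdefault(key, value)
    return expanded
```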
I completely agree - simplicity is one of the primary goals of the profile spec.
Regarding the simplicity argument of @mlagally , this does not make an implementation simpler, only its non canonical TD
@benfrancis wrote:
this does not make an implementation simpler, only its non canonical TD
Currently the Thing Description specification puts no constraints on the protocols that Web Things may use or the complexity of their protocol bindings, which makes it effectively impossible to implement a Consumer that can support any Web Thing.
If we accepted that a Consumer which implements support for the Core Profile does not have to support Web Things which don't conform with the profile, then actually it could drastically simplify implementations. This is because although it may be possible to expand a simplified TD into a more complex canonical TD with declarative protocol bindings describing every little detail, Consumers would not necessarily need to support other declarative protocol bindings which don't conform with the profile.
e.g. a Consumer conformant with the Core Profile may support a queryaction operation which follows the protocol binding and data schema defined in the Core Profile, but not support a queryaction operation which uses some other approach using a declarative protocol binding in a Form.
There are two fundamentally different approaches to actions:
- synchronous actions: These are the baseline and need to be supported in any case. We have to define the set of error conditions and a way to communicate / signal a timeout. This should not be too hard.
- asynchronous actions: It is easy to create a can of worms with race conditions if we don't get it right and mess up the design. This can get arbitrarily complex, if we think of non-atomic transactions, rollbacks, conflicting actions etc.
For these I suggest the following approach:
An action can return a "status" object, which can be used to query (i.e. poll) whether the action has been completed and returns the result. The caller has only "read-only" access on this object - there's no way to cancel an action using this object. If an action should be cancellable, a separate "cancel_" can be defined by the TD, which does the right thing.