Closed benfrancis closed 2 years ago
To provide some additional context for people who don't want to read the Web Thing API specification...
To request an action, the Web Thing API uses a POST request, e.g.
POST https://mythingserver.com/things/lamp/actions/fade
Accept: application/json
{
"fade": {
"input": {
"level": 50,
"duration": 2000
}
}
}
Response:
201 Created
{
"fade": {
"input": {
"level": 50,
"duration": 2000
},
"href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
"status": "pending"
}
}
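For anyone who wants to see the client side of this exchange, here is a minimal Python sketch that builds the request body shown above and reads the href and status back out of the 201 response. The helper names (build_action_request, parse_action_response) are hypothetical and not part of the Web Thing API, and no network call is made.

```python
import json


def build_action_request(name, **inputs):
    """Serialize a request body in the {"<action>": {"input": {...}}} shape."""
    return json.dumps({name: {"input": inputs}})


def parse_action_response(body):
    """Return (action name, href, status) from a single-action response body."""
    payload = json.loads(body)
    # The response object has exactly one key: the action name.
    name, request = next(iter(payload.items()))
    return name, request["href"], request["status"]


body = build_action_request("fade", level=50, duration=2000)

# Sample response copied from the example above.
response = json.dumps({
    "fade": {
        "input": {"level": 50, "duration": 2000},
        "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
        "status": "pending",
    }
})
name, href, status = parse_action_response(response)
print(name, href, status)
```

The returned href is what a client would then use for the GET and DELETE requests described next.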
You can get a list of current action requests.
Request:
GET /things/lamp/actions/fade
Accept: application/json
Response:
200 OK
[
{
"fade": {
"input": {
"level": 50,
"duration": 2000
},
"href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
"timeRequested": "2017-01-25T15:01:35+00:00",
"status": "pending"
}
},
{
"fade": {
"input": {
"level": 100,
"duration": 2000
},
"href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
"timeRequested": "2017-01-24T11:02:45+00:00",
"timeCompleted": "2017-01-24T11:02:46+00:00",
"status": "completed"
}
}
]
You can get the status of an action request.
Request:
GET /things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655
Accept: application/json
Response:
200 OK
{
"fade": {
"input": {
"level": 50,
"duration": 2000
},
"href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
"timeRequested": "2017-01-25T15:01:35+00:00",
"status": "pending"
}
}
You can cancel an action request.
Request:
DELETE /things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655
Response:
204 No Content
You can also get a list of action requests of all types with a GET request to an Actions resource (whose URL is provided by the top level links member).
Request:
GET /things/lamp/actions
Accept: application/json
Response:
200 OK
[
{
"fade": {
"input": {
"level": 50,
"duration": 2000
},
"href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
"timeRequested": "2017-01-25T15:01:35+00:00",
"status": "pending"
}
},
{
"reboot": {
"href": "/things/lamp/actions/reboot/124e4568-f89b-22d3-a356-427656",
"timeRequested": "2017-01-24T13:13:33+00:00",
"timeCompleted": "2017-01-24T13:15:01+00:00",
"status": "completed"
}
}
]
And for completeness you can also request an action on the top level Actions resource if you want to.
Request:
POST https://mythingserver.com/things/lamp/actions/
Accept: application/json
{
"fade": {
"input": {
"level": 50,
"duration": 2000
}
}
}
Response:
201 Created
{
"fade": {
"input": {
"level": 50,
"duration": 2000
},
"href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
"status": "pending"
}
}
How can all of this be expressed in a Thing Description, including the payload formats and possible error responses?
By contrast, in Arena, HTTPS POST takes the action input as the body of the request, and returns the action output as the body of the response. This follows the basic semantics for POST. If a developer wants a cancellable process that is initiated by an action, that can be layered on top of the core patterns of actions and events. You could return a process ID for the action that initiates the process, and provide another action to cancel an active process using the process ID. You can likewise define a progress event that passes the process ID together with status information. My take is thus that the core semantics for the Web of Things should be really simple and more complex models can be layered on top. This principle also applies to APIs for event logs and for querying a history of property updates in the case of telemetry streams.
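To make the layering idea concrete, here is a rough in-memory Python sketch of the process ID pattern: one action starts a process and returns its ID, a second action cancels by ID, and a progress event carries the ID plus a status. All names here (ProcessThing, start_process, cancel_process, the event fields) are hypothetical illustrations and are not taken from Arena.

```python
import itertools


class ProcessThing:
    """Hypothetical thing with start/cancel actions and a progress event."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._active = {}
        self._listeners = []

    def on_progress(self, listener):
        """Subscribe to the progress event."""
        self._listeners.append(listener)

    def _emit(self, process_id, status):
        for listener in self._listeners:
            listener({"processID": process_id, "status": status})

    def start_process(self, level, duration):
        """Action: returns immediately, with the process ID as its output."""
        process_id = next(self._ids)
        self._active[process_id] = {"level": level, "duration": duration}
        self._emit(process_id, "started")
        return {"processID": process_id}

    def cancel_process(self, process_id):
        """Action: cancels an active process identified by its process ID."""
        cancelled = self._active.pop(process_id, None) is not None
        if cancelled:
            self._emit(process_id, "cancelled")
        return {"cancelled": cancelled}


events = []
thing = ProcessThing()
thing.on_progress(events.append)
started = thing.start_process(level=50, duration=2000)
first_cancel = thing.cancel_process(started["processID"])
second_cancel = thing.cancel_process(started["processID"])
```

Note that nothing in this sketch tells a generic client that the processID output of one action is the input of the other; that link exists only in the naming.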
If the argument for forms vs. links is for backwards compatibility with existing IoT APIs using declarative protocol bindings, then it should be possible to describe the API described above using a Thing Description alone. Describing Mozilla's Web Thing API should be a particularly easy example as it was developed in parallel with the Thing Description specification and its data model is already aligned.
The solution you describe requires changing the API itself, and doesn't explain how the client would know from the Thing Description that the ID returned by the action request can be used to cancel that request by using a different action. Can you provide an example Thing Description which provides the client with all of that information in a declarative protocol binding?
By contrast, in Arena, HTTPS POST takes the action input as the body of the request, and returns the action output
What happens if the action is a long running process which doesn't complete by the time the HTTP request times out? (This was the reason we created the action queue API, and is a difference between requesting an action and simply setting a property)
Developers are responsible for documenting the purpose of properties, actions and events. Picking appropriate names would help, e.g. an action called cancelProcess with an input named processID.
The timeout for HTTP requests is often client dependent. We could standardise how to express an indication of the maximum expected duration of a long lived process as part of the metadata for an action. Alternatively, developers could use the processID design pattern described above.
Developers are responsible for documenting the purpose of properties, actions and events. Picking appropriate names would help, e.g. an action called cancelProcess with an input named processID.
This assumes the involvement of a human to interpret those names and write custom code for that specific web thing, which doesn't allow for ad-hoc interoperability.
The timeout for HTTP requests is often client dependent. We could standardise how to express an indication of the maximum expected duration of a long lived process as part of the metadata for an action. Alternatively, developers could use the processID design pattern described above.
The duration of an action may be user defined e.g. an action to fade a light from 0 to 100% brightness over the course of 1 hour.
Again, your proposed solution requires changing the API. But how would you get the current status of an action in this model? e.g. to find out if the action succeeded or failed.
I agree that when it comes to describing existing services using a standardised machine interpretable format, this gets increasingly complicated, and I question whether this complexity is justified. In the longer time frame we can and should encourage convergence in protocols and their usage across the Internet and this would render declarative protocol bindings a historical legacy that is no longer needed.
An alternative is to provide a platform ID that web of things clients can use to identify what protocols and usage patterns apply to a given thing. This avoids the need for complex representations in every TD.
The point I'm making here is that it's not just difficult to be backwards compatible with existing APIs, it may actually be impossible without significantly more complexity than is currently allowed for in the current specification.
Is anyone willing to make a stab at describing Mozilla's Web Thing REST API in a Thing Description alone? The issue described here is just one of several problems with trying to do that.
(I think we've already agreed that the Web Thing WebSocket API can not be described in a Thing Description and would require a separate WebSocket subprotocol specification.)
The duration of an action may be user defined e.g. an action to fade a light from 0 to 100% brightness over the course of 1 hour.
Yes, but the developer should have some understanding of what the maximum is likely to be before the process can be considered to have failed. That expectation could be given as a metadata property.
Again, your proposed solution requires changing the API.
Yes, but see my previous post that questions the long term commercial need for a complex declarative protocol binding standard.
But how would you get the current status of an action in this model? e.g. to find out if the action succeeded or failed.
You would listen to the status events. In addition, I would design the server and the application to be robust against a loss of network connectivity, the reboot of the client or server, etc.
We have been discussing this for a long time -- since IG-only work -- and also came to a good conclusion about this. The ideal way to do this is using hypermedia, where a running Action is represented as a Web resource that is dynamically created upon invocation. This running Action can itself have Properties and Actions again. Maybe you remember the discussions we had around a potential application/wot+json media type for this.
We do have all required extension points in place for this: The output of an Action can be such an application/wot+json representation or potentially other hypermedia formats such as CoRAL.
To date, support for this is very limited in existing systems; hypermedia concepts are almost nil. Thus, we decided to focus on description of deployed systems for now and tackle this issue in the next charter period or in the IG first. The closest we have are custom "ticket responses" that each platform does in a different style. This must be solved by semantically describing the response content and leave it to the application. @draggett described this approach in his comment further up.
We have been discussing this for a long time -- since IG-only work -- and also came to a good conclusion about this. The ideal way to do this is using hypermedia, where a running Action is represented as a Web resource that is dynamically created upon invocation. This running Action can itself have Properties and Actions again. Maybe you remember the discussions we had around a potential application/wot+json media type for this.
My very old proposal was to support things as first class types. However, I agree that this is something we can leave to future extensions given the subtleties involved.
support things as first class types
Simply set the content type to application/td+json, done...
That works for limited cases, but isn't a general solution. Object oriented programming languages support objects as first class types, so having things as first class types is something that will be expected for the web of things. At the protocol level we can pass things using the URI for their thing description, or as you suggest, by passing the JSON-LD for the thing description, both are a form of reference to a thing. Interestingly, by passing the JSON-LD explicitly, this corresponds to giving the thing a blank node for its RDF identifier.
If the TD for a thing includes declarations of initial values, the platform should carry out the initialisation. If this involves a thing, the platform needs to retrieve the thing's TD, if not supplied in place, and initialise that thing. This can get a little complicated when you need to deal with forward references, and when the dependencies between things form cycles. I showed how to handle that over two years ago, proving that it is a tractable problem, just as it is for object oriented programming languages.
Here is a proposal: in a future version of the TD model, we could at least standardize new operation types to cancel, query (and update?) an invoked action: something like cancelaction, queryaction, updateaction. Each operation type would not necessarily be used in a TD directly but it could be used as part of a Link header or some hypermedia-aware response payload to drive WoT consumers.
Having said that, it is possible already with the current TD spec to specify operations on dynamically created resources. For that, you can define a generic action manage that declares forms on these resources (what you call ActionRequests in Mozilla's WebThings, @benfrancis). Here is a try:
{
"@context": [
"https://www.w3.org/2019/wot/td/v1",
{
"ActionRequest": "http://example.org/ActionRequest",
"cancelaction": "http://example.org/cancelActionOperationType",
"queryaction": "http://example.org/queryActionOperationType"
}
],
"id": "urn:example:mylamp",
"actions": {
"fade": {
"input": {
"type": "object",
"properties": {
"level": { "type": "number" },
"duration": { "type": "duration" }
}
},
"output": {
"type": "object",
"properties": {
"href": {
"@type": "ActionRequest",
"type": "string"
},
"status": {
"enum": [ "pending", "completed" ]
}
}
},
"forms": [
{
"href": "https://mythingserver.com/things/lamp/actions/fade",
"op": "invokeaction"
}
]
},
"manage": {
"uriVariables": {
"actionRequest": {
"@type": "ActionRequest",
"type": "string"
}
},
"forms": [
{
"href": "{actionRequest}",
"htv:methodName": "DELETE",
"op": "cancelaction"
},
{
"href": "{actionRequest}",
"htv:methodName": "GET",
"op": "queryaction"
}
]
}
}
}
In this example, you can see that I use cancelaction and queryaction but since they do not exist yet in the TD model, I declared them in the JSON-LD context. Same thing for the class ActionRequest, which indicates the output of action invocation is the same as the actionRequest URI variable.
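As a sketch of how a consumer might use that shared class, the Python below matches the @type of an output property against the @type of a URI variable on the manage action and fills in the form href. The function name resolve_form is hypothetical, the TD is trimmed to the parts that matter, and the sample output is assumed to be flat (href and status at the top level) as the output schema in the TD describes.

```python
def resolve_form(td, action_name, output, op):
    """Fill the URI variables of the `manage` form for `op` from an action output."""
    manage = td["actions"]["manage"]
    out_props = td["actions"][action_name]["output"]["properties"]
    # Semantic class -> value found in the output under a property of that class.
    value_by_type = {
        schema["@type"]: output[name]
        for name, schema in out_props.items()
        if "@type" in schema and name in output
    }
    form = next(f for f in manage["forms"] if f["op"] == op)
    href = form["href"]
    for var, schema in manage["uriVariables"].items():
        if schema.get("@type") in value_by_type:
            href = href.replace("{" + var + "}", value_by_type[schema["@type"]])
    return form["htv:methodName"], href


# Trimmed copy of the TD above.
td = {
    "actions": {
        "fade": {
            "output": {
                "type": "object",
                "properties": {
                    "href": {"@type": "ActionRequest", "type": "string"},
                    "status": {"enum": ["pending", "completed"]},
                },
            }
        },
        "manage": {
            "uriVariables": {
                "actionRequest": {"@type": "ActionRequest", "type": "string"}
            },
            "forms": [
                {"href": "{actionRequest}", "htv:methodName": "DELETE", "op": "cancelaction"},
                {"href": "{actionRequest}", "htv:methodName": "GET", "op": "queryaction"},
            ],
        },
    }
}

# Assumed sample output of invoking fade.
output = {
    "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
    "status": "pending",
}
cancel = resolve_form(td, "fade", output, "cancelaction")
query = resolve_form(td, "fade", output, "queryaction")
```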
However, I would also expect (or wish) that future WoT Things have a more hypermedia-driven interface to consumers. In that case, the cancelaction and queryaction operations could be added to each array item in the response of GET /things/lamp/actions/fade.
As @mkovatsc and @draggett said, the same structure as in the TD model could be used for the JSON response. I would suggest one conceptual variant, though: to me, reserving the class Thing for physical objects is important. So, it means that if a new "TD" is returned after invoking an action, it should be interpreted as an extension of the original TD, as if new actions on the same Thing were made available. It makes a big difference when dealing with the semantics of TD documents but the JSON structure would not be significantly impacted.
I support the idea at the previous comment by @vcharpenay.
In order to bring some more examples to the discussion, we have a PanTilt module (think of it as the non-camera part of a CCTV camera) where a stopMovement action can stop any ongoing movements. The source code and TD can be found here.
In addition to actions that take a long time, there can be actions that are started via a request but the physical action never stops. From the previous example, it would be the moveContinuously and panContinuously actions, where the invokeaction request starts the movement and the movement doesn't stop until it hits a limit or a stopMovement action is invoked. A more familiar example would be a conveyor belt that is started with an action and stopped with another action.
A hypermedia based approach was the first one that came to mind but I was not sure how one would describe it, since execution of a form with an op invokeaction would need to return some information that is used by a form with another op value. I think the comment above goes in the right direction by taking this into account. Just that I think it would be better to not introduce another action and maybe pack the manage action into the fade action.
As discussed in the TD call on 7.2. we are looking at different examples.
Oracle's IoT Cloud service has a hypermedia-based action model that supports synchronous and asynchronous operations.
The response payload contains a key "complete", when the operation is already finished, otherwise the url endpoint contains a link to asynchronously query the status.
{
"complete":false,
"id":"72a4239f1644-ccf",
"endpointId":"6248475d6e28-3013",
"url":"https://iotserver/iot/api/version/resource/path",
"method":"Request method",
"status":"Request statusOne of [RECEIVED, DISPATCHED, COMPLETED, EXPIRED, FAILED, UNKNOWN].",
"requestTime":"2016-07-22T10:44:57.746Z",
"responseTime":"Time when the response is received by server",
"responseEventTime":"2016-07-22T10:44:57.746Z", "responseStatusCode":"Request status code from the response message (One of [HTTP 200: OK, HTTP 201: Created, HTTP 202: Accepted, HTTP 203: Non Authoritative Information, HTTP 204: No Content, HTTP 400: Bad Request, HTTP 401: Unauthorized, HTTP 402: Payment Required, HTTP 403: Forbidden, HTTP 404: Not Found, HTTP 405: Method Not Allowed, HTTP 406: Not Acceptable, HTTP 408: Request Timeout, HTTP 409: Conflict, HTTP 500: Internal Server Error, HTTP 502: Bad Gateway, HTTP 503: Service Unavailable].)",
"response":"Original response message payload JSON document"
}
Here's the full API documentation for Invoke action: https://docs.oracle.com/en/cloud/paas/iot-cloud/iotrq/op-iot-api-v2-apps-app-id-deviceapps-devapp-id-devicemodels-devicemodel-id-actions-action-name-post.html
Starting point for the API documentation: https://docs.oracle.com/en/cloud/paas/iot-cloud/iotrq/toc.htm
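The synchronous/asynchronous switch on the "complete" key can be sketched in a few lines of Python. The function name next_step and the tuple it returns are hypothetical; only the "complete", "url" and "response" keys come from the payload shown above, and the sample values are made up for illustration.

```python
def next_step(response):
    """Decide what a client does after invoking an action."""
    if response.get("complete"):
        # Operation already finished: the original response payload is inline.
        return ("done", response.get("response"))
    # Otherwise the "url" key links to the resource to poll for status.
    return ("poll", response["url"])


pending = {
    "complete": False,
    "id": "72a4239f1644-ccf",
    "url": "https://iotserver/iot/api/version/resource/path",
}
finished = {"complete": True, "response": "{}"}

print(next_step(pending))
print(next_step(finished))
```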
I would like to point out that Thing-Consumer protocol should always look forward, but not backward.
This means it had better not depend on a transaction model where you can "cancel" a request while it is in action.
A Consumer should be able to make an independent "cancel" request to a Thing, and the Thing makes a best effort to fulfill the request. The fulfillment may be to just stop the action, or the Thing may wait for the action to finish (if it cannot be stopped immediately) and revert to the original state if possible.
Here, note that a Thing may be able to process the cancel request even after the original request was complete. This is why I said the Thing-Consumer protocol should always look forward. Things can decide how best to process the "cancel" request because it is just one of the subsequent action requests.
I like the proposal. There's one aspect to consider: Is the cancel operation synchronous or asynchronous? If it is asynchronous, would it be possible to abort a long-lasting cancel operation that does not complete?
For an async operation the cancellation should also be async. Usually cancelling cannot be guaranteed, so it is always best effort. Therefore Things need to be designed
Does it make sense to introduce a hypermedia-specific navigation term that gives an indication of where the resource is defined in the payload message that can be used to query the status or cancel an action? E.g., in the case of Oracle it would be
{
"forms": [
{
"href": "...",
"op": "invokeaction",
"hypermedia" : "url" //--> points to the JSON term of the response payload message
}
]
}
for Mozilla it would look like
{
"forms": [
{
"href": "...",
"op": "invokeaction",
"hypermedia" : "href" //--> points to the JSON term of the response payload message
}
]
}
Btw: I have checked the MDSP API, and there seems to be no use of the hypermedia approach yet.
@sebastiankb I don't understand how this would work. Could you provide a more complete example for "url" and "href".
@benfrancis The idea is to provide a hint in the TD (in the forms container of actions) where the client can identify the name-urlValue pair in the response message, which can be used asynchronously to query the action's status.
In my example above, the hint is given by the term hypermedia (maybe not the perfect name for it). The value 'url' (Oracle) or 'href' (Mozilla IoT) indicates the JSON name used in the corresponding response message. E.g., in the case of Oracle the client would identify the entry
"url":"https://iotserver/iot/api/version/resource/path"
which can then be used to query the status.
I hope this is clearer now.
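A client-side reading of this hint might look like the Python sketch below. The function find_term is hypothetical. One assumption to flag: in the Mozilla response the href sits one level down, under the action name, so the sketch searches the payload depth-first for the first matching key; the proposal itself does not say how nested names should be resolved.

```python
def find_term(payload, term):
    """Depth-first search for the first value stored under `term`."""
    if isinstance(payload, dict):
        if term in payload:
            return payload[term]
        children = list(payload.values())
    elif isinstance(payload, list):
        children = payload
    else:
        return None
    for child in children:
        found = find_term(child, term)
        if found is not None:
            return found
    return None


oracle_form = {"href": "...", "op": "invokeaction", "hypermedia": "url"}
oracle_response = {
    "complete": False,
    "url": "https://iotserver/iot/api/version/resource/path",
}

mozilla_form = {"href": "...", "op": "invokeaction", "hypermedia": "href"}
mozilla_response = {
    "fade": {
        "input": {"level": 50, "duration": 2000},
        "href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
        "status": "pending",
    }
}

oracle_url = find_term(oracle_response, oracle_form["hypermedia"])
mozilla_url = find_term(mozilla_response, mozilla_form["hypermedia"])
```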
@sebastiankb Oh I see, yes that is clearer now thank you.
That tells the client how to find out the URL of an action request resource, but how would the client know what format to expect for that resource? I've provided an example flow below.
The client requests an action.
POST https://mythingserver.com/things/lamp/actions/fade
Accept: application/json
{
"fade": {
"input": {
"level": 50,
"duration": 2000
}
}
}
The server responds with the URL of the created action request resource.
201 Created
{
"fade": {
"input": {
"level": 50,
"duration": 2000
},
"href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
"status": "pending"
}
}
So far so good. The client knows where to find the newly created action request resource.
The client requests the status of the action request.
GET /things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655
Accept: application/json
The server responds with its current status.
200 OK
{
"fade": {
"input": {
"level": 50,
"duration": 2000
},
"href": "/things/lamp/actions/fade/123e4567-e89b-12d3-a456-426655",
"timeRequested": "2017-01-25T15:01:35+00:00",
"status": "pending"
}
}
How does the client know what format to expect from this response to determine the status of the action request, or how to modify or cancel the action request?
I guess there are different aspects to look at here:
At the moment, we are able to describe only the payload of the response from invoking an action by using the output term. We can thus imagine adding new terms on the action affordance level like querystatus, cancellation, modification which all have input and output fields.
output properties. For example:
{
"output":{
"type":"object",
"properties":{
"key1":{"type":"number"},
"url":{
"type":"string",
"@type":"hypermedia" //or some other semantic annotation
}
}
}
}
In @sebastiankb's previous examples, "url" or "href" SHOULD be described in the DataSchema of the output of the action affordance anyways, so we can add a semantic annotation and not need to change the forms. Doing this with the forms would have the following disadvantages:
done
@benfrancis I sent you an e-mail to the address I found on your personal website.
as agreed in today's TD call we would like to test an approach in the next PlugFest. The approach will contain:
@vcharpenay will provide a proposal for it based on past discussions.
Long and windy road ahead!
After talking with @sebastiankb, @danielpeintner and @wiresio , I summarize our findings. If anything is missing, feel free to edit this comment or add comments below.
op). Different management/hypermedia related operations should have an op value and form.
We are not sure if there can be other requirements that need to be considered.
For the TD example below, think of a robot arm that is in a position of 50 degrees. An action can be invoked to rotate it for a given amount of time and speed. The Consumer should be able to invoke this action, query its status (still rotating or finished), change the speed of rotation, cancel the rotation and once the action finishes, the robot should tell its final position (output of the action).
{
...
"actions": {
"rotate": {
"input": {
"type": "object",
"properties": {
"url": {
"@type": "ActionQueryInput",
"type": "string"
},
"duration": {
"type": "number",
"@type": "ActionInvokeInput"
},
"speed": {
"type": "number",
"@type": ["ActionInvokeInput","ActionModifyInput"]
}
}
},
"output": {
"type": "object",
"properties": {
"url": {
"@type": "ActionQueryOutput", //["ActionQueryOutput","ActionQueryURI"]
"type": "string"
},
"status": {
"@type": "ActionQueryOutput",
"type": "string",
"enum": [
"completed",
"rotating"
]
},
"currentPosition": {
"@type": "ActionResult",
"type": "number"
}
}
},
"forms": [
{
"href": "https://myrobot.example.com/rotate",
"htv:methodName": "POST",
"op": "invokeaction"
},
{
"href": "{ActionQueryInput}", //"{ActionQueryURI}"
"htv:methodName": "DELETE",
"op": "cancelaction"
},
{
"href": "{ActionQueryInput}", //"{ActionQueryURI}"
"htv:methodName": "GET",
"op": "queryaction"
},
{
"href": "{ActionQueryInput}",
"htv:methodName": "PUT",
"op": "modifyaction"
}
]
}
}
...
}
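To illustrate how a Consumer could read this TD, the Python sketch below derives a per-operation schema by selecting the properties whose @type includes a given role. The function name schema_for is hypothetical, and the dictionary is a trimmed copy of the rotate input above.

```python
def schema_for(data_schema, role):
    """Return an object schema holding only the properties annotated with `role`."""
    selected = {}
    for name, prop in data_schema["properties"].items():
        types = prop.get("@type", [])
        if isinstance(types, str):
            # @type may be a single string or a list of strings.
            types = [types]
        if role in types:
            selected[name] = {k: v for k, v in prop.items() if k != "@type"}
    return {"type": "object", "properties": selected}


rotate_input = {
    "type": "object",
    "properties": {
        "url": {"@type": "ActionQueryInput", "type": "string"},
        "duration": {"type": "number", "@type": "ActionInvokeInput"},
        "speed": {"type": "number", "@type": ["ActionInvokeInput", "ActionModifyInput"]},
    },
}

invoke_schema = schema_for(rotate_input, "ActionInvokeInput")
modify_schema = schema_for(rotate_input, "ActionModifyInput")
```

With this reading, invoking takes duration and speed, while modifying takes only speed.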
Some considerations:
input and output it would be easier to understand and parse
The proposal in @egekorkan's last message doesn't seem far from what I refer to in #899. Shall we continue the discussion in that other thread? I'd like to have more details on the actual messages being sent to/by the robot when GETting and PUTting the action query resource.
In 5/22 TD telecon, it was noted we should consider pros and cons of both @egekorkan 's and @vcharpenay 's proposals and merge them together.
@egekorkan will first need to make his alternative proposal concrete in a separate document. After that, the TD TF will compare the two proposals side by side.
Also, I have the feeling that we are not looking at existing documents (not really standards) that talk about hypermedia. In the end, hypermedia is as old as REST and there is quite some material already:
Support for action queues is in the current charter and I'm conscious we haven't come up with a solution for this yet. This is also needed in order to make WebThings W3C compliant (see WebThingsIO/gateway#2806 and WebThingsIO/gateway#2807).
The closest I have seen to a solution to this problem is @egekorkan's Hypermedia Control 2 proposal:
- introduce new operation types queryaction, updateaction, cancelaction
- introduce new fields query, update and cancel to action affordances that map to payload information of queryaction, updateaction and cancelaction, respectively.
- input and output to each previously introduced terms
Below is an example Thing Description which illustrates how this could work (combined from two examples in the proposal):
{
"@context": "https://www.w3.org/2019/wot/td/v1",
"id": "urn:ex:thing",
"actions": {
"fade": {
"input": {
"type": "number",
"description": "duration in ms"
},
"output":{
"type":"object",
"properties":{
"href":{
"const":"{id}",
"description": "URI to query, update or cancel the invoked action"
},
"status":{
"type":"string",
"enum":["ongoing","finished","pending"],
"description": "status of the invoked action"
}
}
},
"query":{
"output":{
"type":"object",
"properties":{
"brightness":{
"type":"number",
"description": "current brightness"
},
"status":{
"type":"string",
"enum":["ongoing","finished","pending"],
"description": "status of the invoked action"
}
}
}
},
"update":{
"input": {
"type": "number",
"description": "ADDED duration in ms"
}
},
"cancel":{
},
"forms": [
{
"href": "/fade",
"op": "invokeaction",
"htv:methodName": "POST",
"contentType":"application/json"
},
{
"href": "/fade/{id}",
"op": "queryaction",
"htv:methodName": "GET",
"contentType":"application/json"
},
{
"href": "/fade/{id}",
"op": "updateaction",
"htv:methodName": "PUT",
"contentType":"application/json"
},
{
"href": "/fade/{id}",
"op": "cancelaction",
"htv:methodName": "DELETE",
"contentType":"application/json"
}
]
}
}
}
One remaining issue I see with this proposal is how consumers will know to map the {id} from the output of invokeaction to the {id} in the href of forms of queryaction, updateaction and cancelaction.
One approach might be to add semantic annotations to output and uriVariables which assign semantic meaning to these values so consumers know they have a special meaning. E.g.
{
"@context": "https://www.w3.org/2019/wot/td/v1",
"id": "urn:ex:thing",
"actions": {
"fade": {
"input": {
"type": "number",
"description": "duration in ms"
},
"output":{
"type":"object",
"properties":{
"href":{
"const":"{id}",
"@type": "ActionRequestID",
"type": "string",
"description": "URI to query, update or cancel the invoked action"
},
"status":{
"type":"string",
"enum":["ongoing","finished","pending"],
"description": "status of the invoked action"
}
}
},
"query":{
"output":{
"type":"object",
"properties":{
"brightness":{
"type":"number",
"description": "current brightness"
},
"status":{
"type":"string",
"enum":["ongoing","finished","pending"],
"description": "status of the invoked action"
}
}
}
},
"update":{
"input": {
"type": "number",
"description": "ADDED duration in ms"
}
},
"cancel":{
},
"forms": [
{
"href": "/fade",
"op": "invokeaction",
"htv:methodName": "POST",
"contentType":"application/json"
},
{
"href": "/fade/{id}",
"op": "queryaction",
"htv:methodName": "GET",
"contentType":"application/json"
},
{
"href": "/fade/{id}",
"op": "updateaction",
"htv:methodName": "PUT",
"contentType":"application/json"
},
{
"href": "/fade/{id}",
"op": "cancelaction",
"htv:methodName": "DELETE",
"contentType":"application/json"
}
],
"uriVariables": {
"id": {
"@type": "ActionRequestID",
"type": "string",
"description": "URI to query, update or cancel the invoked action"
}
}
}
}
Note: I'm not 100% sure of the intended meaning of the const keyword from JSON Schema. @egekorkan Can you explain?
What do people think about this solution? I'd like to understand whether this is likely to make WoT Thing Description 1.1 so we know whether we need to drop the action queue feature from all 17 implementations in WebThings in order to be W3C compliant.
Note: I'm not 100% sure of the intended meaning of the const keyword from JSON Schema. @egekorkan Can you explain?
So const is just an enum with a single value. So if the id returned was always of value "myId123" then we could have "const":"myId123" but since it changes based on request, it has the {id} placeholder instead. If we rely on "@type": "ActionRequestID" we won't need such a construct and the Consumer should know that the string has a special meaning.
We should evaluate if we can also use the planned additionalSchemas for this approach. That means we do not need an additional query term. If I'm correct, we only have to introduce 3 new operation types (cancelaction, updateaction, queryaction), right?
I think that the additionalSchemas would be good. However, I think we should prescribe the keys in those schemas, i.e. additionalRes_I should not be allowed and it should be query.
See https://github.com/w3c/wot-profile/issues/81#issuecomment-880619349 for a proposal of how this could work in the Core Profile.
Below is an attempt to write a Thing Description which describes the Core Profile Protocol Binding for asynchronous actions proposed in https://github.com/w3c/wot-profile/pull/89, using the schemaDefinitions feature discussed in https://github.com/w3c/wot-thing-description/issues/1053.
{
"@context": "https://www.w3.org/2019/wot/td/v1",
"id": "urn:ex:thing",
"actions": {
"fade": {
"input": {
"type": "object",
"properties": {
"level": {
"type": "integer",
"minimum": 0,
"maximum": 100
},
"duration": {
"type": "integer",
"minimum": 0,
"unit": "milliseconds"
}
}
},
"output": {},
"schemaDefinitions": {
"actionStatus": {
"output": {},
"status": {
"type": "string",
"enum": [ "pending", "running", "completed", "failed" ]
},
"error": {
"type": "object"
}
}
},
"forms": [
{
"href": "/fade",
"op": "invokeaction",
"htv:methodName": "POST",
"contentType":"application/json",
"response": {
"htv:headers": [
{
"htv:fieldName": "Location",
"htv:fieldValue": "/fade/{id}"
}
]
}
},
{
"href": "/fade/{id}",
"op": "queryaction",
"htv:methodName": "GET",
"response": {
"contentType":"application/json",
"schema": "actionStatus"
}
},
{
"href": "/fade/{id}",
"op": "cancelaction",
"htv:methodName": "DELETE",
"contentType":"application/json"
}
],
"uriVariables": {
"id": {
"type": "string",
"description": "identifier of action request"
}
}
}
}
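One way a consumer could bind the {id} in the Location header to the {id} in the other forms is plain template matching, sketched below in Python. This is only an illustration under stated assumptions: the helper names are hypothetical, the templates are limited to simple {name} variables that match one path segment, and nothing in the TD specification says consumers should treat the htv:fieldValue as a template in this way.

```python
import re


def template_to_regex(template):
    """Turn a simple '/fade/{id}' template into a regex with named groups."""
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        # Odd indexes are the variable names captured by the split.
        pattern += f"(?P<{part}>[^/]+)" if index % 2 else re.escape(part)
    return re.compile(f"^{pattern}$")


def bind_variables(template, location):
    """Extract variable values from a concrete Location header value."""
    match = template_to_regex(template).match(location)
    return match.groupdict() if match else None


def expand(template, variables):
    """Substitute bound variables into another form's href template."""
    return re.sub(r"\{(\w+)\}", lambda m: variables[m.group(1)], template)


# Template from the invokeaction form's response header description.
location_template = "/fade/{id}"
# Assumed concrete Location header returned by the server.
location = "/fade/123e4567"

variables = bind_variables(location_template, location)
query_href = expand("/fade/{id}", variables)
```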
Notes:
- invokeaction operation does not follow the output data schema because an asynchronous response to an action invocation does not include the output of the action. Rather, the output schema is used as part of the actionStatus data schema in the follow-up queryaction operation. Separating the output data schema from the response data schema is one of the topics discussed in https://github.com/w3c/wot-thing-description/issues/1053
- output schema from the actionStatus schema. Is a JSON pointer appropriate here?
- schema member currently only seems to be allowed in an AdditionalResponse (added in https://github.com/w3c/wot-thing-description/commit/100f0de1d8d608e7c3c3420b1c4e5f68a5afa628), not an ExpectedResponse. That would need changing.
- {id} in the Location header of the response to the invokeaction request corresponds to the {id} used in the href of other operations? I can't think of a way semantic annotations would help in this case.
- Location header?
- ExpectedResponse and an AdditionalResponse? OpenAPI does this by keying responses by status code, but I think the decision was not to do that for additionalResponses since it would be too protocol specific. I can't find vocabulary in the Protocol Binding Templates specification to describe an HTTP status code.
On the Thing Description call today we discussed proposed invokeanyaction/queryallactions operations.
In that issue I noted that there are three potential use cases for "querying" an action:
- Getting an individual ActionStatus resource regarding an individual action request (e.g. GET /actions/fade/1935-5939-ngu3)
- Getting a list of pending action requests for a given action (e.g. GET /actions/fade)
- Getting a list of pending action requests for all actions (e.g. GET /actions)
Do we need two operations for Action affordances which distinguish between the first two? E.g. queryaction vs. queryactionrequest?
Do we need two operations for Action affordances which distinguish between the first two? E.g. queryaction vs. queryactionrequest?
I think, this is a similar analogy to readproperty and readallproperties. In this context it would make sense to have two. Maybe we should use the term queryallactions instead. Option 2 and 3 can be supported by a Thing implementation and be announced at the top level forms:
"forms": [
{
"op": ["queryallactions"],
"href": "./actions/{ACTION_NAME}"
},
{
"op": ["queryallactions"],
"href": "./actions"
}
]
@sebastiankb wrote:
I think, this is a similar analogy to readproperty and readallproperties.
I agree in that no. 2 is like readproperty and no. 3 is like readallproperties, but if we were following that example then no. 2 should be in the Action affordance, not a top level form. There is no equivalent of no. 1 for properties because a Property only has one value, whereas an Action may have multiple instances.
"forms": [
{
"op": ["queryallactions"],
"href": "./actions/{ACTION_NAME}"
},
{
"op": ["queryallactions"],
"href": "./actions"
}
]
I agree the same name makes sense for both operations, but it may be tricky to define how a Consumer distinguishes between the two if they share the same name.
I wish there was a word in the English language for an instance of an action, but I can't think of one. Some other ideas...
1.
   - queryaction - in a Form in the Action affordance
   - listactions - in a Form in the Action affordance
   - listallactions - in a top level Form
2.
   - queryactionstatus - in a Form in the Action affordance
   - queryaction - in a Form in the Action affordance
   - queryallactions - in a top level Form
3.
   - queryaction - in a Form in the Action affordance
   - readactionqueue - in a Form in the Action affordance
   - readallactionqueues - in a top level Form
I agree in that no. 2 is like readproperty and no. 3 is like readallproperties, but if we were following that example then no. 2 should be in the Action affordance, not a top level form.
Yes, that makes sense. One idea is to design the top-level form to inform the client that it can query actions with a filter by specifying the name of the action in the URL, which will return only the status of all active actions with the corresponding action name.
I agree the same name makes sense for both operations, but it may be tricky to define how a Consumer distinguishes between the two if they share the same name.
If we introduce the convention then the client can distinguish based on the URL, right?
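As a rough illustration of that convention, a Consumer could tell the two forms apart by checking whether the href contains the {ACTION_NAME} template variable. This is only a sketch: the function name pick_form is hypothetical, and the forms are copied from the example earlier in this thread.

```python
# Top-level forms as proposed earlier in the thread; both share one op name.
FORMS = [
    {"op": ["queryallactions"], "href": "./actions/{ACTION_NAME}"},
    {"op": ["queryallactions"], "href": "./actions"},
]


def pick_form(forms, action_name=None):
    """Return a concrete href for queryallactions, or None if no form fits.

    Assumption: a templated href means "filter by action name" and a
    plain href means "all actions". Nothing in the TD itself says so.
    """
    for form in forms:
        if "queryallactions" not in form["op"]:
            continue
        templated = "{ACTION_NAME}" in form["href"]
        if action_name is not None and templated:
            return form["href"].replace("{ACTION_NAME}", action_name)
        if action_name is None and not templated:
            return form["href"]
    return None
```

Note that the distinction lives entirely in the URL shape, so a Consumer that does not know the convention has no way to tell which form does what.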
I wish there was a word in the English language for an instance of an action, but I can't think of one. Some other ideas...
I would prefer no. II.
Regarding @benfrancis's comment, I will put this on the agenda of today's TD call.
Just one argument regarding having readaction based verbs for the op: What if in the future we see that there is also a use case for observing an action where the Consumer gets the changes to the state of the action? It might be good to make it aligned with properties.
An opposing argument based on the same "worry" I have: We should make sure that op keywords are different enough that a newcomer does not confuse actions with properties.
Yet another comment:
Reading a property and querying an action are semantically very close. One can say that invoking an action creates a property affordance that is simply temporary, thus having readaction make sense.
Note that the example Thing Description in https://github.com/w3c/wot-thing-description/issues/302#issuecomment-884867648 is now out of date. Following a review of the proposed action protocol binding for the Core Profile, both the synchronous and asynchronous responses follow the same data schema, which has been expanded to include a href member. I've tried to provide an updated example Thing Description below which covers both cases, but it's not easy.
{
"@context": "https://www.w3.org/2019/wot/td/v1",
"id": "urn:ex:thing",
"actions": {
"fade": {
"input": {
"type": "object",
"properties": {
"level": {
"type": "integer",
"minimum": 0,
"maximum": 100
},
"duration": {
"type": "integer",
"minimum": 0,
"unit": "milliseconds"
}
}
},
"output": {},
"schemaDefinitions": {
"actionStatus": {
"status": {
"type": "string",
"enum": [ "pending", "running", "completed", "failed" ],
"required": true
},
"output": {
"required": false
},
"error": {
"type": "object",
"required": false
},
"href": {
"type": "string",
"const": "/fade/{id}",
"required": false
}
}
},
"forms": [
{
"href": "/fade",
"op": "invokeaction",
"htv:methodName": "POST",
"contentType":"application/json",
"response": {
"contentType": "application/json",
"schema": "actionStatus"
},
"additionalResponses": {
"success": "yes",
"contentType": "application/json",
"schema": "actionStatus",
"htv:headers": [
{
"htv:fieldName": "Location",
"htv:fieldValue": "/fade/{id}"
}
]
}
},
{
"href": "/fade/{id}",
"op": "queryaction",
"contentType":"application/json",
"htv:methodName": "GET",
"response": {
"contentType":"application/json",
"schema": "actionStatus"
}
},
{
"href": "/fade/{id}",
"op": "cancelaction",
"htv:methodName": "DELETE"
}
],
"uriVariables": {
"id": {
"type": "string",
"description": "identifier of action request"
}
}
}
}
}
The notes from above still apply:
Notes:
- The response to the invokeaction operation does not follow the output data schema because an asynchronous response to an action invocation does not include the output of the action. Rather, the output schema is used as part of the actionStatus data schema in the follow-up queryaction operation. Separating the output data schema from the response data schema is one of the topics discussed in https://github.com/w3c/wot-thing-description/issues/1053
- I've included an empty output schema as placeholder since in this particular example the action has no output. But where an action does have an output, I'm not sure of the most appropriate way to link the output schema from the actionStatus schema. Is a JSON pointer appropriate here?
- The schema member currently only seems to be allowed in an AdditionalResponse (added in https://github.com/w3c/wot-thing-description/commit/100f0de1d8d608e7c3c3420b1c4e5f68a5afa628), not an ExpectedResponse. That would need changing.
- Is it sufficiently obvious to consumers that the {id} in the Location header of the response to the invokeaction request corresponds to the {id} used in the href of other operations? I can't think of a way semantic annotations would help in this case.
- Is it OK to use URL templates in the Location header?
- This TD doesn't currently describe error conditions. Is there a way to specify the status code of an ExpectedResponse and an AdditionalResponse? OpenAPI does this by keying responses by status code, but I think the decision was not to do that for additionalResponses since it would be too protocol specific. I can't find vocabulary in the Protocol Binding Templates specification to describe an HTTP status code.
In addition to these notes:
- [...] contentType in a Form refers to the request or the response?
- [...] const in a data schema?
Overall my impression is that it would be very difficult for a Consumer which didn't explicitly implement the Core Profile Protocol Binding to interpret this Thing Description, but this is the closest I can get to providing a declarative equivalent of the concrete protocol binding described in the specification. Note that my intention is that a Web Thing using the Core Profile would expose a much simpler Thing Description than this; this is just a canonical(ish) example of what it might look like once all the defaults defined in the Core Profile Protocol Binding have been applied, and how the full protocol binding would have to be described for a Consumer which doesn't implement the Core Profile.
I think the important action item here is to decide whether to add the queryaction and cancelaction operation names to the Thing Description specification, and what their meta-interaction equivalents in top level forms might be called.
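On the question of whether the {id} in the Location header lines up with the {id} in the other forms' href, one way a Consumer might handle it is to match the concrete Location value against the template and reuse the extracted variable. The sketch below is an assumption about Consumer behaviour, not something the TD specification defines; the helper names are hypothetical and only simple {name} templates are handled, not full URI Template syntax.

```python
import re


def template_to_regex(template):
    """Compile a simple '{name}' URI template into a regex with named groups."""
    # re.split with a capturing group alternates literal text and variable names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)           # literal path text
        else:
            pattern += f"(?P<{part}>[^/]+)"      # one path segment per variable
    return re.compile(f"^{pattern}$")


def extract_variables(template, location):
    """Return the template variables found in a concrete URL, or None."""
    match = template_to_regex(template).match(location)
    return match.groupdict() if match else None


def expand(template, variables):
    """Fill a simple '{name}' template from a dict of variable values."""
    return re.sub(r"\{(\w+)\}", lambda m: variables[m.group(1)], template)
```

A Consumer could call extract_variables on the Location header returned by invokeaction, then pass the result to expand with the href of the queryaction or cancelaction form.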
Note that my intention is that a Web Thing using the Core Profile would expose a much simpler Thing Description than this
E.g.
{
"@context": "https://www.w3.org/2019/wot/td/v1",
"id": "urn:ex:thing",
"actions": {
"fade": {
"input": {
"type": "object",
"properties": {
"level": {
"type": "integer",
"minimum": 0,
"maximum": 100
},
"duration": {
"type": "integer",
"minimum": 0,
"unit": "milliseconds"
}
}
},
"output": {},
"forms": [
{
"href": "/fade",
"op": "invokeaction"
},
{
"href": "/fade/{id}",
"op": "queryaction"
},
{
"href": "/fade/{id}",
"op": "cancelaction"
}
],
"uriVariables": {
"id": {
"type": "string",
"description": "identifier of action request"
}
}
}
}
}
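To make the relationship between the two examples concrete, here is a sketch of how a Consumer that knows the profile might expand the short forms into the longer ones. The default values are simply read off the longer example in this thread and are assumptions, not normative Core Profile defaults; apply_defaults is a hypothetical helper.

```python
# Assumed per-operation defaults, taken from the longer example above.
DEFAULT_METHODS = {
    "invokeaction": "POST",
    "queryaction": "GET",
    "cancelaction": "DELETE",
}


def apply_defaults(form):
    """Return a copy of a form with assumed HTTP defaults filled in."""
    expanded = dict(form)  # shallow copy so the input form is left untouched
    op = expanded["op"]
    expanded.setdefault("htv:methodName", DEFAULT_METHODS[op])
    # The longer example gives cancelaction no contentType, so none is added here.
    if op != "cancelaction":
        expanded.setdefault("contentType", "application/json")
    return expanded
```

Explicit values in a form win over the defaults, since setdefault only fills in missing members.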
@egekorkan wrote:
Reading a property and querying an action are semantically very close. One can say that invoking an action creates a property affordance that is simply temporary, thus having readaction make sense.
If an operation using an HTTP request like GET /actions/fade/19g3-631g-61gj was called readaction, then what would an operation like GET /actions/fade or GET /actions be called?
I think the key difference between properties and actions is that a property only has one value at any one time, whereas an action may have multiple running instances (in serial or in parallel). So whilst a property is likely to be bound to a single resource (hence the singular terms readproperty/writeproperty), an action may be bound to a collection of resources (i.e. an action queue).
I think we basically need to decide whether the term "action" in operation names refers to:
A) the collection, e.g.
- invokeaction - POST /actions/fade/
- cancelactioninstance - DELETE /actions/fade/19g3-631g-61gj
- queryactioninstance - GET /actions/fade/19g3-631g-61gj
- queryaction - GET /actions/fade
- queryallactions - GET /actions
- observeactioninstance - GET /actions/fade/19g3-631g-61gj with Accept: text/event-stream
- observeaction - GET /actions/fade with Accept: text/event-stream
- observeallactions - GET /actions with Accept: text/event-stream
B) an individual instance of the interaction, e.g.
- invokeaction - POST /actions/fade/
- cancelaction - DELETE /actions/fade/19g3-631g-61gj
- queryaction - GET /actions/fade/19g3-631g-61gj
- queryactionlist - GET /actions/fade
- queryallactionlists - GET /actions
- observeaction - GET /actions/fade/19g3-631g-61gj with Accept: text/event-stream
- observeactionlist - GET /actions/fade with Accept: text/event-stream
- observeallactionlists - GET /actions with Accept: text/event-stream
Which works best?
Not sure if I should comment here or at #1208, but I think that there are some problems when one thinks of the Consumer applications in cases where href has dynamic ids. Please also have a look at https://github.com/w3c/wot-thing-description/tree/main/proposals/hypermedia-control-2#observations-1 .
An important thing to highlight here is that for many devices there would be no real need to have dynamic ids if we do not want to queue multiple actions. If I am fading a lamp, rotating a robot, or sprinkling water on a farm, my Thing can reject subsequent invoke actions if one is already being processed. Dynamic hrefs are more difficult to implement in a Thing and in Consumers, so I would not want to promote their use in the TD specification. They should of course be possible to describe, and they are needed for the WebThings API as well. Ideally, we should use static hrefs in most examples and then have a separate section about how to manage dynamic hrefs in TDs.
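A minimal sketch of that static-href behaviour, where the Thing holds at most one invocation and turns away further ones until it finishes. The class and method names are hypothetical, and how the rejection would be signalled over HTTP is deliberately left out since the thread does not say.

```python
class SingleSlotAction:
    """Thing-side state for one action with a static href and no queue."""

    def __init__(self):
        self.status = None  # None until the action is first invoked
        self.input = None

    def invoke(self, action_input):
        """Start the action; return False if one is already in progress."""
        if self.status in ("pending", "running"):
            return False
        self.input = action_input
        self.status = "running"
        return True

    def complete(self):
        """Mark the current invocation as finished."""
        self.status = "completed"

    def query(self):
        """Status of the single (latest) invocation, readable at a fixed URL."""
        return {"status": self.status}
```

Because there is only ever one invocation to report on, query needs no id and the Consumer never has to follow a dynamically generated href.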
In Mozilla's Web Thing API, an action can be requested using an HTTP POST request on an Action resource to create an ActionRequest resource. The Action resource is essentially an action queue, consisting of multiple ActionRequest resources.
The response to the POST provides a unique URL for the ActionRequest resource, which can then have its status queried with a GET or be cancelled with a DELETE. A list of all current requests can be retrieved by a GET on the Action resource.
How would this API be described in a Thing Description following the current draft specification? Or is there another intended way to achieve these use cases?
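For readers who find it easier to think in code, the resource model described above can be sketched as an in-memory queue. This is an illustration only, not Mozilla's implementation: the class and method names are hypothetical, ids are generated with uuid4 as an assumption, and the payload omits the action-name wrapper and timestamps shown in the HTTP examples.

```python
import uuid


class ActionQueue:
    """An Action resource modelled as a collection of ActionRequest resources."""

    def __init__(self, base_href):
        self.base_href = base_href  # e.g. "/things/lamp/actions/fade"
        self.requests = {}          # request id -> ActionRequest dict

    def post(self, action_input):
        """POST on the Action resource: create a new ActionRequest."""
        request_id = str(uuid.uuid4())
        request = {
            "input": action_input,
            "href": f"{self.base_href}/{request_id}",
            "status": "pending",
        }
        self.requests[request_id] = request
        return request

    def get(self, href):
        """GET on an ActionRequest URL: return it, or None if unknown."""
        return self.requests.get(href.rsplit("/", 1)[-1])

    def delete(self, href):
        """DELETE on an ActionRequest URL: cancel it; True if it existed."""
        return self.requests.pop(href.rsplit("/", 1)[-1], None) is not None

    def get_all(self):
        """GET on the Action resource: list all current requests."""
        return list(self.requests.values())
```

The four methods correspond to the four HTTP interactions in the description, which is what a Thing Description would need some way to express.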