w3c / wot-discovery

Repository for WoT discovery discussion
https://w3c.github.io/wot-discovery/

Consider TTL Use cases #18

Closed mmccool closed 3 years ago

mmccool commented 4 years ago

When is TTL mandatory? When is it not? Note: In LinkSmart, TTL is optional. Example: a registration is created by a human using a UI, and that human is expected to remove it when it is no longer valid. But when a registration is added by a device, TTL should be mandatory.

farshidtz commented 4 years ago

Elaborating the use-cases for optional TTL in Thing Description (TD) of LinkSmart's Thing Directory:

In that context, the TTL is the lifespan of the TD registration after which the directory removes the entry in the absence of any keepalive request. In other words, the directory intends to keep TDs as long as the actual thing exists and removes TDs that have time > modified+TTL. The consumers may rely on TTL to know when (i.e. modified+TTL) this registration becomes obsolete, after which they have to discover the TD again to get the up to date metadata.
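The expiry rule above (a registration is obsolete once the current time exceeds modified+TTL) can be sketched as follows. This is only an illustration: it assumes TTL is expressed in seconds and that `modified` is a timezone-aware timestamp, and the function name `is_expired` is hypothetical, not part of LinkSmart's API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_expired(modified: datetime, ttl_seconds: Optional[float], now: datetime) -> bool:
    """Return True when now > modified + TTL.

    A missing TTL (None) is treated as "never expires", matching the
    optional-TTL case discussed in this thread. Seconds are an assumption.
    """
    if ttl_seconds is None:
        return False
    return now > modified + timedelta(seconds=ttl_seconds)
```

A consumer could apply the same check to decide when to fetch the TD again, and the directory could apply it to decide when to purge the entry.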

The keepalive request is currently sent by the registrator via PUT (full TD) or PATCH (partial TD / no body) request over the RESTful API. The PATCH support is WIP.
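As a rough sketch of the two keepalive variants just described, the helper below builds (but does not send) either a PUT with the full TD or a body-less PATCH. The base URL, the `/td/{id}` path, and the helper name `build_keepalive` are assumptions for illustration only; they are not taken from the LinkSmart API.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import Request


def build_keepalive(base_url: str, td_id: str, td: Optional[dict] = None) -> Request:
    """Build a keepalive request for one registration without sending it.

    With a full TD the request is a PUT; without a body it is a PATCH.
    The /td/{id} path is a hypothetical layout, not a confirmed endpoint.
    """
    url = f"{base_url}/td/{quote(td_id, safe='')}"
    if td is None:
        return Request(url, method="PATCH")
    body = json.dumps(td).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/td+json"},
    )
```

The request object could then be passed to `urllib.request.urlopen` by whatever script or gateway handles the keepalive routine.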

Use cases:

There are two ways to register such devices:

  1. Using a third-party script hosted elsewhere (e.g. gateway). In this case, the TTL may be set and the TD keepalive routine may be handled by the script. But the lifecycle should correspond to the actual thing rather than the script.

  2. Using the UI of the Thing Directory by a human operator. The operator is expected to remove the registration when the thing is no longer available. This may also work with a long TTL, forcing the operator to visit the UI and press some kind of validation button. But allowing a long TTL defeats the purpose of TTL and may be misinterpreted by developers and misused on machines.

mmccool commented 4 years ago

Additional use case:

Additional questions:

mmccool commented 4 years ago

Regarding upper bound, we can look at related protocols like DNS. We could make it configurable in the directory, with the only requirement being that it is finite. We could also make it optional, but require configurable support for it in any WoT directory service. I personally think that if a registration from a device does not request a TTL, there should be a (configurable) default supplied by the directory service.
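One way to read the proposal above is sketched below: the directory has a configurable default and a configurable finite upper bound, and a device registration that requests no TTL receives the default. Clamping an over-long request to the bound (rather than rejecting it) is my assumption, as are the name `resolve_ttl` and the use of seconds.

```python
from typing import Optional


def resolve_ttl(requested: Optional[int], default_ttl: int, max_ttl: int) -> int:
    """Pick the effective TTL (in seconds) for a device registration.

    default_ttl and max_ttl stand for directory configuration values.
    Clamping to max_ttl is an assumption; a directory could reject instead.
    """
    if default_ttl <= 0 or max_ttl <= 0:
        raise ValueError("directory TTL settings must be positive and finite")
    if requested is None:
        return min(default_ttl, max_ttl)
    if requested <= 0:
        raise ValueError("requested TTL must be positive")
    return min(requested, max_ttl)
```

With a one-hour default and a one-day bound, a registration without a TTL would get 3600 seconds and a request for a year would be cut down to 86400.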

mmccool commented 4 years ago

Who checks if a device is alive?

  1. Device checks in
  2. Directory service pokes device

Some devices probably can't do 1. Should it be negotiable (selected when the device is registered)?

For 2, the "poke" mechanism may be different in different protocols, and will also run into problems with sleeping devices. We may also want to check on the status of devices indirectly, e.g. by talking to a hub that knows about the device. To handle this, we have a similar situation as protocol bindings... maybe we can reuse that somehow?

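Conclusion:

  • We can spec an API for 1 easily, so let's at least do that.
  • Supporting 2 might be tricky. Maybe we can just say that if people need to do this, a separate service may be required that checks on devices (using protocol-specific information) and updates the TTL in the directory on their behalf.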

farshidtz commented 4 years ago

Edited to provide better explanation and examples:

  • If created and modified times in the TD are used to manage when the TD was last updated, is the directory responsible for adding these?

Yes, the idea was to set both created and modified by the directory during first registration, and modified during updates.

  • To avoid conflicting with device usages of these fields, maybe we need a new field or two ("firstRegistered" and "registrationUpdated") - we would want to create a PR/issue in wot-thing-directory to include this

Agreed, they shouldn't be used if they are meant for e.g. the manufacturing time. I think TTL is a useful attribute even outside the context of a directory, to inform consumers about the validity of a TD for self-describing devices. But for other directory-related attributes, why not extend the TD's context and add such new attributes in the directory only, instead of adding them to the TD specs?

I suggest adding registered and updated fields inside an object (similar to how ld-proofs are added) and present them on-demand (e.g. toggled with a boolean query parameter):

{
    // TD fields
    ...
    "ttl": ..., // Set by the registrator; for directory to know when to purge the TD in the absence of a keep alive notice; and for consumers to know when the TD expires to ask the directory for a new version.
    "proof": ...,
    "directory": {
        "registered": ...,
        "updated": ...
    }
}

  • The other option would be to have out-of-band information; the information model for the directory could be as if it were:
{
    "registered": ...,
    "updated": ...,
    "td": { TD },
    "proof": ...
}
  • For out-of-band option, note that not modifying the TD itself to store this information has advantages for "signed" TDs: we would not have to update the signature provided by the device, if there is one. The directory could still have its own signature block wrapping everything ("proof" in the above) if we really really wanted.
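The signature argument can be illustrated with a small sketch. It uses a SHA-256 digest over sorted-key JSON as a stand-in for a real signature; actual proof mechanisms use their own canonicalization, so this only shows the general effect. All function names here are hypothetical.

```python
import hashlib
import json


def td_digest(td: dict) -> str:
    """Hash a TD using sorted-key JSON (a stand-in for real canonicalization)."""
    canonical = json.dumps(td, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def wrap_out_of_band(td: dict, registered: str, updated: str) -> dict:
    """Keep directory metadata beside the TD, leaving the TD itself untouched."""
    return {"registered": registered, "updated": updated, "td": td}


def enrich_in_band(td: dict, registered: str, updated: str) -> dict:
    """Return a copy of the TD with a nested "directory" object added."""
    return {**td, "directory": {"registered": registered, "updated": updated}}
```

Under these assumptions, the digest of the `td` member of a wrapped entry equals the digest of the original TD, while the digest of an enriched TD differs, so anything the device signed over the whole document would no longer verify without extra handling.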

In my opinion, adding out-of-band information in a wrapper object adds some complications:


Another option is to have two API endpoints, one for raw TDs and another for directory-specific entries. TDs and directory entries may use the same IDs.

E.g.:

/td/{id}
{
    // TD fields
    ...
    "ttl": ..., // Set by the registrator; for directory to know when to purge the TD in the absence of a keep alive notice; and for consumers to know when the TD expires to ask the directory for a new version.
}
/directory/{id}
{
    "id": "TD's id",
    "registered": ...,
    "updated": ...,
    "proof": ..., // ?
    "td": "Link to TD entry in directory"
}
farshidtz commented 4 years ago

Who checks if a device is alive?

  1. Device checks in
  2. Directory service pokes device

Some devices probably can't do 1. Should it be negotiable (selected when the device is registered)?

For 2, the "poke" mechanism may be different in different protocols, and will also run into problems with sleeping devices. We may also want to check on the status of devices indirectly, e.g. by talking to a hub that knows about the device. To handle this, we have a similar situation as protocol bindings... maybe we can reuse that somehow?

Conclusion:

  • We can spec an API for 1 easily, so let's at least do that.
  • Supporting 2 might be tricky. Maybe we can just say that if people need to do this, a separate service may be required that checks on devices (using protocol-specific information) and updates the TTL in the directory on their behalf.

For 2, the directory may support some active health check mechanisms for devices that describe such affordances. Devices may provide a "health" property affordance in their TD, such that the directory can query once a while and get a 200 OK (HTTP) or an event affordance which the directory can subscribe to (MQTT, websocket, long polling) and know the device is active. This information can be represented as lastActive or lastSeen time attributes in the directory.
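A minimal sketch of that idea: the directory calls a probe for a thing and records a lastSeen time only when the thing answers. The probe is injected as a plain function so that the protocol-specific part (HTTP GET on a "health" property, an MQTT subscription, and so on) stays out of the sketch; the class name `HealthTracker` and its attributes are hypothetical.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, Optional


class HealthTracker:
    """Track when each registered thing last answered a health probe."""

    def __init__(self, probe: Callable[[str], bool]) -> None:
        # probe(td_id) returns True if the thing responded; how it does so
        # (HTTP, MQTT, ...) is protocol-specific and assumed to live elsewhere.
        self._probe = probe
        self.last_seen: Dict[str, datetime] = {}

    def check(self, td_id: str, now: Optional[datetime] = None) -> bool:
        """Probe one thing and record the time only if it responded."""
        alive = self._probe(td_id)
        if alive:
            self.last_seen[td_id] = now or datetime.now(timezone.utc)
        return alive
```

A thing that never answers simply has no `last_seen` entry, which leaves the decision to hide or remove it to the directory.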

benfrancis commented 4 years ago

Use cases:

  • Sensors that are incapable of registering themselves E.g. a zwave smart plug with no IP networking capabilities
  • WoT non-compliant things E.g. a proprietary, closed z-wave gateway device

Is it really expected that devices would register themselves with a directory? How would that work?

I had assumed that a WoT client would register a WoT device with a WoT directory or that a WoT gateway would generate web things itself for the devices it is managing (using other protocols) and expose them as a WoT directory.

I can see a use case for an expiry date for a TD for the purposes of caching and updating the TD, but I would strongly recommend that is left to the protocol layer (as explained in https://github.com/w3c/wot-thing-description/issues/916#issuecomment-645954223). Both HTTP and CoAP have built-in caching models that you probably don't want to interfere with.

Device is deinstalled by user or fails. In this case TTL ensures the directory is (eventually) cleaned up.

Why would the TD not just be immediately removed from the directory when the user un-installs it? What use case do you have in mind?

I am unconvinced by the idea of a device just disappearing from a directory if it doesn't check in for a period of time. Under what circumstances would you want this? The WebThings Gateway has a ping mechanism to know whether devices it manages are currently online or not, but it never just removes a device without being asked to do so by the user. Have you come across real-world use cases where you would want this to happen automatically? If so, could that not be an implementation-specific decision based on use case? A directory could internally keep track of the last time it was able to contact a device and hide or delete inactive devices. In that case TTL would be an internal setting of a directory rather than a property of an individual web thing.

benfrancis commented 4 years ago

For 2, the directory may support some active health check mechanisms for devices that describe such affordances. Devices may provide a "health" property affordance in their TD, such that the directory can query once a while and get a 200 OK (HTTP) or an event affordance which the directory can subscribe to (MQTT, websocket, long polling) and know the device is active. This information can be represented as lastActive or lastSeen time attributes in the directory.

Something like this could be a good idea. If this information is exposed by a web thing directly though, is there actually an advantage to the directory re-exposing this information itself vs. a WoT client just asking a WoT device for this information directly? It's not yet clear to me whether directories should:

  1. Link to the Thing Descriptions of web things
  2. Re-publish the Thing Descriptions of web things at a new URL (at which point the directory is more of a gateway/proxy than a directory, bridging a web thing from one domain to another)

These might be two separate use cases, the former being a "directory" (which links to Thing Descriptions of web things) and the latter being a "gateway" (which bridges web things to another domain and exposes them through its own directory, like the web thing adapter of the WebThings Gateway currently does). A gateway being a superset of the features of a directory.

Regarding pinging for an active state, also note that some battery-powered devices using low-power wireless protocols like Zigbee and Z-Wave spend a lot of time in sleep mode to conserve power and only wake up occasionally, or when a sensor is triggered, in order to check in with a controller or send a sensor reading. If you're pinging a web thing which is being bridged from another protocol by a gateway, you might be able to get a response from the gateway but it may not be able to contact the device itself to verify it is online. The gateway might know when it last heard from the device though, which could be useful information to pass on.

egekorkan commented 4 years ago

Re-publish the Thing Descriptions of web things at a new URL (at which point the directory is more of a gateway/proxy than a directory, bridging a web thing from one domain to another)

Not necessarily, the TD Directory is just storing TDs. It is not stated in any W3C WoT spec that a Web Thing or even a Gateway stores the TDs of the Things. So it can very well be the only entity publishing TDs, and thus not republishing.

benfrancis commented 4 years ago

See https://w3c.github.io/web-thing-protocol/requirements/#wot-gateways-directories for my current thinking on the definition of "WoT gateway" vs. "WoT directory"

relu91 commented 4 years ago

Why would the TD not just be immediately removed from the directory when the user un-installs it? What use case do you have in mind? I am unconvinced by the idea of a device just disappearing from a directory if it doesn't check in for a period of time. Under what circumstances would you want this? The WebThings Gateway has a ping mechanism to know whether devices it manages are currently online or not, but it never just removes a device without being asked to do so by the user. Have you come across real-world use cases where you would want this to happen automatically? If so, could that not be an implementation-specific decision based on use case? A directory could internally keep track of the last time it was able to contact a device and hide or delete inactive devices. In that case TTL would be an internal setting of a directory rather than a property of an individual web thing.

On the contrary, I find TTL quite useful as a mechanism for obtaining "eventual consistency" of the TDir (btw @farshidtz, nice talk about how we should abbreviate the Thing Directory). As always, it is a trade-off: having the TDir actively check the status of each registered device might not scale well and could flood the network with "health check" requests. Considering that most of the time devices are connected over wireless networks, we might end up polluting the communication with unwanted packets. Instead, TTL is a cheap mechanism that a low-powered device can exploit to be sure that the system stays consistent even if it dies (i.e. oops, low battery 😄).

A use case that I have in mind, as hinted above, is a device that registers itself in the TDir and then schedules an update of the TTL once a month (in our lab we have devices that last quite long). If for whatever reason the device drains its battery faster and dies before the next update, the TDir gets cleaned up and one could infer that the device is broken.
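This scenario can be simulated with a small in-memory registry. It is only a sketch under stated assumptions: times are plain seconds supplied by the caller, the TTL is chosen slightly longer than the update interval (35 days for a monthly check-in, my choice), and the class `TtlRegistry` is hypothetical rather than any directory's real implementation.

```python
from typing import Dict, List, Tuple


class TtlRegistry:
    """In-memory registry that forgets entries whose TTL has lapsed."""

    def __init__(self) -> None:
        # td_id -> (time of last update, ttl), both in seconds
        self._entries: Dict[str, Tuple[float, float]] = {}

    def register(self, td_id: str, ttl: float, now: float) -> None:
        """Store a new registration with its TTL."""
        self._entries[td_id] = (now, ttl)

    def keepalive(self, td_id: str, now: float) -> None:
        """Refresh the update time; raises KeyError if already purged."""
        _, ttl = self._entries[td_id]
        self._entries[td_id] = (now, ttl)

    def purge(self, now: float) -> List[str]:
        """Remove and return the ids whose update time + TTL has passed."""
        expired = [
            td_id
            for td_id, (updated, ttl) in self._entries.items()
            if now > updated + ttl
        ]
        for td_id in expired:
            del self._entries[td_id]
        return expired

    def __contains__(self, td_id: str) -> bool:
        return td_id in self._entries
```

If the device registers at day 0 with a 35-day TTL, checks in at day 30 and then dies, a purge at day 60 leaves the entry in place and a purge at day 66 removes it, which is the eventual clean-up described above.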

What do you think?

benfrancis commented 4 years ago

Not necessarily, the TD Directory is just storing TDs. It is not stated in any W3C WoT spec that a Web Thing or even a Gateway stores the TDs of the Things. So it can very well be the only entity publishing TDs, and thus not republishing.

If a device does not have a Thing Description accessible at a URL on the web, then it is surely by definition not a web thing, it is just a thing. So either a device is serving its own Thing Description (which can be linked to by a directory) or its Thing Description is being served by an intermediary, which I would call a gateway since it is bridging a non-WoT device to the Web of Things.

egekorkan commented 4 years ago

If a device does not have a Thing Description accessible at a URL on the web, then it is surely by definition not a web thing, it is just a thing.

I do not agree with this. Philips Hue devices are connected to the Hue Bridge, which runs the HTTP server and has a connection to a router. I can write a TD for a Philips Hue light and serve it anywhere I want: GitHub, my computer, TDir, etc. Then a script can consume this TD and interact with the Hue lights. In this case, TDir is definitely not an intermediary.

benfrancis commented 4 years ago

I do not agree with this. Philips Hue devices are connected to the Hue Bridge, which runs the HTTP server and has a connection to a router. I can write a TD for a Philips Hue light and serve it anywhere I want: GitHub, my computer, TDir, etc. Then a script can consume this TD and interact with the Hue lights. In this case, TDir is definitely not an intermediary.

Ah OK, yes I can see why you would not consider that a gateway. FWIW, Mozilla has a Philips Hue adapter which does act as a gateway by bridging the Hue REST API to the Web Thing API at the gateway's own domain, but that's different to what you're describing.

It sounds like there are three separate functions here. TTL might work differently in each of these cases, so it would be useful to first have a consensus on the definition of a Thing Directory so that we are all talking about the same thing. I've filed #32 to discuss that.

mmccool commented 3 years ago

Summary of some discussion points is needed. We need to do a sort. @farshidtz has volunteered to extract discussion topics related to the information model (and alternative design options, and their pros and cons) and put it in issue https://github.com/w3c/wot-discovery/issues/98

mmccool commented 3 years ago

This actually has three issues, two of which have been resolved:

  1. In-band vs. out-of-band metadata. We have decided to go with in-band (enriched TDs)
  2. How do we encode TTL etc. (done). This also covers keep-alive, but device needs to push periodically.
  3. Pull (webhook) keep-alive mechanism. Related to polling (see issue #164), still open. Can rename to capture "pull keepalive" use case.

Propose closing this issue to clean things up, and move discussion of remaining "pull keepalive" mechanism to issue #164.