Immutable Thing Description IDs inside the directory

farshidtz commented 4 years ago

The TD spec allows mutable IDs to mitigate privacy risks and tracking [1]. Moreover, it advises against global identifiers created and distributed by a central registry [2].

Making the ID mutable in the directory (as suggested here) enables the directory to notify authorized subscribers about the change. However, it also means that the directory (a central registry) is able to track such changes, possibly defeating the purpose of changing the ID.

I believe having a mutable ID is a requirement for the TD on the device. If the ID must be changed, the TD can be removed from the directory and created as a new entry. Informing consumers about such a change may be carried out through an independent, private channel.

mmccool commented 4 years ago

Observations:

- We should distinguish "local ids" (used by the directory to identify records, even if the TD itself does not have an internal id) and "TD ids" (the value of the id field in the TD, which is optional). These may or may not be the same even if the TD has one (under discussion).
- Some directories may be trusted. In that case it is useful if the "local ids" are persistent.
- If the TD id is used as the local id, then whenever the TD id is updated the local id also needs to be updated.
- Different viewpoints for ids: right now we have device (TD) and directory (local). Are there others? How does this relate to the lifecycle? What are their lifespans?
- Right now, if a Thing wants to "rotate" the local id, it can do so by destroying and recreating the record. In this case there is no direct link between the old and new record. We do not need to define a special procedure.
- If the TD id is used as the local id, then of course to "rotate" the local id the TD id also needs to be rotated.
- If we distinguish local and TD ids, then it might be possible to register the same TD multiple times, which might not make sense, and at the very least would be troublesome. We could avoid this by having the directory either (a) give an error if the same content already exists (b) dedup. Note that in normal REST design, repeating a POST creates duplicates, whereas PUT updates and creates at most one record (it creates the record if it does not exist and updates it if it does). Unfortunately, option (b) (dedup) on POST would make it idempotent, which is against REST design principles. Let's create an editor's note for this.
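
To make the POST/PUT distinction concrete, here is a minimal in-memory sketch. It is not taken from any spec or implementation: the class `Directory`, the `reject_duplicates` flag, and the `urn:uuid:` form of the local id are all assumptions for illustration, and the content comparison is deliberately naive.

```python
import json
import uuid


class DuplicateContent(Exception):
    """Raised when an identical TD is already registered (option a)."""


class Directory:
    """Hypothetical directory keyed by local ids, independent of TD ids."""

    def __init__(self, reject_duplicates=False):
        self.records = {}  # local id -> TD (dict)
        self.reject_duplicates = reject_duplicates

    @staticmethod
    def _canonical(td):
        # Naive canonical form for comparing content (assumption: sorted
        # JSON keys are good enough for this illustration).
        return json.dumps(td, sort_keys=True)

    def post(self, td):
        """Create a new record under a directory-generated local id."""
        if self.reject_duplicates:
            canon = self._canonical(td)
            if any(self._canonical(t) == canon for t in self.records.values()):
                raise DuplicateContent("identical TD already registered")
        local_id = "urn:uuid:" + str(uuid.uuid4())
        self.records[local_id] = td
        return local_id

    def put(self, local_id, td):
        """Create or update the record at local_id; returns True if created."""
        created = local_id not in self.records
        self.records[local_id] = td
        return created
```

With `reject_duplicates=False`, posting the same TD twice yields two records with different local ids, while repeating a `put` to the same local id leaves exactly one record. Option (a) corresponds to `reject_duplicates=True`.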

benfrancis commented 4 years ago

I again suggest that the URI of a device's Thing Description should be used as its identifier. See https://github.com/w3c/wot-discovery/issues/440

If the directory hosts the Thing Description
- The directory can generate the URI itself when a device is registered
- If the user wants to change the ID they can re-register the device so that a new URI is generated
- If a device is re-registered then, as far as the directory and other clients are concerned, it's a new device and the tracking chain is broken
If the Thing Description is hosted by the device or a third party service
- A client can register the device with a directory by its URI
- If the user wants to change the ID then they can tell the device or the third party service to generate a new URI
- As far as directories and clients are concerned, a device with a different URI is a different device, thereby breaking the tracking chain

This design may require two different options for registering a Thing Description with a directory depending on whether the Thing Description is already hosted elsewhere:

Register by JSON content - A client registers a Thing Description by providing its content and the directory hosts that content at a URI generated by the directory
Register by URI - A client registers a Thing Description with a directory by providing the URI of its Thing Description, which the directory then retrieves and periodically checks for updates
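
A rough sketch of how the two registration options could sit side by side. Everything here is hypothetical: the class `UriDirectory`, its method names, the `/things/` path, and the injected `fetch` callable (standing in for an HTTP GET) are assumptions made for illustration only.

```python
import uuid
from typing import Callable, Dict


class UriDirectory:
    """Hypothetical directory that identifies each TD by its URI."""

    def __init__(self, base_url: str, fetch: Callable[[str], dict]):
        self.base_url = base_url.rstrip("/")
        self.fetch = fetch                  # assumed helper: URI -> TD content
        self.hosted: Dict[str, dict] = {}   # directory-generated URI -> TD
        self.linked: Dict[str, dict] = {}   # external URI -> last retrieved TD

    def register_by_content(self, td: dict) -> str:
        """Host the given content at a URI generated by the directory."""
        uri = f"{self.base_url}/things/{uuid.uuid4()}"
        self.hosted[uri] = td
        return uri

    def register_by_uri(self, uri: str) -> str:
        """Retrieve a TD hosted elsewhere and remember where it came from."""
        self.linked[uri] = self.fetch(uri)
        return uri

    def refresh(self) -> None:
        """Re-retrieve every externally hosted TD (the periodic check)."""
        for uri in list(self.linked):
            self.linked[uri] = self.fetch(uri)
```

Re-registering the same content calls `register_by_content` again and gets a fresh URI, so nothing in the directory links the old entry to the new one.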

> If we distinguish local and TD ids, then it might be possible to register the same TD multiple times, which might not make sense, and at the very least would be troublesome. We could avoid this by having the directory either (a) give an error if the same content already exists (b) dedup.

Note that having two Thing Descriptions with the same content is probably a valid use case if no id is provided inside the Thing Description (and probably quite common where generic titles are generated by the manufacturer of the device and not edited by the user).

farshidtz commented 4 years ago

I agree with most parts.

> Register by URI - A client registers a Thing Description with a directory by providing the URI of its Thing Description, which the directory then retrieves and periodically checks for updates

This is not always practical because:

- The device may be behind firewalls without open incoming ports. This is very common in IoT settings with a central directory.
- The endpoint should be protected, so the directory needs authorization to retrieve the TD. This makes it hard to scale and maintain security in multi-vendor deployments.

Please follow up in https://github.com/w3c/wot-discovery/issues/34

benfrancis commented 4 years ago

@farshidtz wrote:

> The device may be behind firewalls without open incoming ports. This is very common in IoT settings with a central directory.

If a Thing Description isn't accessible at a URL on the web then surely it isn't a web thing?

> The endpoint should be protected, so the directory needs authorization to retrieve the TD. This makes it hard to scale and maintain security in multi-vendor deployments.

This is a problem with the security model of TDs in general. See point 1 in Open Issues in https://github.com/w3c/wot-discovery/blob/master/proposals/td_for_tdir.md: "How should a client initially authenticate with a gateway in order to be authorised to get its list of Thing Descriptions? The Thing Description for the gateway can provide security metadata, but a client can only access this metadata if it is already authorised to fetch the Thing Description, rendering it useless. Could the gateway/directory initially serve a stripped-down Thing Description which only contains security metadata?"

Note that in putting forward a use case of adding a web thing by its URL, I am specifically talking about a web thing which has an existing Thing Description hosted at a URL on the web. If a device doesn't already have a Thing Description hosted elsewhere (i.e. isn't yet a web thing), then the other method may be used to add a Thing Description by its content, which is then hosted at a new URL by the directory.

mmccool commented 3 years ago

Resolution: we will not support an explicit ID mutation operation, but if a Thing wants to allocate a new id it can delete its TDD entry and recreate it. This works for both anonymous and non-anonymous TDs (in the first case, TDDs should never reuse previously generated ids). This does not quite cover the use case of letting "authorized" parties know about the change, but if this is critical it can be added as an event to the Things themselves. There might be other possible workarounds (e.g. redirect links added to the old TD), but let's propose these in separate issues. We also need to discuss use cases (e.g. breaking tracking to improve privacy), and these workarounds need to actually address the problem identified in such use cases.
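
As a rough illustration of the resolution, rotation can be expressed as a plain delete followed by a create, with the directory remembering every id it has generated so that none is handed out twice. The names `Tdd`, `rotate`, and `id_factory` are hypothetical, and the in-memory storage is an assumption for the sketch.

```python
import uuid


class Tdd:
    """In-memory stand-in for a Thing Description Directory (TDD)."""

    def __init__(self, id_factory=None):
        self.entries = {}    # id -> TD
        self.issued = set()  # every id this TDD has ever generated
        self._id_factory = id_factory or (
            lambda: "urn:uuid:" + str(uuid.uuid4())
        )

    def create(self, td):
        td_id = td.get("id")
        if td_id is None:
            # Anonymous TD: generate an id that was never handed out before.
            td_id = self._id_factory()
            while td_id in self.issued or td_id in self.entries:
                td_id = self._id_factory()
            self.issued.add(td_id)
        elif td_id in self.entries:
            raise ValueError(f"id already registered: {td_id}")
        self.entries[td_id] = td
        return td_id

    def delete(self, td_id):
        del self.entries[td_id]


def rotate(tdd, old_id, td):
    """No special operation: delete the old entry, then create a new one."""
    tdd.delete(old_id)
    return tdd.create(td)
```

For a non-anonymous TD, the caller passes a TD carrying its new `id`; for an anonymous TD, the directory picks a fresh one. In both cases the directory stores no link between the old and the new entry.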

w3c / wot-discovery

Immutable Thing Description IDs inside the directory #39