w3c / wot-thing-description

Web of Things (WoT) Thing Description
http://w3c.github.io/wot-thing-description/

Reconsider `id` being optional #2054

Open JKRhb opened 3 weeks ago

JKRhb commented 3 weeks ago

While the id field in a TD has been mandatory in the early phases of the specification work (see https://github.com/w3c/wot-thing-description/issues/142), it has been changed to being optional for privacy reasons (see https://github.com/w3c/wot-thing-description/issues/794 and https://github.com/w3c/wot-thing-description/pull/820) before the publication of TD version 1.0.

While I understand the rationale for this decision, it is a bit annoying from a developer's point of view that you cannot be sure that there is going to be an id present in a TD, which would be very useful for state management. You could potentially use the title or the base as a fallback mechanism, but since these do not have to be unique, this is not really ideal for keeping track of a device. In an implementation I am working on at the moment, I currently just filter out TDs that do not have an ID, and it would be nice if that wasn't necessary.
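A minimal sketch of the filtering described above, assuming TDs have already been parsed into Python dictionaries. `ThingRegistry` and its methods are hypothetical names used for illustration only; they are not part of any WoT library.

```python
from typing import Any, Dict, Optional

TD = Dict[str, Any]


class ThingRegistry:
    """Hypothetical in-memory store that keys application state by TD `id`."""

    def __init__(self) -> None:
        self._things: Dict[str, TD] = {}

    def register(self, td: TD) -> Optional[str]:
        """Store the TD under its id; return None if the TD has no usable id."""
        thing_id = td.get("id")
        if not isinstance(thing_id, str) or not thing_id:
            # `id` is optional, so anonymous TDs are simply skipped here.
            return None
        self._things[thing_id] = td
        return thing_id

    def get(self, thing_id: str) -> Optional[TD]:
        return self._things.get(thing_id)

    def __len__(self) -> int:
        return len(self._things)


registry = ThingRegistry()
registry.register({"id": "urn:example:lamp-1", "title": "Lamp"})
registry.register({"title": "Lamp"})  # no id: filtered out
```

With a mandatory id, the `None` branch (and the silent loss of anonymous TDs) would no longer be needed.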

While the privacy concerns that have been brought forward to motivate the id field being optional are valid, I think there are enough assertions in place by now (for example via https://github.com/w3c/wot-thing-description/pull/825) to be able to revert this decision in TD 2.0, as the id field is not meant to be a permanent identifier anymore.

Potential User Story

As a developer, I would like to be sure that every device has an identifier present in its TD so that I can use it for state management.

Potential Use Case

In a smart home domain, a user wants to control their devices via an app and performs a discovery mechanism. In order to prevent a device from appearing in the list of discovered devices twice, it should have an identifier (that should be regenerated on reset or even on restart).

(Currently, this also reads a bit like a user story, so there is probably some refinement needed.)
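The use case above could be sketched roughly as follows, assuming discovery yields a sequence of parsed TDs. The function name `deduplicate` and the sample TDs are made up for illustration.

```python
from typing import Any, Dict, Iterable, List

TD = Dict[str, Any]


def deduplicate(discovered: Iterable[TD]) -> List[TD]:
    """Keep the most recent TD per id.

    TDs without an id cannot be matched against each other,
    so every one of them is kept and may show up as a duplicate.
    """
    by_id: Dict[str, TD] = {}
    anonymous: List[TD] = []
    for td in discovered:
        thing_id = td.get("id")
        if isinstance(thing_id, str) and thing_id:
            by_id[thing_id] = td  # later results overwrite earlier ones
        else:
            anonymous.append(td)
    return list(by_id.values()) + anonymous


results = deduplicate([
    {"id": "urn:example:lamp-1", "title": "Lamp"},
    {"id": "urn:example:lamp-1", "title": "Lamp"},  # same device, found twice
    {"title": "Lamp"},
    {"title": "Lamp"},  # possibly the same device, but there is no way to tell
])
```

The two TDs with an id collapse into one entry, while the two anonymous TDs both remain in the list.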

egekorkan commented 3 weeks ago

From the Discovery and TD management perspective, I agree with you. We should talk with PING before making any changes. Regarding:

it has been changed to being optional for privacy reasons (see https://github.com/w3c/wot-thing-description/issues/794 and https://github.com/w3c/wot-thing-description/pull/820) before the publication of TD version 1.0.

This is a nicer way to put it, but we reopened the CR process, and TD 1.0 had two CRs (see https://www.w3.org/TR/2019/CR-wot-thing-description-20190516/ and https://www.w3.org/TR/2019/CR-wot-thing-description-20191106/), which added a nice 6 months of delay. We should avoid such a thing from happening again :)

lu-zero commented 3 weeks ago

Once a device is part of a network it does have a local identifier that's unique within the network, one way or another. Probably this should be considered in light of possible onboarding mechanisms (and that's yet another topic...)

I guess we can copy/link the best practices suggested regarding mac address management e.g.:

egekorkan commented 3 weeks ago

The issue is more about being able to track it no matter the network etc. If the Thing does not change its id (we cannot mandate that), it would be possible to track the device and its user throughout its lifecycle. Adding something like "the Thing should manage its id" was not strong enough, thus we had to remove it.

lu-zero commented 3 weeks ago

There are use cases in which you do want to have a persistent, unique id.

There are others that would prefer to have it quasi-randomized to make the device harder to track, since it could be a wearable or such.

I dare to say we have more devices of the former kind than the latter. (e.g. all the industrial and agricultural fields)

egekorkan commented 3 weeks ago

I am not arguing about whether there is a use case or whether it makes sense. It is more about making it mandatory, which can result in poor management of the id by different implementations, and that raises privacy concerns according to the PING review.

I would actually vote for making it mandatory but somehow writing enough mechanisms around it to make the device and its user protected from privacy attacks etc.

lu-zero commented 3 weeks ago

It is an out-of-the-box interoperability issue, and that would mean some binding to model the behavior and a profile to pin it, if we had infinite resources :/

JKRhb commented 3 weeks ago

Maybe for TD 2.0, the description of the id could be adjusted to better reflect that it is supposed to be a non-permanent/temporary identifier. Additionally, there could be an additional example of a TD with a permanent identifier that is only accessible via a (protected) property, to make it clearer what the best practice is supposed to be here.

benfrancis commented 3 weeks ago

I've always thought that the Thing Description URL should be the default identifier of a Thing.

But assuming a lack of consensus on that point, I have to reluctantly agree that if there is an id member it should be mandatory.

Exactly the same problem with an optional id member exists in the W3C Web App Manifest specification, where it's causing all kinds of issues.

FWIW I don't really see the privacy issues as being a big problem. In the rare cases that it's an issue, all that's really necessary is to reset the ID on a factory reset (or equivalent) of a device.
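As a rough sketch of resetting the ID on a factory reset: the `Device` class below is hypothetical, and it assumes that a random UUID URN is an acceptable form for the id. Only the `id` and `title` members of the TD are shown; a complete TD needs more than that.

```python
import uuid
from typing import Dict


class Device:
    """Hypothetical device model that exposes a non-permanent TD id."""

    def __init__(self, title: str) -> None:
        self.title = title
        self.thing_id = self._new_id()

    @staticmethod
    def _new_id() -> str:
        # Random (version 4) UUID rendered as a URN, e.g. "urn:uuid:...".
        return uuid.uuid4().urn

    def factory_reset(self) -> None:
        """Discard the old id so the device cannot be linked to its past."""
        self.thing_id = self._new_id()

    def td_fragment(self) -> Dict[str, str]:
        """Return only the identifying members of the TD (not a full TD)."""
        return {"id": self.thing_id, "title": self.title}


device = Device("Lamp")
old_id = device.thing_id
device.factory_reset()
```

After `factory_reset()`, `device.thing_id` differs from `old_id`, so consumers that stored the old value treat the device as a new Thing.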

danielpeintner commented 3 weeks ago

In order to prevent a device from appearing in the list of discovered devices twice, it should have an identifier

I am not arguing against your arguments but I always thought we should have a canonical TD form that allows us to compare TDs. Maybe this would solve your problem also... not sure.

JKRhb commented 3 weeks ago

In order to prevent a device from appearing in the list of discovered devices twice, it should have an identifier

I am not arguing against your arguments but I always thought we should have a canonical TD form that allows us to compare TDs. Maybe this would solve your problem also... not sure.

Yeah, I also had to think about that :) However, there might also be the case where a Thing alters its TD for some reason (maybe a certain feature got activated, leading to the inclusion of an additional property). In that case, the Thing might keep the same ID, while the TD comparison would yield a different result, so this solution does not work as a fallback in all cases.
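To illustrate the point, here is a small sketch that compares TDs by a fingerprint. It assumes that JSON with sorted keys is a good enough stand-in for a canonical form; this is not a canonicalization defined by the TD specification, and `td_fingerprint` is a hypothetical helper.

```python
import hashlib
import json
from typing import Any, Dict


def td_fingerprint(td: Dict[str, Any]) -> str:
    """Hash a TD after a simplified, assumed canonicalization (sorted keys)."""
    canonical = json.dumps(td, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


before = {
    "id": "urn:example:lamp-1",
    "title": "Lamp",
    "properties": {"on": {"type": "boolean"}},
}
# The same Thing after a feature was activated: one more property, same id.
after = {
    "id": "urn:example:lamp-1",
    "title": "Lamp",
    "properties": {"on": {"type": "boolean"}, "brightness": {"type": "integer"}},
}

same_id = before["id"] == after["id"]
same_fingerprint = td_fingerprint(before) == td_fingerprint(after)
```

Here `same_id` is true while `same_fingerprint` is false, so a comparison based only on the TD content would treat the updated TD as a different Thing.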

relu91 commented 3 weeks ago

I don't have an answer but I see why having an id makes life easier for quite a few use cases. Another one that has not been mentioned is Schema caching. As explained in this long node-wot issue (particularly this comment) it would be helpful to have an id defined so that we could cache all the processed JSON Schemas for future interactions. It is probably a corner case, but anyhow it exists.
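A rough sketch of what such a cache could look like, keyed by the TD id. This is not node-wot's implementation; `SchemaCache` and the `compile_schema` callback are hypothetical, and the "compilation" step is a placeholder for whatever processing a JSON Schema library would do.

```python
from typing import Any, Callable, Dict, Tuple

TD = Dict[str, Any]


class SchemaCache:
    """Hypothetical cache of processed schemas, keyed by (TD id, property name)."""

    def __init__(self, compile_schema: Callable[[Dict[str, Any]], Any]) -> None:
        self._compile = compile_schema
        self._cache: Dict[Tuple[str, str], Any] = {}
        self.compilations = 0  # counts how often the expensive step ran

    def get(self, td: TD, property_name: str) -> Any:
        schema = td["properties"][property_name]
        thing_id = td.get("id")
        if not thing_id:
            # Anonymous TD: there is no stable key, so recompile on every call.
            self.compilations += 1
            return self._compile(schema)
        key = (thing_id, property_name)
        if key not in self._cache:
            self.compilations += 1
            self._cache[key] = self._compile(schema)
        return self._cache[key]


# Placeholder for a real schema compiler.
cache = SchemaCache(lambda schema: ("compiled", schema["type"]))
with_id = {"id": "urn:example:lamp-1", "properties": {"on": {"type": "boolean"}}}
anonymous = {"properties": {"on": {"type": "boolean"}}}

cache.get(with_id, "on")
cache.get(with_id, "on")      # served from the cache
cache.get(anonymous, "on")
cache.get(anonymous, "on")    # compiled again
```

A cache like this would also need to be invalidated when a Thing changes its TD while keeping the same id, which the sketch does not handle.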

Also, as one of the implementers of the Discovery spec in Zion, I found the handling of anonymous TDs pretty cumbersome. For example, every time you need to return a TD (even when you list them) you have to postprocess it using this function. Again, not a big deal, but still inconvenient.

hspaay commented 1 week ago

Just adding to this sentiment. hiveot is a hub for digital twins. You can't have a digital twin without an identifier. Therefore hiveot cannot support TDs without an ID.

Maybe this is the wrong solution for the 'privacy' argument. Plenty of devices generate a unique ID on a device factory reset. This is a better solution IMHO.

egekorkan commented 1 week ago

I think there is an overwhelming number of requests to make the id mandatory. Given that this is a rather delicate point to bring back, I propose a discussion in the TD meeting after WoT Week. If we all agree, I would involve PING before making any changes. I even foresee some WG resolution to put this into the spec, as messing this up would again risk breaking the review period schedule when the REC publication process starts.