thin-edge / thin-edge.io

The open edge framework for lightweight IoT devices
https://thin-edge.io
Apache License 2.0
219 stars 54 forks source link

Define the behaviour of overwriting registration messages #2984

Open rina23q opened 2 months ago

rina23q commented 2 months ago

Is your feature improvement request related to a problem? Please describe. We have a representation of a device as a retained message. Since it's a retained MQTT message, user can overwrite the message payload by sending to the same MQTT topic with a retain flag.

Given that user published this registration message to te/device/child01//

tedge mqtt pub -r 'te/device/child01//' '{"@type":"child-device","@id":"child01"}' 

Afterwards, if user re-publishes the following messages to the same MQTT topic, how thin-edge should behave?

  1. The same payload except @type. The new message has different @type.
    tedge mqtt pub -r 'te/device/child01//' '{"@type":"service,"@id":"child01"}' 
  2. The same payload except @id. The new message has different @id.
    tedge mqtt pub -r 'te/device/child01//' '{"@type":"child-device,"@id":"child01-2"}' 
  3. The new message payload has only new info.
    tedge mqtt pub -r 'te/device/child01//' '{"@health":"device/child01/service/something"}' 

The current behaviour (main branch as of commit 6fc5699c7f027a6bd1e3dd1f3cfd178fe66ab200) together with tedge-mapper-c8y are:

  1. Nothing occurs immediately. However, tedge-mapper-c8y crashed after re-publishing a message with "@type":"child-device". See this comment.
  2. A new child device child01-2 is created.
    [te/device/child01//] {"@type": "child-device","@id": "child01-2"}
    [c8y/s/us] 101,child01-2,child01-2,thin-edge.io-child
  3. Error.
    te/device/child01//] {"@health":"device/child01/service/something"}
    [te/errors] Unexpected error: Invalid entity registration message received on topic: te/device/child01//

There could be more scenarios.

Describe the solution you'd like We have to check if the current behaviour is as expected for such erroneous cases. If not, we have to fix them. Probably some behaviour are not even defined.

Describe alternatives you've considered

Additional context

rina23q commented 2 months ago

P.S. to reproduce the crashed case, try this one.

tedge mqtt pub -r 'te/device/child02//' '{"@type":"child-device","@id":"unique-child"}'
tedge mqtt pub -r 'te/device/child02//' '{"@type":"service","@id":"unique-child"}'
tedge mqtt pub -r 'te/device/child02//' '{"@type":"child-device","@id":"unique-child"}'

I created a new bug report dedicated to this https://github.com/thin-edge/thin-edge.io/issues/2986

didier-wenzek commented 2 months ago

P.S. to reproduce the crashed case, try this one.

tedge mqtt pub -r 'te/device/child02//' '{"@type":"child-device","@id":"unique-child"}'
tedge mqtt pub -r 'te/device/child02//' '{"@type":"service","@id":"unique-child"}'
tedge mqtt pub -r 'te/device/child02//' '{"@type":"child-device","@id":"unique-child"}'

First, this is definitely a bug: the mapper should never crash on reception of an expected or ill-formed message.

Probably some behavior are not even defined.

I agree the specifications for the entity store and entity registration are incomplete.

For instance, we have to say that an entity cannot change its type or id over time. But, if so, we must also provide a mean to de-register an entity, so a user will be able to fix a mistake.

We have a representation of a device as a retained message. Since it's a retained MQTT message, user can overwrite the message payload by sending to the same MQTT topic with a retain flag.

Indeed, using MQTT retained messages as the source of truth for entity registration leads to many issues:

My point of view

We have to:

albinsuresh commented 2 months ago

Fully aligned with your proposal @didier-wenzek. Having a dedicated channel for registration and deregistration requests was a requirement that kinda came up during the de-registration API proposal work as well, but was rejected mainly for the lack of consistency between registration and deregistration APIs. But, I'm also convinced that this is the right way forward for the long term. That eases our desire to make tedge-agent the sole owner of the entity store topics (te/+/+/+/+) and prevent other components from updating these topics directly.

My only concern is backward compatibility. Since we can't break backward compatibility so soon, we'll have to continue supporting registration messages published to the direct topics as well. What we can do to make the transition less painful is by making tedge-agent continue to accept such direct messages only for new entities. If someone publishes updates to an existing entity, then we can make the tedge-agent reject any such messages and let it overwrite that erroneous update with its own internal view of the entity metadata.

As far as I know, the recently released availability feature is the only one that encouraged users to publish entity updates to the same entity store topics. Since that's a very recent feature, we can hopefully withdraw that API now and mandate that any @health values must be provided in the initial registration message itself and not as an update.