thin-edge / thin-edge.io

The open edge framework for lightweight IoT devices
https://thin-edge.io
Apache License 2.0

Support configuration of tedge mapper topics per cloud provider #1969

Closed reubenmiller closed 1 year ago

reubenmiller commented 1 year ago

Is your feature request related to a problem? Please describe.

A device uses thin-edge.io to connect to two clouds, but the same telemetry data pushed from the device (through thin-edge.io) should not be sent to both clouds.

The target device architecture is to configure thin-edge.io with two different cloud providers and to selectively control which types of tedge messages are sent to each cloud. The split between the tedge measurements would be as follows:

Splitting the telemetry data avoids paying twice for consuming and processing the measurements in the cloud. If the measurements were pushed to both Azure and Cumulocity IoT, each cloud would incur its own costs (e.g. ingestion, processing, storage).

Describe the solution you'd like

Support configuring which MQTT tedge/ topics each mapper should subscribe to. Each mapper could then subscribe to a subset of the tedge/ topics, allowing the user to control which cloud receives which data.

The following should be supported:

[c8y]
url = "example.c8y.io"
topics = ["tedge/alarms/+/+", "tedge/alarms/+/+/+", "tedge/health/+", "tedge/health/+/+"]

[az]
topics = ["tedge/measurements", "tedge/measurements/+"]

[aws]
topics = ["tedge/events/+", "tedge/events/+/+"]
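To illustrate the intended effect of the configuration above, here is a minimal sketch of how per-mapper topic lists would split messages between clouds. It is not thin-edge.io code: `matches`, `example_config` and `mappers_for` are hypothetical names, the table is hard-coded rather than read from the TOML file, and only the single-level `+` wildcard is handled.

```rust
use std::collections::BTreeMap;

/// Returns true if `topic` matches `filter`, where `+` matches exactly one level.
/// Simplified for illustration: the multi-level wildcard `#` is not handled.
fn matches(filter: &str, topic: &str) -> bool {
    let f: Vec<&str> = filter.split('/').collect();
    let t: Vec<&str> = topic.split('/').collect();
    f.len() == t.len() && f.iter().zip(&t).all(|(f, t)| *f == "+" || f == t)
}

/// Hypothetical in-memory copy of the [c8y]/[az]/[aws] `topics` settings above.
fn example_config() -> BTreeMap<&'static str, Vec<&'static str>> {
    BTreeMap::from([
        (
            "c8y",
            vec!["tedge/alarms/+/+", "tedge/alarms/+/+/+", "tedge/health/+", "tedge/health/+/+"],
        ),
        ("az", vec!["tedge/measurements", "tedge/measurements/+"]),
        ("aws", vec!["tedge/events/+", "tedge/events/+/+"]),
    ])
}

/// Names of the mappers whose configured filters match `topic`.
fn mappers_for(
    config: &BTreeMap<&'static str, Vec<&'static str>>,
    topic: &str,
) -> Vec<&'static str> {
    config
        .iter()
        .filter(|(_, filters)| filters.iter().any(|f| matches(f, topic)))
        .map(|(name, _)| *name)
        .collect()
}

fn main() {
    let config = example_config();
    for topic in ["tedge/measurements", "tedge/alarms/critical/temp", "tedge/events/login"] {
        println!("{topic} -> {:?}", mappers_for(&config, topic));
    }
}
```

With this table each example topic reaches exactly one mapper, which is the cost-saving split the request is after.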

Describe alternatives you've considered

Technically, the c8y/measurement topic could be removed from the MQTT bridge settings for Cumulocity. However, the tedge-mapper-c8y service would still unnecessarily process every MQTT message published on tedge/measurements, leading to higher CPU usage.

Additional context

reubenmiller commented 1 year ago

I just updated the example TOML config to use an array instead of a CSV string.

rina23q commented 1 year ago

I have a question about what we expect users to configure there and how the mapper should interpret it.

These are the topics that the AWS mapper subscribes to.

vec![
    "tedge/measurements",
    "tedge/measurements/+",
    "tedge/health",
    "tedge/health/+",
    "tedge/events/+",
    "tedge/events/+/+",
    "tedge/alarms/+/+",
    "tedge/alarms/+/+/+",
]
// From https://github.com/thin-edge/thin-edge.io/blob/main/crates/extensions/aws_mapper_ext/src/converter.rs#L40-L47

If the user configures

[aws]
topics = ["tedge/events"]

then does the mapper understand it as below?

If the user configures

[aws]
topics = ["tedge/events/+"]

then does the mapper ignore tedge/events/+/+?
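For reference, at the MQTT level the `+` wildcard matches exactly one topic level, so a subscription to tedge/events/+ on its own does not receive messages published one level deeper. The sketch below shows only these standard MQTT matching rules. It does not describe how the mapper interprets its configuration, `filter_matches` is a hypothetical helper, and it skips the validation a broker performs (such as requiring `#` to be the last level).

```rust
/// MQTT topic-filter matching: `+` matches exactly one level,
/// `#` matches the parent level and everything below it.
/// Simplified: filter validity is not checked.
fn filter_matches(filter: &str, topic: &str) -> bool {
    let mut topic_levels = topic.split('/');
    for f in filter.split('/') {
        if f == "#" {
            return true; // multi-level wildcard: the rest of the topic matches
        }
        match topic_levels.next() {
            Some(t) if f == "+" || f == t => continue,
            _ => return false, // level mismatch, or the topic is too short
        }
    }
    // The filter is exhausted; the topic must not have extra levels.
    topic_levels.next().is_none()
}

fn main() {
    // Filter without wildcard: only the exact topic matches.
    assert!(filter_matches("tedge/events", "tedge/events"));
    assert!(!filter_matches("tedge/events", "tedge/events/login"));

    // Single-level wildcard: one extra level, not two.
    assert!(filter_matches("tedge/events/+", "tedge/events/login"));
    assert!(!filter_matches("tedge/events/+", "tedge/events/login/child1"));

    // Two wildcards are needed to reach the deeper level.
    assert!(filter_matches("tedge/events/+/+", "tedge/events/login/child1"));
    println!("all checks passed");
}
```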

reubenmiller commented 1 year ago

@rina23q here is a summary of the decisions based on your previous post.

rina23q commented 1 year ago

I opened 3 PRs for this feature. They were merged in the order #2013 -> #2047 -> #2033.

#2013: Pure refactoring of the c8y/az/aws mappers.

#2047: Bug fix for the AWS mapper and the Azure mapper. Previously, if a user published invalid JSON (e.g. only {) to the topic tedge/measurements, both mapper processes stayed up but stopped converting any further messages (the converter actor had died). The PR fixed this issue.
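The failure mode described here, where one bad payload silently ends all later conversions, can be avoided by handling errors per message instead of letting them end the processing loop. The following is only a sketch of that general pattern and not the code from the PR: `convert` is a hypothetical stand-in that accepts anything shaped like a JSON object, and `run_converter` is an invented name.

```rust
/// Hypothetical stand-in for a mapper conversion. It only checks that the
/// payload looks like a complete JSON object; it is not a real JSON parser.
fn convert(payload: &str) -> Result<String, String> {
    let p = payload.trim();
    if p.len() >= 2 && p.starts_with('{') && p.ends_with('}') {
        Ok(format!("converted:{p}"))
    } else {
        Err(format!("invalid payload: {p:?}"))
    }
}

/// Converts every message in order. A failed conversion is recorded,
/// and the loop carries on with the next message instead of stopping.
fn run_converter(payloads: &[&str]) -> (Vec<String>, Vec<String>) {
    let mut converted = Vec::new();
    let mut errors = Vec::new();
    for payload in payloads {
        match convert(payload) {
            Ok(out) => converted.push(out),
            Err(e) => errors.push(e), // report the error and keep going
        }
    }
    (converted, errors)
}

fn main() {
    let inbox = ["{\"temperature\": 21}", "{", "{\"temperature\": 22}"];
    let (converted, errors) = run_converter(&inbox);
    println!("converted {} message(s), {} error(s)", converted.len(), errors.len());
}
```

The message that follows the invalid `{` payload is still converted, which is the behaviour the fix is described as restoring.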

#2033: The feature itself. The PR includes the code change, Robot tests, and the user-guide documentation.

The how-to guide is here: https://thin-edge.github.io/thin-edge.io/next/operate/configuration/config-mapper-mqtt-topics

gligorisaev commented 1 year ago

QA has thoroughly checked the feature and here are the results: