thin-edge / thin-edge.io

The open edge framework for lightweight IoT devices
https://thin-edge.io
Apache License 2.0

POC to evaluate the proposals for a new core design around plugins and actors #1143

Closed didier-wenzek closed 1 year ago

didier-wenzek commented 2 years ago

There are several work-in-progress proposals to rebuild thin-edge on new foundations using plugins:

The key criterion for committing to one of these proposals, or to a combination of them, is whether one can smoothly rebuild the current features of thin-edge, progressively moving parts into the new design.

To have concrete criteria to compare the proposals, we will build a POC for each. The point is to demo:

The focus is on the internal plugin API:

Here are the proposed components for the POC (retained because representative of what we have today).

 ┌─────────────┐               ┌──────────────────────────────────────────────┐                 ┌──────────────────────────────────────────────┐
 │             │ MqttMessage   │                                              │    SMRequest    │                                              │
 │ MQTT        ├───────────────► C8Y                                          ├─────────────────►  SM                                          │
 │             │               │                                              │                 │                                              │
 │             ◄───────────────┤                                              ◄─────────────────┤                                              │
 │             │  MqttMessage  │                                              │    SMResponse   │                                              │
 │             │               │                                              │                 │                                              │
 └─────────────┘               └───────▲───────────────────────────────▲──────┘                 └────┬───▲────────────────────────────┬───▲────┘
                                       │                               │                             │   │                            │   │
                                       │                               │                             │   │                            │   │
                           Measurement │                   Measurement │                   SMRequest │   │SMResponse         SMRequest│   │ SMResponse
                                       │                               │                             │   │                            │   │
                               ┌───────┴─────┐                 ┌───────┴─────┐                  ┌────▼───┴────┐                  ┌────▼───┴────┐
                               │             │                 │             │                  │             │                  │             │
                               │ Collectd    │                 │ ThinEdgeJSON│                  │  Apt        │                  │ Apama       │
                               │             │                 │             │                  │             │                  │             │
                               │             │                 │             │                  │             │                  │             │
                               │             │                 │             │                  │             │                  │             │
                               │             │                 │             │                  │             │                  │             │       
                               └───────▲─────┘                 └───────▲─────┘                  └─────────────┘                  └─────────────┘       
                                       │                               │
                                       │ MqttMessage                   │ MqttMessage                              
                                       │                               │
                               ┌───────┴─────┐                 ┌───────┴─────┐
                               │             │                 │             │
                               │ MQTT        │                 │ MQTT        │
                               │             │                 │             │
                               │             │                 │             │
                               │             │                 │             │     
                               │             │                 │             │
                               └─────────────┘                 └─────────────┘

A key goal of the new design is to be able to connect components that have been implemented independently while using statically typed messages. This can be achieved using light dependencies around message type definitions, with a crate per plugin.
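To illustrate the idea of light dependencies around message types, here is a minimal sketch in which each module stands in for one crate of the dependency diagram in this issue. The struct fields, the `translate` function and the assumed `collectd/<name>` topic format are all hypothetical and only serve the illustration; they are not taken from either POC.

```rust
// Each module stands in for one crate; all fields and functions are illustrative.

/// Stand-in for the "plugin_mqtt" crate: it only exposes its message type.
mod plugin_mqtt {
    #[derive(Debug, Clone, PartialEq)]
    pub struct MqttMessage {
        pub topic: String,
        pub payload: String,
    }
}

/// Stand-in for the "plugin_telemetry" crate.
mod plugin_telemetry {
    #[derive(Debug, Clone, PartialEq)]
    pub struct Measurement {
        pub name: String,
        pub value: f64,
    }
}

/// Stand-in for "plugin_collectd": it depends on the two crates above,
/// but only through their message types.
mod plugin_collectd {
    use super::plugin_mqtt::MqttMessage;
    use super::plugin_telemetry::Measurement;

    /// Translate a raw MQTT message into a typed measurement.
    /// Assumed (hypothetical) format: topic "collectd/<name>", float payload.
    pub fn translate(msg: &MqttMessage) -> Option<Measurement> {
        let name = msg.topic.strip_prefix("collectd/")?;
        let value = msg.payload.trim().parse::<f64>().ok()?;
        Some(Measurement {
            name: name.to_string(),
            value,
        })
    }
}

fn main() {
    let msg = plugin_mqtt::MqttMessage {
        topic: "collectd/temperature".to_string(),
        payload: "12.5".to_string(),
    };
    println!("{:?}", plugin_collectd::translate(&msg));
}
```

The point of the layout is that `plugin_collectd` never needs to know how the MQTT or telemetry components are implemented; it only shares their message definitions.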

One of the main benefits of this proposal to move toward plugins is to clarify the dependencies. Here is the nice expected result from this POC.

                 ┌───────────────────────┐     ┌───────────────────────┐    ┌───────────────────────┐
                 │"plugin_c8y"           │     │"plugin_sm"            │    │"plugin_apt"           │
                 │                       ├─────►                       ◄────┤                       │
                 │                       │     │ SMRequest             │    │                       │
                 │                       │     │                       │    │                       │
                 │                       │     │ SMResponse            │    │                       │
                 └───┬───────────────┬───┘     └────────────────▲──────┘    └───────────────────────┘
                     │               │                          │
                     │               │                          │           ┌───────────────────────┐
                     │               │                          │           │"plugin_apama"         │
                     │               │                          └───────────┤                       │
                     │               │                                      │                       │
                     │               │                                      │                       │
                     │               │                                      │                       │
  ┌──────────────────▼────┐     ┌────▼──────────────────┐                   └───────────────────────┘
  │"plugin_mqtt"          │     │"plugin_telemetry"     │
  │                       │     │                       │
  │ MqttMessage           │     │ Measurement           │
  │                       │     │                       │
  │                       │     │                       │
  └───▲────────▲──────────┘     └────────────▲───▲──────┘
      │        │                             │   │
      │   ┌────┼─────────────────────────────┘   │
      │   │    └─────────────────────┐           │
      │   │                          │           │
  ┌───┴───┴───────────────┐     ┌────┴───────────┴──────┐
  │"plugin_thinedge_json" │     │"plugin_collectd"      │
  │                       │     │                       │
  │                       │     │                       │
  │                       │     │                       │
  │                       │     │                       │ 
  └───────────────────────┘     └───────────────────────┘
matthiasbeyer commented 2 years ago

Hi!

Our POC implementation is ready.

This is the POC based on the interfaces introduced via #979 plus the "core" implementation and related parts (not yet in a PR).

Contents

Here is the tip of the branch that contains everything.

The above plugins are mockups - except the mqtt one and I believe the thin_edge_json one is already final as well.

Testing

To test the PR:

git fetch https://github.com/matthiasbeyer/thin-edge.io/ feature/add_tedge_api/showcase
git checkout FETCH_HEAD
cargo build -p tedge-cli --features sm,mqtt,c8y,collectd,thin_edge_json
./target/debug/tedge-cli -l info run ./tedge/example-config-showcase.toml

You can change that "info" in the last line to "debug" to see more output.

You can stop the process using Ctrl-C.

Please note that if you don't have an MQTT broker running, the application will not start. The current behaviour is that all plugins need to initialize successfully, which the MQTT plugin will not do without a broker. Pressing Ctrl-C will cancel this process and you will see an error because the MQTT plugin was not able to connect to an MQTT broker.

If you have your MQTT broker running, you can now

(Of course the JSON format here is just a mockup and nothing final)

Walkthrough

Here go some points to describe the fulfilled requirements:

You are also welcome to have a look at the individual plugin implementations, although they are of course mostly mockups:

Currently not in the showcase

The following is currently not implemented in the showcase, mostly because it is not really interesting for showing the overall scheme:

The grand scheme

Of course the diff you're looking at (5e358be9aeebd6ffa23dbb2f782049906880a231..FETCH_HEAD) is only the part implementing the showcase. The whole core implementation is a bit more involved and has now been ongoing for about three months (because #979 is a requirement).

The core implementation is not yet in a PR. This PR will of course feature an in-depth explanation on how things work and how things were implemented once it is opened!

The core implementation PR will then of course not contain stuff from this showcase!

didier-wenzek commented 2 years ago

Our POC implementation is ready.

Thank you.

Testing

Things work as described.

Walkthrough

How a thin-edge executable can be built as an assemblage of components that have been implemented independently.

How are addressed the dependencies between the plugins, their instantiation, configuration and connections.

How are addressed the main internal communication patterns.

How external communications are addressed, notably over MQTT.

Currently not in the showcase

I agree that it makes no sense to have full-fledged features in the showcase, except for these points related to plugin inter-communication:

The grand scheme

The core implementation is not yet in a PR. This PR will of course feature an in-depth explanation on how things work and how things were implemented once it is opened!

We have first to agree on a solution.

TheNeikos commented 2 years ago

Thank you for the interesting feedback! Could you maybe expand a bit more on these aspects?

However, this also stresses somehow that writing a plugin is not easy.

I wonder if it makes sense to define the plugin connections dynamically as most of them are somehow imposed by the plugin types.

didier-wenzek commented 2 years ago

Could you maybe expand a bit more on these aspects?

However, this also stresses somehow that writing a plugin is not easy.

I agree that being easy is subjective. What I'm missing is a mental model / a pattern / a systematic way to understand the design of the plugins. I don't say there is no such pattern but that I don't see it. Looking at the code of the different plugins, I can roughly understand how each of them works, but I would have a hard time fixing something. Some ramp-up time would help for sure.

I wonder if it makes sense to define the plugin connections dynamically as most of them are somehow imposed by the plugin types.

For instance, the c8y plugin expects a connection to an mqtt plugin instance to connect to the cloud and a connection to the software management plugin to process software updates. Without these connections the c8y plugin is useless. It will even be broken if connected to plugins of the wrong type. Some plugins can have less strict connections. For instance, a logger plugin could consume and log all the messages published by others. With such loose constraints, a dynamic connection might make sense. But when there are type and semantic expectations between peers, the wiring can be dynamic only if it is controlled by the program, not the config.
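A minimal sketch of wiring controlled by the program rather than the config, using plain `std::sync::mpsc` channels as a stand-in for whatever the runtime would provide. The `C8y` struct, its constructor and `on_cloud_operation` are hypothetical names introduced for illustration only; the idea is simply that the constructor states which peers the component needs, so a wrong connection is rejected at compile time.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

#[derive(Debug, PartialEq)]
pub struct MqttMessage(pub String);

#[derive(Debug, PartialEq)]
pub enum SMRequest {
    Install(String),
}

/// Hypothetical c8y component: the constructor lists the peers it requires.
pub struct C8y {
    to_mqtt: Sender<MqttMessage>,
    to_sm: Sender<SMRequest>,
}

impl C8y {
    pub fn new(to_mqtt: Sender<MqttMessage>, to_sm: Sender<SMRequest>) -> Self {
        C8y { to_mqtt, to_sm }
    }

    /// Forward a cloud operation to software management and acknowledge it over MQTT.
    pub fn on_cloud_operation(&self, package: &str) {
        self.to_sm
            .send(SMRequest::Install(package.to_string()))
            .expect("software management peer is connected");
        self.to_mqtt
            .send(MqttMessage(format!("accepted {}", package)))
            .expect("mqtt peer is connected");
    }
}

/// The wiring is done by the program; the message types constrain it.
pub fn wire() -> (C8y, Receiver<MqttMessage>, Receiver<SMRequest>) {
    let (mqtt_tx, mqtt_rx) = channel();
    let (sm_tx, sm_rx) = channel();
    // Swapping `mqtt_tx` and `sm_tx` below would not compile.
    (C8y::new(mqtt_tx, sm_tx), mqtt_rx, sm_rx)
}

fn main() {
    let (c8y, mqtt_rx, sm_rx) = wire();
    c8y.on_cloud_operation("foo");
    println!("{:?} / {:?}", sm_rx.recv().unwrap(), mqtt_rx.recv().unwrap());
}
```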

didier-wenzek commented 2 years ago

Here is a proposal using actors but not actix.

Content

This proposal of a tedge_actors crate is addressing complementary aspects compared to the tedge_api.

The focus is on the definition of actors.

Testing

git checkout -b didier-wenzek-rfc/tedge_api main
git pull https://github.com/didier-wenzek/thin-edge.io.git rfc/tedge_api
cargo run -p tedge_poc

Measurements can then be sent over MQTT:

tedge mqtt pub 'tedge/measurements' '{"temperature": 12.0 }'

and the results observed over MQTT:

tedge mqtt sub '#'

POC Status

Next Steps

didier-wenzek commented 2 years ago

Could you maybe expand a bit more on these aspects?

However, this also stresses somehow that writing a plugin is not easy.

I'm not happy with my first response. Sure, being easy or not is subjective. But, I should come with more concrete feedback.

I see 3 layers in the design of an API for plugin/component/actor.

  1. Assembling components into an executable should be a straightforward task - selecting plugins, creating and connecting static structs/objects/values, possibly with some glue code as Into or From translators from one type of messages to another.
  2. Implementing a component might be more involved but should ideally focus on the feature logic. One of the goals of the plugin API is to ease the interaction of independent streams of events and requests acting on some state. Hence, interaction concerns (e.g. select!) should be addressed by the runtime, not the plugins. Similarly, state ownership and immutability should be addressed by the runtime. I acknowledge that a plugin that has to handle an external event source (say a TCP connection) might have to manage internal mutability and interactions between this external source and the requests received from the API. I tried to address this in the tedge_actors proposal with two different traits for the two major behaviors of an actor (reacting to messages or producing spontaneous messages) - but this proposal is not battle-tested yet.
  3. Complexity needs to be somewhere. It's okay to have the runtime overly complex if this helps to remove complexity from the other layers. The runtime of the tedge_api is by far more complex than the tedge_actors runtime but I don't see that as a major concern. What matters though is that the runtime can be improved without having to rewrite all the plugins. A key test for the tedge_actors API for instance will be to add termination control of the plugins without changing the latter.
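To make the split between the layers more concrete, here is a rough sketch of two traits for the two behaviors mentioned in point 2, plus a toy "runtime" that owns the message loop. This is not the tedge_actors API: the `Reactor` and `Producer` traits, the `run` function and the example actors are hypothetical, and the sketch is synchronous on top of `std::sync::mpsc` to stay self-contained.

```rust
use std::sync::mpsc::{channel, Sender};

/// Behavior 1: react to incoming messages, possibly emitting outputs.
pub trait Reactor {
    type Input;
    type Output;
    fn react(&mut self, input: Self::Input, out: &Sender<Self::Output>);
}

/// Behavior 2: produce messages spontaneously (e.g. from an external source).
pub trait Producer {
    type Output;
    fn produce(&mut self, out: &Sender<Self::Output>);
}

/// Feature logic only: a running average over the received values.
#[derive(Default)]
pub struct Average {
    sum: f64,
    count: u32,
}

impl Reactor for Average {
    type Input = f64;
    type Output = f64;
    fn react(&mut self, input: f64, out: &Sender<f64>) {
        self.sum += input;
        self.count += 1;
        let _ = out.send(self.sum / self.count as f64);
    }
}

/// A producer emitting a fixed list of values, standing in for a sensor.
pub struct FixedSource(pub Vec<f64>);

impl Producer for FixedSource {
    type Output = f64;
    fn produce(&mut self, out: &Sender<f64>) {
        for value in self.0.drain(..) {
            let _ = out.send(value);
        }
    }
}

/// Toy runtime: it owns the channels and the loop, not the actors.
pub fn run<P, R>(mut producer: P, mut reactor: R) -> Vec<R::Output>
where
    P: Producer,
    R: Reactor<Input = P::Output>,
{
    let (in_tx, in_rx) = channel();
    let (out_tx, out_rx) = channel();
    producer.produce(&in_tx);
    drop(in_tx); // close the input so that the loop below terminates
    for input in in_rx {
        reactor.react(input, &out_tx);
    }
    drop(out_tx);
    out_rx.into_iter().collect()
}

fn main() {
    let averages = run(FixedSource(vec![2.0, 4.0, 6.0]), Average::default());
    println!("{:?}", averages);
}
```

In this layout, changing how `run` schedules or terminates actors would not require touching `Average` or `FixedSource`, which is the property point 3 asks for.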

I hope this second answer is more helpful.

matthiasbeyer commented 2 years ago

After re-reading the messages in this thread, I have to add some more notes on our proposal.

But first, I want to address some of your questions:

Can ^c handling be moved inside the core or in a plugin?

Technically it definitely can. We could even think about a slightly more elaborate API which allows plugins to declare that they want to be notified on Ctrl-C and tell the core themselves how the signal should be handled. Definitely a point to think about, but (IMHO) out of scope for the first step.

For me the target is being able to build an application as an assemblage of plugins without deep Rust expertise. This expertise should be moved into the framework and the plugins.

Yes and no. It depends on what you mean by "deep Rust expertise". I believe that in all approaches we saw so far, a plugin author needs to be decently proficient in the same things: Generics, Async Rust, Traits. Without a basic understanding of these three concepts, writing a plugin is not feasible in any POC we've seen so far - and I believe there won't be one that takes away these requirements!

I have mixed feelings about the configuration file.

  • Cons:
    • Not so easy to make the relationship with the main.rs.

What do you mean by that?

Why are some names given as strings, as in “collectd”, and others as TOML identifiers, as in plugins.collector?

That's how TOML works. One (in this case "collectd") is a string, the other is a table key (in this case "collector"). The former ("collectd") is the type of the plugin, the latter ("collector") is the name of the instance.

It’s not obvious to see the graph of plugins.

Yes. @TheNeikos and I already talked about that a lot. Unfortunately, that's a limitation of TOML. We might have some ideas here to provide a graphical config editor for our POC that we might implement during a Hackathon mid-June at our company.

How are addressed the main internal communication patterns. The picture is really not easy to grasp. What kind of communication can be implemented? How? What are the limitations?

So right now we have point-to-point communication. That means 1:1, 1:N and N:1, or in short: N:M ;-) ! That's the very baseline and (so far) has been sufficient for everything we've played with. Of course, you might want more patterns, which is an absolutely valid request.

This baseline can be used to implement a more pub/sub style pattern, I like to believe. Request-Response is already included in the baseline via the reply functionality.

Until recently, the message types were bound to a request type. This no longer seems to be the case. Is that right?

I'm not sure what you mean by this.

If you mean the associated type Reply on our Message trait: We were able to lift that requirement after your feedback and move reply functionality out of the Message trait itself, making the request-reply pattern more explicit with ReplySenderFor<_>. We can of course elaborate how that works, if you wish.

Why paho_mqtt while there is already an mqtt_channel crate? If not usable it would be good to know why.

The implementation of the MQTT plugin in our POC is already a few weeks old. IIRC I took the "paho_mqtt" crate because I didn't like the interface of the "rumqttc" crate at all.

For the "mqtt_channel" crate: I had a quick look at it, but found "paho_mqtt" much simpler to use and at the time, I wanted to implement things quickly. :-) Of course we have to decide on one implementation/libraries, but IMHO this is also out of scope for the POC - or rather just a detail that doesn't matter in the grand scheme of things, if you understand what I mean. Rewriting the MQTT plugin to use rumqttc or mqtt_channel is a matter of one day of effort - nothing to worry about right now, I guess.

It matters to have two sm plugin instances (it can be from the fake plugin). The point is to demonstrate message dispatch.

All I would do for another SM plugin would be cp -r plugins/plugin_sm_apt plugins/plugin_sm_other and rename "Apt" to "Other"! I can do that, of course, but I think it just increases the code size and does not help with "reviewability".

It matters to handle the responses of the software-management plugin, even if these are just ping/pong responses. I'm a bit surprised that you did nothing here because everything seems to be in place with ReplySenderFor.

Yes, you're absolutely right!

I added some code that shows how reply handling is done. In this commit I added code in the "sm_apt" plugin that simply replies with some "InstallSucceeded" message if there is an install request. In this commit I added code in the "mqtt_sm" plugin that takes that response and sends it (serialized as JSON) back to the MQTT plugin, which then publishes the message on the broker.

You can redo

git fetch https://github.com/matthiasbeyer/thin-edge.io/ feature/add_tedge_api/showcase
git checkout FETCH_HEAD
cargo build -p tedge-cli --features sm,mqtt,c8y,collectd,thin_edge_json
./target/debug/tedge-cli -l info run ./tedge/example-config-showcase.toml

and then publish {"type":"Install","package_name":"foo"} on "smrequests". You'll see a reply on "somerandomtopic" indicating that the package "foo" was installed.
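For readers who want the shape of this request/reply flow without checking out the branch, here is a rough, standalone sketch. It does not use the tedge_api types: `WithReply`, `spawn_sm_mock` and `install` are hypothetical names, and a `std::sync::mpsc` sender bundled with the request plays the role that ReplySenderFor plays in the POC.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

#[derive(Debug, PartialEq)]
pub enum SmRequest {
    Install { package_name: String },
}

#[derive(Debug, PartialEq)]
pub enum SmResponse {
    InstallSucceeded { package_name: String },
}

/// A request bundled with the channel on which the reply is expected.
pub struct WithReply {
    pub request: SmRequest,
    pub reply_to: Sender<SmResponse>,
}

/// Mock software-management plugin: it answers every install request with success.
pub fn spawn_sm_mock() -> Sender<WithReply> {
    let (tx, rx) = channel::<WithReply>();
    thread::spawn(move || {
        for WithReply { request, reply_to } in rx {
            let SmRequest::Install { package_name } = request;
            // The requester may be gone already; ignore the send error in that case.
            let _ = reply_to.send(SmResponse::InstallSucceeded { package_name });
        }
    });
    tx
}

/// Send an install request and wait for the matching reply.
pub fn install(sm: &Sender<WithReply>, package_name: &str) -> SmResponse {
    let (reply_to, reply_rx) = channel();
    sm.send(WithReply {
        request: SmRequest::Install {
            package_name: package_name.to_string(),
        },
        reply_to,
    })
    .expect("the sm mock is running");
    reply_rx.recv().expect("the sm mock always replies")
}

fn main() {
    let sm = spawn_sm_mock();
    println!("{:?}", install(&sm, "foo"));
}
```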


So far for your questions, now some things I want to add:


Just to state this explicitly (possibly repeating myself here, sorry): If someone wants to implement a new plugin, they have to:

And that's a new plugin. Now they copy some lines in tedge/src/main.rs to include their plugin in the application and they're done.

Depending on what they want their plugin to do, they have to do one of the following things (or both):

That's all. And there can be multiple Handle<T> implementations if the plugin is able to receive messages of different types.
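As a rough illustration of "multiple Handle<T> implementations", here is a simplified, synchronous sketch. The real trait in the POC is async and its exact signature is not reproduced here; this `Handle` trait, the `Logger` plugin and the two message types are hypothetical.

```rust
/// Simplified, synchronous stand-in for a per-message-type handler trait.
pub trait Handle<M> {
    fn handle(&mut self, message: M);
}

pub struct Measurement {
    pub name: String,
    pub value: f64,
}

pub struct Shutdown;

/// A plugin able to receive two different message types.
#[derive(Default)]
pub struct Logger {
    pub lines: Vec<String>,
}

impl Handle<Measurement> for Logger {
    fn handle(&mut self, m: Measurement) {
        self.lines.push(format!("{} = {}", m.name, m.value));
    }
}

impl Handle<Shutdown> for Logger {
    fn handle(&mut self, _: Shutdown) {
        self.lines.push("shutting down".to_string());
    }
}

fn main() {
    let mut logger = Logger::default();
    // The compiler selects the implementation from the message type.
    logger.handle(Measurement {
        name: "temperature".to_string(),
        value: 12.5,
    });
    logger.handle(Shutdown);
    println!("{:?}", logger.lines);
}
```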

But that's literally all a plugin author has to do. I think this plays exactly into the ideas you wrote down with

Implementing a component might be more involved but should ideally focus on the feature logic. One of the goals of the plugin API is to ease the interaction of independent streams of events and requests acting on some state. Hence, interaction concerns (e.g select!) should be addressed by the runtime, not the plugins. [...]

as it boils down to three tasks that the plugin author has to do (from a high level):

And then they can start implementing their business logic.


What matters though is that the runtime can be improved without having to rewrite all the plugins.

Yes, this is definitely a valid point. Though stability guarantees should be worked out in a separate issue, because it is a rather complex topic!

I'd just like to note that since we've worked quite a bit (four months of two people working full-time by now) on the API design and its implementation, we are certain that we have reached a decently stable design so far. Some details are still in flux, but nothing that is of major concern right now.

Still, if we need to change the internal communication API while the project is still in 0.x.y state, I don't think that's a major concern! As soon as we're in 1.x.y state, breaking the communication API is of course not allowed anymore. Even more reason to spend extra time on a decent approach!


As written above in some answer to your questions, 1:1, 1:N and N:1 style messaging is in the POC already!

1:1 messaging is of course easy, but 1:N is also simple: If a plugin wants to send to multiple other plugins, all it needs to do is save the addresses of these other plugins and send to all of these addresses. That's nothing more than for addr in addresses { addr.send(message) }-style programming, of course - as one would expect.
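A minimal sketch of that 1:N loop, with `std::sync::mpsc::Sender` standing in for a plugin address. The `broadcast` helper is a hypothetical name, not part of the POC.

```rust
use std::sync::mpsc::{channel, Sender};

/// Send a copy of `message` to every address; return how many sends succeeded.
/// A send fails only if the receiving side has already been dropped.
pub fn broadcast<M: Clone>(addresses: &[Sender<M>], message: &M) -> usize {
    let mut delivered = 0;
    for addr in addresses {
        if addr.send(message.clone()).is_ok() {
            delivered += 1;
        }
    }
    delivered
}

fn main() {
    let (tx_a, rx_a) = channel();
    let (tx_b, rx_b) = channel();
    let delivered = broadcast(&[tx_a, tx_b], &"hello".to_string());
    println!("{} deliveries: {:?}, {:?}", delivered, rx_a.recv(), rx_b.recv());
}
```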

For N:1-style messaging: As all message handlers are called asynchronously and concurrently, having multiple plugins send to one plugin is already included in the base of the architecture.
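The N:1 case can be sketched the same way, under the assumption that threads and a cloned `std::sync::mpsc::Sender` are an acceptable stand-in for concurrently running async handlers. `collect_from` is a hypothetical helper written for this illustration.

```rust
use std::sync::mpsc::channel;
use std::thread;

/// Spawn `senders` concurrent producers that all send to one receiver,
/// then return the received ids in sorted order.
pub fn collect_from(senders: usize) -> Vec<usize> {
    let (tx, rx) = channel();
    let handles: Vec<_> = (0..senders)
        .map(|id| {
            let tx = tx.clone(); // every producer gets its own handle to the same inbox
            thread::spawn(move || {
                tx.send(id).expect("the receiver is still alive");
            })
        })
        .collect();
    drop(tx); // only the clones remain, so the inbox closes once all producers finish
    for handle in handles {
        handle.join().expect("producer thread panicked");
    }
    let mut received: Vec<usize> = rx.into_iter().collect();
    received.sort(); // arrival order is not deterministic
    received
}

fn main() {
    println!("{:?}", collect_from(4));
}
```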


NB: We're currently evaluating and defining our response to your proposal of what such an interface could look like.

TheNeikos commented 2 years ago

Review of Actors POC

Goal of this review:

Thin-Edge.io has the opportunity right now to pivot into a different direction from before. To make sure that this new direction fits the goals of the intended users we have initiated a 'call for proposals' between both SoftwareAG and IFM (who are the main contributors right now). I believe that you've made your points clear in the OP of this issue, but I think some points are worth re-iterating:

These two points can be considered equivalent.

With that said, let's dive in.


didier-wenzek:rfc/tedge_api implements a custom 'actor' library composed of these different parts:


I think it's hard to formulate a conclusion. There are definitely some things that can be taken away with regard to simplicity, but it does leave open several questions that are IMO fundamental.

I think due to the fact that it is so 'simple', it is not conclusive on its suitability.

I would love to hear from you, @didier-wenzek, what you consider relevant in this POC that we might have missed, and why you would favor it over an already (mostly) complete implementation.

didier-wenzek commented 2 years ago

@TheNeikos I will be as direct as you are, starting with the controversial aspects.

I think due to the fact that it is so 'simple', it is not conclusive on its suitability.

On my side, I think that because tedge_api is so far out of the original team's control and so disconnected from what has already been done, the associated POC is not conclusive on the ability to have a migration plan.

However, I still hope that we can reach a point of agreement.

I would love to hear from you, @didier-wenzek, what you consider relevant in this POC that we might have missed, and why you would favor it over an already (mostly) complete implementation.

I think you missed two points:

This is why I started to work on my own proposal.

There are definitely some things that can be taken away with regard to simplicity, but it does leave open several questions that are IMO fundamental.

didier-wenzek commented 1 year ago

Finally, a POC has been implemented using a different strategy. Instead of gluing together various partially-implemented actors, we implemented a small number of functional actors, with the aim of exploring more deeply the concrete issues and the ways to solve them.