Hi!
Our POC implementation is ready.
This is the POC based on the interfaces introduced via #979 plus the "core" implementation and related parts (not yet in a PR).
Here is the tip of the branch that contains everything.
The above plugins are mockups, except the mqtt one; I believe the thin_edge_json one is already final as well.
To test the PR:
git fetch https://github.com/matthiasbeyer/thin-edge.io/ feature/add_tedge_api/showcase
git checkout FETCH_HEAD
cargo build -p tedge-cli --features sm,mqtt,c8y,collectd,thin_edge_json
./target/debug/tedge-cli -l info run ./tedge/example-config-showcase.toml
You can change that "info" in the last line to "debug" to see more output.
You can stop the process using Ctrl-C.
Please note that if you don't have an MQTT broker running, the application will not start. The current behaviour is that all plugins need to initialize successfully, which the MQTT plugin will not do without a broker. Pressing Ctrl-C will cancel this process and you will see an error because the MQTT plugin was not able to connect to an MQTT broker.
If you have your MQTT broker running on localhost:1883, you can now publish {"type":"List"} to see an SM request to list installed software being triggered. (Of course the JSON format here is just a mockup and nothing final.)
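For example, using the mosquitto command line client (the topic name below is an assumption; use whatever request topic example-config-showcase.toml actually defines):
mosquitto_pub -h localhost -p 1883 -t smrequests -m '{"type":"List"}'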
Here are some points to describe the fulfilled requirements (you can run
git log --graph --oneline 5e358be9aeebd6ffa23dbb2f782049906880a231..FETCH_HEAD
to get a nice graph):
- PluginBuilders are registered at the application. There is no connection setup done between them at this point!

You are also welcome to have a look at the individual plugin implementations, although they are of course mostly mockups:
The following is currently not implemented in the showcase, mostly because it is not really interesting for showing the overall scheme:
Of course the diff you're looking at (5e358be9aeebd6ffa23dbb2f782049906880a231..FETCH_HEAD) is only the part implementing the showcase.
The whole core implementation is a bit more involved and has now been ongoing for about three months (because #979 is a requirement).
The core implementation is not yet in a PR. This PR will of course feature an in-depth explanation on how things work and how things were implemented once it is opened!
The core implementation PR will then of course not contain stuff from this showcase!
Our POC implementation is ready.
Thank you.
Testing
Things work as described.
Walkthrough
How a thin-edge executable can be built as an assemblage of components that have been implemented independently.
- main.rs is a long sequence of plugin registrations, each plugin being provided by external crates.
- main.rs and Cargo.toml can be easily updated to include at compile-time only a sub-set of the plugins. This opens the path to executables optimized for a use-case.
- Can ^C handling be moved inside the core or in a plugin?

How the dependencies between the plugins, their instantiation, configuration and connections are addressed.
- There is a doc subcommand that displays what is expected for a plugin.
- Not so easy to make the relationship with the main.rs.
- Why are some names given as strings, as "collectd", and others as TOML identifiers, as in plugins.collector?

How the main internal communication patterns are addressed.
How external communications are addressed, notably over MQTT.
- Why paho_mqtt while there is already an mqtt_channel crate? If not usable it would be good to know why.

Currently not in the showcase
I agree that it makes no sense to have full-fledged features in the showcase. Except for these points related to plugin inter-communication:
- It matters to have two sm plugin instances (it can be from the fake plugin). The point is to demonstrate message dispatch.
- It matters to handle the responses of the software-management plugin, even if these are just ping/pong responses. I'm a bit surprised that you did nothing here because everything seems to be in place with ReplySenderFor.

The grand scheme
The core implementation is not yet in a PR. This PR will of course feature an in-depth explanation on how things work and how things were implemented once it is opened!
We have first to agree on a solution.
Thank you for the interesting feedback! Could you maybe expand a bit more on these aspects?
However, this also stresses somehow that writing a plugin is not easy.
I wonder if it makes sense to define the plugin connections dynamically as most of them are somehow imposed by the plugin types.
Could you maybe expand a bit more on these aspects?
However, this also stresses somehow that writing a plugin is not easy.
I agree that being easy is subjective. What I'm missing is a mental model / a pattern / a systematic way to understand the design of the plugins. I don't say there is no such pattern, but that I don't see it. Looking at the code of the different plugins, I can roughly understand how each of them works, but I would have a hard time fixing something. Some ramp-up time would help for sure.
I wonder if it makes sense to define the plugin connections dynamically as most of them are somehow imposed by the plugin types.
For instance, the c8y plugin expects a connection to an mqtt plugin instance to connect to the cloud, and a connection to the software management plugin to process software updates. Without these connections the c8y plugin is useless. It will even be broken if connected to plugins of the wrong type. Some plugins can have less strict connections. For instance, a logger plugin could consume and log all the messages published by others. With such loose constraints, a dynamic connection might make sense. But when there are type and semantics expectations between peers, the wiring can be dynamic only if controlled by the program, not the config.
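As a sketch of that last point (illustrative Rust types only, not actual thin-edge code): the kinds of peers a c8y plugin needs can be imposed by the program through types, leaving only the choice of instances to the configuration.

// Hypothetical connection handles; the compiler enforces what kind of
// peers a c8y plugin can be wired to.
pub struct MqttConnection;              // handle to an mqtt plugin instance
pub struct SoftwareManagerConnection;   // handle to a software-management plugin instance

pub struct C8yPlugin {
    cloud: MqttConnection,
    software_manager: SoftwareManagerConnection,
}

impl C8yPlugin {
    // The program, not the config, decides that a c8y plugin requires
    // exactly these two kinds of connections.
    pub fn new(cloud: MqttConnection, software_manager: SoftwareManagerConnection) -> Self {
        C8yPlugin { cloud, software_manager }
    }
}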
Here is a proposal using actors but not actix.
This proposal of a tedge_actors crate is addressing complementary aspects compared to the tedge_api:
- The aim is for tedge_actors to be combined with a config along the lines of the tedge_api crate. Could it be combined with the tedge_api?
- Both tedge_actors and tedge_api are missing many key features to stop the actors, catch panicking actors ...

The focus is on the definition of actors.
- The Actor trait enforces a systematic pattern: a single Input type, a single Output type, a single input queue of messages, a single Recipient for the output.
- Messages can be grouped into enum types. Macros would help to manage sub message types and handlers.
- Actors with different message types can be connected using connectors implementing the Recipient trait. One key missing point, still work-in-progress for the POC, is to provide examples of such connectors.
- Two traits, the Reactor and the Producer, are used to define the main activities of an actor: reacting to input messages and producing spontaneous output messages.
- The Actor is unaware of all the surrounding parts and mechanisms.
- An actor is instantiated as an ActorInstance.
- ActorInstances can be connected at will using their mailbox addresses.
- ActorInstances are inactive - i.e. with no background tasks.
- ActorInstances are started, returning ActiveActor handlers.
- ActiveActor handlers give control to the running actors (stop, wait for termination, ...).
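To make the shape of this proposal concrete, here is a minimal sketch of those traits as I read the description above; the signatures are illustrative only (the actual rfc/tedge_api branch has more associated types and a start method, as discussed further down):

use async_trait::async_trait;

// A statically typed message sink; connectors implementing this trait
// are what let actors with different message types be wired together.
pub trait Recipient<M: Send + 'static>: Send {
    fn send(&mut self, message: M);
}

// The systematic pattern: a single Input type, a single Output type,
// a single input queue of messages, a single Recipient for the output.
pub trait Actor {
    type Input: Send + 'static;
    type Output: Send + 'static;
}

// One of the two main activities of an actor: reacting to input messages.
#[async_trait]
pub trait Reactor: Actor {
    async fn react(&mut self, input: Self::Input, output: &mut dyn Recipient<Self::Output>);
}

// The other main activity: spontaneously producing output messages.
#[async_trait]
pub trait Producer: Actor {
    async fn produce(&mut self, output: &mut dyn Recipient<Self::Output>);
}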
To run the POC:
git checkout -b didier-wenzek-rfc/tedge_api main
git pull https://github.com/didier-wenzek/thin-edge.io.git rfc/tedge_api
cargo run -p tedge_poc
Measurements can then be sent over MQTT:
tedge mqtt pub 'tedge/measurements' '{"temperature": 12.0 }'
and the results observed over MQTT:
tedge mqtt sub '#'
The crates of this POC:
- tedge_poc is the target main.rs
- c8y_plugin
- thin-edge-json-plugin
- mqtt_plugin
- telemetry_plugin
- sm_plugin
- apt_plugin
- apama_plugin
- tedge_actors POC: request/response as well as 1-n and n-1 connections
- tedge_api and tedge_actors?

Could you maybe expand a bit more on these aspects?
However, this also stresses somehow that writing a plugin is not easy.
I'm not happy with my first response. Sure, being easy or not is subjective. But I should come up with more concrete feedback.
I see 3 layers in the design of an API for plugin/component/actor:
- The message types, with Into or From translators from one type of messages to another (see the sketch after this list).
- Interaction concerns (e.g. select!) should be addressed by the runtime, not the plugins. Similarly, state ownership and immutability should be addressed by the runtime. I acknowledge that a plugin that has to handle an external event source (say a TCP connection) might have to manage internal mutability and interactions between this external source and the requests received from the API. I tried to address this in the tedge_actor proposal with two different traits for the two major behaviors of an actor (reacting to messages or producing spontaneous messages) - but this proposal is not battle-tested yet.
- The runtime itself: the tedge_api is by far more complex than the tedge_actors runtime, but I don't see that as a major concern. What matters though is that the runtime can be improved without having to rewrite all the plugins. A key test for the tedge_actor API for instance will be to add termination control of the plugins without changing the latter.
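A tiny illustration of that first point (the types here are hypothetical, not actual thin-edge types): connecting two independently written plugins then only requires such a translator.

// A hypothetical message type owned by a collectd-facing plugin.
pub struct CollectdSample {
    pub group: String,
    pub name: String,
    pub value: f64,
}

// A hypothetical shared telemetry message type.
pub struct Measurement {
    pub name: String,
    pub value: f64,
}

// The translator: a plain `From` impl is enough to adapt the output
// of one plugin to the input of another.
impl From<CollectdSample> for Measurement {
    fn from(sample: CollectdSample) -> Self {
        Measurement {
            name: format!("{}/{}", sample.group, sample.name),
            value: sample.value,
        }
    }
}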
I hope this second answer is more helpful.
After re-reading the messages in this thread, I have to add some more notes on our proposal.
But first, I want to address some of your questions:
Can ^c handling be moved inside the core or in a plugin?
Technically it definitely can.
We could even think about a slightly more elaborate API which allows plugins to declare that they want to be notified on Ctrl-C and tell the core themselves how the signal should be handled. Definitely a point to think about, but (IMHO) out of scope for the first step.
For me the target is being able to build an application as an assemblage of plugins without deep Rust expertise. This expertise should be moved into the framework and the plugins.
Yes and no. It depends on what you mean by "deep Rust expertise". I believe that in all approaches we have seen so far, a plugin author needs to be decently proficient in the same things: generics, async Rust, and traits. Without a basic understanding of these three concepts, writing a plugin is not feasible in either POC we've seen so far - and I believe there won't be one that takes away these requirements!
I have mixed feelings about the configuration file.
- Cons: not so easy to make the relationship with the main.rs.
What do you mean by that?
Why are some names given as strings, as "collectd", and others as TOML identifiers, as in plugins.collector?
That's how toml works. One (in this case "collectd") is a string, the other is a table key (in this case "collector"). The former ("collectd") is the type of the plugin, the other is the name of the instance ("collector").
It’s not obvious to see the graph of plugins.
Yes. @TheNeikos and I already talked about that a lot. Unfortunately, that's a limitation of TOML. We might have some ideas here to provide a graphical config editor for our POC that we might implement during a Hackathon mid-June at our company.
How the main internal communication patterns are addressed. The picture is really not easy to grasp. What kind of communication can be implemented? How? What are the limitations?
So right now we have point-to-point communication. That means 1:1, 1:N and N:1, or in short: N:M ;-)! That's the very baseline and (so far) has been sufficient for everything we've played with. Of course, you might want more patterns, which is an absolutely valid request.
This baseline can be used to implement a more pub/sub style pattern, I like to believe. Request-Response is already included in the baseline via the reply functionality.
Up to recently, the message types were bound to a request type. This seems to be no more the case. Is it?
I'm not sure what you mean by this.
If you mean the associated type Reply on our Message trait: we were able to lift that requirement after your feedback and move the reply functionality out of the Message trait itself, making the request-reply pattern more explicit with ReplySenderFor<_>. We can of course elaborate on how that works, if you wish.
Why paho_mqtt while there is already an mqtt_channel crate? If not usable it would be good to know why.
The implementation of the MQTT plugin in our POC is already a few weeks old. IIRC I took the "paho_mqtt" crate because I didn't like the interface of the "rumqttc" crate at all.
For the "mqtt_channel" crate: I had a quick look at it, but found "paho_mqtt" much simpler to use and, at the time, I wanted to implement things quickly. :-) Of course we have to decide on one implementation/library, but IMHO this is also out of scope for the POC - or rather just a detail that doesn't matter in the grand scheme of things, if you understand what I mean. Rewriting the MQTT plugin to use rumqttc or mqtt_channel is a matter of one day of effort - nothing to worry about right now, I guess.
It matters to have two sm plugin instances (it can be from the fake plugin). The point is to demonstrate message dispatch.
All I would do for another SM plugin would be cp -r plugins/plugin_sm_apt plugins/plugin_sm_other
and rename "Apt" to "Other"! I can do that, of course, but I think it just increases the code size and does not help with "reviewability".
It matters to handle the responses of the software-management plugin, even if these are just ping/pong responses. I'm a bit surprised that you did nothing here because everything seems to be in place with ReplySenderFor.
Yes, you're absolutely right!
I added some code that shows how reply handling is done. In this commit I added code in the "sm_apt" plugin that simply replies with some "InstallSucceeded" message if there is an install request. In this commit I added code in the "mqtt_sm" plugin that takes that response and sends it (serialized as JSON) back to the MQTT plugin, which then publishes the message on the broker.
You can redo
git fetch https://github.com/matthiasbeyer/thin-edge.io/ feature/add_tedge_api/showcase
git checkout FETCH_HEAD
cargo build -p tedge-cli --features sm,mqtt,c8y,collectd,thin_edge_json
./target/debug/tedge-cli -l info run ./tedge/example-config-showcase.toml
and then you can publish {"type":"Install","package_name":"foo"} on "smrequests"; you'll see a reply on "somerandomtopic" which indicates that the package "foo" was installed.
So far for your questions, now some things I want to add:
Just to state this (possibly repeating myself here, sorry) explicitly: if someone wants to implement a new plugin, they have to:
- Write a struct MyBuilder; (yes, not even members!)
- Write a struct MyPlugin { ...maybe some members... }
- Implement tedge_api::PluginBuilder (with #[async_trait]) on their Builder, with 3 required functions and one optional one
- Implement tedge_api::Plugin (with #[async_trait]) on their Plugin. This trait has no required methods (in the showcase it still has, in our current tedge_api all of those are optional)

And that's a new plugin. Now they copy some lines in tedge/src/main.rs to include their plugin in the application and they're done.
Depending on what they want their plugin to do, they have to do one of the following things (or both):
- Implement the Plugin::start() function.
- Implement tedge_api::Handle<T> on their Plugin, where T is the type of the message the plugin should be able to receive.

That's all. And there can be multiple Handle<T> implementations if the plugin is able to receive messages of different types.
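Pulling these steps together, a plugin skeleton boils down to roughly the following; note that the traits here are simplified stand-ins with illustrative signatures, not the actual tedge_api items, and the PluginBuilder impl is omitted:

use async_trait::async_trait;

// Simplified stand-ins for the real tedge_api traits; the actual
// signatures differ (see the steps described above).
#[async_trait]
pub trait Plugin: Send {
    // Approximation of "no required methods": a default no-op start.
    async fn start(&mut self) -> Result<(), String> {
        Ok(())
    }
}

#[async_trait]
pub trait Handle<M: Send + 'static>: Plugin {
    async fn handle_message(&mut self, message: M) -> Result<(), String>;
}

// Step 1: a builder type, no members needed (its PluginBuilder impl is omitted here).
pub struct MyPluginBuilder;

// Step 2: the plugin itself, holding whatever state it needs.
pub struct MyPlugin {
    received: usize,
}

// Two message types this plugin wants to receive.
pub struct Ping;
pub struct RawMeasurement {
    pub value: f64,
}

// Implement Plugin; active work could go into start().
#[async_trait]
impl Plugin for MyPlugin {
    async fn start(&mut self) -> Result<(), String> {
        // Nothing to do for this example plugin.
        Ok(())
    }
}

// One Handle<T> impl per message type the plugin can receive.
#[async_trait]
impl Handle<Ping> for MyPlugin {
    async fn handle_message(&mut self, _message: Ping) -> Result<(), String> {
        self.received += 1;
        Ok(())
    }
}

#[async_trait]
impl Handle<RawMeasurement> for MyPlugin {
    async fn handle_message(&mut self, message: RawMeasurement) -> Result<(), String> {
        self.received += 1;
        println!("got measurement: {}", message.value);
        Ok(())
    }
}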
But that's literally all a plugin author has to do. I think this plays exactly into the ideas you wrote down with
Implementing a component might be more involved but should ideally focus on the feature logic. One of the goals of the plugin API is to ease the interaction of independent streams of events and requests acting on some state. Hence, interaction concerns (e.g select!) should be addressed by the runtime, not the plugins. [...]
as it boils down to three tasks that the plugin author has to do (from a high level):
And then they can start implementing their business logic.
What matters though is that the runtime can be improved without having to rewrite all the plugins.
Yes, this is definitely a valid point. Though stability guarantees should be worked out in a separate issue, because it is a rather complex topic!
I'd just like to note that since we've worked quite a bit (four months of two persons full-time by now) on the API design and its implementation, we are certain that we have reached a decently stable design so far. Some details are still in flux, but nothing that is of major concern right now.
Still, if we need to change the internal communication API while the project is still in a 0.x.y state, I don't think that's a major concern! As soon as we're in a 1.x.y state, breaking the communication API is of course not allowed anymore. Even more reason to spend extra time on a decent approach!
As written above in some answer to your questions, 1:1, 1:N and N:1 style messaging is in the POC already!
1:1 messaging is of course easy, but 1:N is also simple: if a plugin wants to send to multiple other plugins, all it needs to do is save the addresses of these other plugins and send to all of these addresses. That's nothing more than for addr in addresses { addr.send(message) }-style programming, of course - as one would expect.
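A hedged sketch of that 1:N pattern (the Address type here is an illustrative stand-in, not the actual tedge_api address type):

use tokio::sync::mpsc;

// Illustrative stand-in for a plugin address: a clonable channel sender.
#[derive(Clone)]
pub struct Address<M>(pub mpsc::Sender<M>);

// 1:N messaging: the sending plugin keeps the addresses of all interested
// plugins and sends the message to each of them.
pub async fn fan_out<M: Clone>(addresses: &[Address<M>], message: M) {
    for addr in addresses {
        // Ignore send errors (receiver dropped) for brevity.
        let _ = addr.0.send(message.clone()).await;
    }
}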
For N:1-style messaging: As all message handlers are called asynchronously and concurrently, having multiple plugins send to one plugin is already included in the base of the architecture.
NB: We're currently evaluating and defining our response to your proposal of how such an interface could look.
Goal of this review:
Thin-Edge.io has the opportunity right now to pivot in a different direction. To make sure that this new direction fits the goals of the intended users, we have initiated a 'call for proposals' between both SoftwareAG and IFM (who are the main contributors right now). I believe that you've made your points clear in the OP of this issue, but I think some points are worth re-iterating:
These two points can be considered equivalent.
With that said, let's dive in.
didier-wenzek:rfc/tedge_api implements a custom 'actor' library composed of these different parts:
- An Actor trait with 5 associated types, a sync constructor as well as an async start method.
- The Actor cannot be used behind a dyn pointer. Is this intentional? If yes, how would one solve the issue of having dynamic instances of this actor? (Read, more than one, defined at runtime?)
- Mailboxes seem in an odd place, as the Reactor is supposed to handle them: either messages need to carry state or the Reactor somehow needs to keep track of what messages it receives from whom (with no currently visible mechanism to do so).
- Each Producer and Reactor is its own future and gets spawned individually; how will one share state with the other?
- The Reactor takes a &mut on its react method; this means that a single type cannot both be Producer and Reactor due to mutability in an async context. (One could imagine that RwLock<T> might implement both for T, but then they cannot run concurrently and must be serialized somehow.) This means that a complicated actor would need to be split up between multiple Producer/Reactor types.
- Cancellation overall was not addressed, but it is quite important: how would a user (read, developer) in this pattern be sure that they have a chance of shutting down gracefully? Actor::start might do the trick; however no central 'we are shutting down' method exists.
- Actor and Reactor are in a 1-1 relationship. This means that actors can only ever have 'one' logical dataflow. The resulting pattern thus forces one to use an enum to handle multiple potential incoming and outgoing messages. However, no clear relationship between the different message types in the enum exists (and none that is enforceable by the compiler).
- Each Actor 'does one thing'. However, this also means that the number of edges (i.e. connections) between Actors, as well as the number of Actors, increases, which the user would have to somehow configure. Either explicitly, which increases the cognitive load, or implicitly, which also increases the cognitive load but can also be unintuitive in error cases.

I think it's hard to formulate a conclusion. There are definitely some things that can be taken away with regard to simplicity, but it does leave open several questions that are IMO fundamental.
I think due to the fact that it is so 'simple', it is not conclusive on its suitability.
I would love to hear from you @didier-wenzek what you think is relevant in this POC that we might have missed, and why you would favor this over an already (mostly) complete implementation.
@TheNeikos I will be as direct as you are, starting with the controversial aspects.
I think due to the fact that it is so 'simple', it is not conclusive on its suitability.
On my side, I think that, due to the fact that tedge_api is so out of control for the original team and so disconnected from what has already been done, the associated POC is not conclusive on the ability to have a migration plan.
However, I still hope that we can reach a point of agreement.
I would love to hear from you @didier-wenzek what you think is relevant in this POC that we might have missed, and why you would favor this over an already (mostly) complete implementation.
I think you missed two points:
This is why I started to work on my own proposal.
It is not based on actix or tedge_api.
Why can't the plugins be instantiated and connected from the main.rs and not dynamically from a runtime config? Sure, dynamic instantiation might be useful, but not at any cost. And my intuition is that most of the complexity of tedge_api comes from this requirement. This is also why you rejected the idea of using actix.
On the tedge_actor, you wrote that there are definitely some things that can be taken away with regard to simplicity, but it does leave open several questions that are IMO fundamental.
The same holds for me with the tedge_api, despite many questions and answers on the PR:
- What does it mean that "each message is handled in its own execution context"? What is the context of a message?
- What is the purpose of the Sized constraint on the Actor type?

Finally, a POC has been implemented using a different strategy. Instead of gluing together various partially-implemented actors, we implemented a small number of functional actors, with the aim of going deeper in exploring the concrete issues and the ways to solve them.
There are several work-in-progress proposals to rebuild thin-edge on new foundations using plugins, among them the tedge API.
The key criterion to commit to one of these proposals, or a combination of them, is to assess that one can smoothly rebuild the current features of thin-edge, progressively moving parts into the new design.
To have concrete criteria to compare the proposals, we will build a POC for each. The point is to demo:
The focus is on the internal plugin API:
Here are the proposed components for the POC (retained because they are representative of what we have today):
- The C8Y plugin implements all the Cumulocity specific features.
- The Collectd plugin ingests data produced by collectd.
- The ThinEdgeJson plugin ingests telemetry data published by external thin-edge MQTT components.
- The SM plugin manages software management requests.
- The Apt and Apama plugins are two components with the same interface to "interact with a package manager" but with different targets.

A key goal of the new design is to be able to connect components that have been implemented independently while using statically typed messages. This can be achieved using light dependencies around message type definitions, with a crate per plugin:
- The mqtt plugin crate defines the MqttMessage type.
- The sm plugin crate defines the SMRequest and SMResponse types.
- The Apt and Apama plugins depend on the SM plugin using the SMRequest and SMResponse types.
- The Measurement type must be defined somewhere. I see that as an open question. I propose here to have a plugin_telemetry crate just to define this type.
- The Collectd and ThinEdgeJson crates depend both on the mqtt and telemetry crates, but nothing else.
- The C8Y plugin depends only on the telemetry and the sm plugins. It is unaware of apt and apama or any software manager that can be implemented independently. It is also unaware of any measurement sources.

One of the main benefits of this proposal to move toward plugins is to clarify the dependencies. Here is the nice expected result from this POC.
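For illustration, here is a hedged sketch of what such light message-type crates could contain; only the type names come from the list above, while the field and crate layouts are hypothetical:

// In the mqtt plugin crate: the message type it exposes to others.
pub struct MqttMessage {
    pub topic: String,
    pub payload: Vec<u8>,
}

// In the sm plugin crate: the request/response types used by the
// apt, apama and c8y plugins.
pub enum SMRequest {
    List,
    Install { package_name: String },
}

pub enum SMResponse {
    SoftwareList { packages: Vec<String> },
    InstallSucceeded { package_name: String },
}

// In the plugin_telemetry crate: just the shared Measurement type.
pub struct Measurement {
    pub name: String,
    pub value: f64,
}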