RFC: Add ThingSet communication support framework

ThingSet / thingset-device-library

ThingSet library for resource-constrained devices written in C/C++

https://thingset.io/thingset-device-library/

Apache License 2.0

14 stars 6 forks source link

RFC: Add ThingSet communication support framework #22

Open b0661 opened 3 years ago

b0661 commented 3 years ago

Request for Comment: Add ThingSet communication support framework.

API for communication buffers (taken from Zephyr's network buffers)
API for communication ports
API for minimal system time support
Communication buffer implementation
- Zephyr. Easy - just net_buf is used.
- Native
System time support for Zephyr and Native
Unit tests for communication buffers (Zephyr and Native)

Changes

Split ThingSet private header into multiple to ease inclusion, addition of more specific headers and maintenance
Rename of thingset_xx.c files to ts_core_xx.c to better name file task and prepare for other generic implementation files. Better naming welcome.
All private headers and implementation files start with ts. Public files start with thingset.

Todo:

[x] Add communication buffer implementation for native
[ ] Add communication buffer implementation for FreeRTOS
[x] Add convenience functions to build a message within a communication buffer
[x] Add convenience functions to scan a message from a communication buffer
[ ] Add port implementation for CAN
[ ] Add port implementation for SERIAL
[ ] Add more unit tests
[ ] ...

Comments/ changes welcome.

Signed-off-by: Bobby Noelte b0661n0e17e@gmail.com

martinjaeger commented 3 years ago

@b0661 this looks great.

I'm currently hiking in the alps with very limited internet access and no laptop ;)

Will have a closer look when I'm back in the office around Sept 16th.

b0661 commented 3 years ago

What about renaming the files to ts_core instead of ts_context?

done

But how will a port be started / initialized?

A port will be started on it's first ts_port_open(). This should also do dynamic initialization. If this does not fit - e.g. the port has to be initialized on device startup - an extra ts_port_init() function has to be added to the port API. In zephyr this function may be called by SYS_INIT(). What do you think?

With something similar to DEVICE_DEFINE?

Zephyr's DEVICE_DEFINE has some hidden functionality that can only be provided by Zephyr. I would stay more generic and just provide the ts_port_init() function in the API and not do Zephyr specific macro and gen_xxx.py magic. The ts_port_init() call should be the task of the application developer in general. A Zephyr specific extension using e.g. SYS_INIT may be provided by the framework.

Currently the different interfaces / ports just start their own thread and call ts_process from it. Now with the shared buffers and abstracted interfaces we'd probably want a different approach.

Why? The ports can still start their own threads. The ports' API functions have to provide thread synchronization - but that should be hidden from the application use of these functions.

And would you mind splitting the commit a bit more, e.g. put the ts_port.h part into a dedicated commit so that one can find the introduction of this new concept by looking at the commit history?

done

b0661 commented 3 years ago

@martinjaeger

Currently the different interfaces / ports just start their own thread and call ts_process from it. Now with the shared buffers and abstracted interfaces we'd probably want a different approach.

I think I did not quite understand what you mean by "different approach". I added a new context that now covers ThingSet data and communication info. You may have a look at it and check whether it fits to your approach.

martinjaeger commented 3 years ago

Currently the different interfaces / ports just start their own thread and call ts_process from it. Now with the shared buffers and abstracted interfaces we'd probably want a different approach.

I think I did not quite understand what you mean by "different approach". I added a new context that now covers ThingSet data and communication info. You may have a look at it and check whether it fits to your approach.

Well, I am not exactly sure about the approach either, but I thought that the current one is not feasible anymore.

Current approach:

Application specifies data to be exposed via ThingSet in a data_objects array and passes it to ThingSet object (singleton).
Ports are implemented in the application and have their own buffers. After reception of data in the thread of a port, the data is passed to the ThingSet object (included via extern) and the processed result is sent back to the comms interface.

New approach:

Exposed data objects are still defined in application and referenced in struct ts_context (or ThingSet class in case of C++).
Ports are implemented in the library (at least for Zephyr) and use a shared pool of buffers.

Now, how is a port enabled, initialized and started? I can see below different options (assuming Zephyr environment):

A port is turned on/off via Kconfig. It expects some predefined Devicetree nodes (e.g. chosen 'thingset,serial' = &uart1) to initialize the Zephyr driver. Each port starts its own thread automatically and runs in the background. The only thing the application has to do is passing the data_objects during startup.
A port implementation can still be turned on/off via Kconfig, but it is not initialized automatically. Instead, the Zephyr application creates a port, assigns it to a hardware interface (e.g. serial device) and starts it. As the thread is part of the library implementation, its stack needs to be allocated dynamically, as we can't determine at compile-time which / how many ports will be started from the application.

We might not even need a dedicated thread for each port anymore. The data could be passed to the buffers from ISRs and after a complete package is received, the ThingSet processing is offloaded using the system work queue.

b0661 commented 3 years ago

Now, how is a port enabled, initialized and started? I can see below different options (assuming Zephyr environment):

A port is turned on/off via Kconfig. It expects some predefined Devicetree nodes (e.g. chosen 'thingset,serial' = &uart1) to initialize the Zephyr driver. Each port starts its own thread automatically and runs in the background. The only thing the application has to do is passing the data_objects during startup.

A port implementation can still be turned on/off via Kconfig, but it is not initialized automatically. Instead, the Zephyr application creates a port, assigns it to a hardware interface (e.g. serial device) and starts it. As the thread is part of the library implementation, its stack needs to be allocated dynamically, as we can't determine at compile-time which / how many ports will be started from the application.

I think it is good to distinguish between static (compile time) assignment and run time assignment.

Prefer compile time over runtime assignment as we are working on resource constraint embedded devices

Also the level of OS abstraction has to be defined.

The core part should be OS agnostic. If OS services are needed for the core part they shall be provided by a ThingSet OS abstraction (e.g. thingset_time.h).
The com part shall be split in an OS agnostic part and in an OS specific part. OS services for the OS agnostic part shall be provided by a ThingSet OS abstraction or by POSIX.1-2017. The POSIX OS abstraction is available for a wide variety of OSes.

Taking the above into account I would like a mixup of your scenarios 1 & 2.

There are two basic communication framework objects, a communication context (struct tsc_context) and a communication port (struct ts_port).
The communication context object shall have a dedicated run() function and shall operate in the OS agnostic space. As such it can only use the ThingSet OSAL and POSIX.1-2017 operating system services. The context´s run() function has to be called by the application. The context´s run() function calls the run() functions of all ports that are assigned to the context.
The communication port object shall also have a dedicated run() function. It may operate in the OS agnostic or the OS specific space.
The init() function of the context and the ports operate in the same way as described for the run() functions.
The application statically configures a ports table (array of struct ts_port) of all ports of a context; just as it statically configures the data objects. A struct ts_port holds also a pointer to the data space for the port. The size of this data is defined by the port type specific data structure and configuration variables. It will also hold the port thread stack if there is one needed. Assignment of devices to the port is statically configured in the port´s configuration. An application may do the static configuration with the help of macros, kconfig, devicetree - whatever is needed and available.
```
TS_DO_DATABASE_DEFINE(test_data_objects,
TS_GROUP(ID_INFO, "info", TS_NO_CALLBACK, ID_ROOT),
TS_ITEM_STRING(0x19, "Manufacturer", manufacturer, 0, ID_INFO, TS_ANY_R, 0)
);
```

TS_PORT_TABLE_DEFINE(test_ports_table, TS_PORT("test", &loopback_simple_api, &loopback_simple_a_config, &loopback_simple_a_data), TS_PORT("com", &loopback_simple_api, &loopback_simple_b_config, &loopback_simple_b_data), );

- The application statically configures the context (all configuration is supported by macros and in case of Zephyr by kconfig and maybe device tree).

TSC_CONTEXT_DEFINE(test_tsc, test_data_objects, test_ports_table);


- The application dynamically calls the context's init() function and on success the context run() function. From then on the context operates independently from the application. Data/msg exchange happens by the shared ThingSet data. Maybe we need context transmit() and receive() functions - this is unclear to me.

EDIT: code examples added

b0661 commented 3 years ago

I understand and I'm fine with most parts of the RFC

That's great.

What is the reason behind splitting the subsets from struct ts_data_object into a dedicated struct? The commit message only states what has been done, but not why.

To make the struct ts_data_object romable the mutable part has to be split out. Looking at the usage of data objects I saw only subsets changed during processing and therefor decided to split it out to keep it in ram.

Also I didn't immediately understand what _do means (e.g. in the file name). If we go that route, I'd prefer _obj.

Will change to _obj for "ThingSet Object".

What exactly is the difference between a port and a communication context.

Ports are the communication interfaces for a ThingSet context. Communication context is - as you noted - somehow misleading. It is a ThingSet context extended by communication features (aka. the ports and communication management). That's why the function prefix is tsc_ - ThingSet context 'ts' extended by communication 'c'. In C++ I would have created a derived class.

In my view a ThingSet context extended by communication features may serve several ports. E.g. a CAN port and an interactive shell port for maintenance access.

And why does it need own functions like tsc_txt_statement, which are already implemented in the core?

Sorry this is a left over from an idea. Will be deleted in a coming update.

I started to experiment an interface for functions that directly work on messages represented by a communication buffer. I did not want to change the current core interface and break compatibility (replacing all memory pointer uses with communication buffer usage). Most probably I will now extend the communication buffers interface with primitives to read and store cbor data.

Do you have a preference/ idea of how such an interface should look like?

Could you maybe provide a code snippet similar to examples/basic that explains how one would initialize and use the new ports and communication abstractions from an application?

Will do. The task of an application will be static configuration of the ports and at runtime the start of the ThingSet context communication processing. The static configuration will be a mixture of compile time configuration using kconfig (or a special header file) and the setup of the port objects table, similar to the ThingSet objects array.

General comment:

Keeping compatibility to a large extend dictates some design decisions. Breaking compatibility to a large would allow other designs:

Directly extending struct ts_context by communication features
Using communication buffers even if no communication framework is selected.

martinjaeger commented 3 years ago

What is the reason behind splitting the subsets from struct ts_data_object into a dedicated struct? The commit message only states what has been done, but not why.

To make the _struct ts_dataobject romable the mutable part has to be split out. Looking at the usage of data objects I saw only subsets changed during processing and therefor decided to split it out to keep it in ram.

Ahh, right, makes sense. Lots of macro magic necessary to do it, but in that case it's probably worth it. Do you know if the FOR_EACH stuff works well with other compilers than GCC?

In the past I had the subsets (called pubsub channels back then) implemented as a dedicated array per subset instead of flags with one bit per subset stored together with the data object. However, it made the initial selection of data objects belonging to one subset error-prone, as you had to populate the array manually with the data object IDs.

See 3d2da2a8e7ae30aec98662951367b8c7d3512bd1 and ffaedc60016e2b930c7c2a2550a399b8110c3e5e for details.

Your approach combines both advantages (ROM storage and convenient assignment of subsets), which is nice. Could you please state the reason behind this change more clearly in the commit message? As it's a generic improvement, maybe also remove the "TS COM framework: prepare" in the commit message, so it's easier to find in the history in the future.

What exactly is the difference between a port and a communication context.

Ports are the communication interfaces for a ThingSet context. Communication context is - as you noted - somehow misleading. It is a ThingSet context extended by communication features (aka. the ports and communication management). That's why the function prefix is tsc_ - ThingSet context 'ts' extended by communication 'c'. In C++ I would have created a derived class.

In my view a ThingSet context extended by communication features may serve several ports. E.g. a CAN port and an interactive shell port for maintenance access.

Could this not be achieved by introducing a pointer to the ts_context in each port (currently ignoring potential implications in multi-threaded systems)? This would change the hierarchies: A ThingSet communication context (struct tsc_context) would not anymore hold the database and the list of ports. Instead we would create the ThingSet core context (struct ts_context) and assign it to multiple ports, which access it like a common database.

I started to experiment an interface for functions that directly work on messages represented by a communication buffer. I did not want to change the current core interface and break compatibility (replacing all memory pointer uses with communication buffer usage). Most probably I will now extend the communication buffers interface with primitives to read and store cbor data.

Do you have a preference/ idea of how such an interface should look like?

Yeah, this is a difficult problem if we want to avoid memcpy operations. I'm not sure how difficult it would be to make the JSMN parser work with communication buffers instead of raw byte arrays. And there are also situations where the data more naturally comes as a simple byte array (e.g. configuration values stored in EEPROM).

Maybe we should accept a bit of memcpy overhead for now and address buffer handling in a future PR so that we don't have too many changes at a time?

General comment:

Keeping compatibility to a large extend dictates some design decisions. Breaking compatibility to a large would allow other designs:

Directly extending _struct tscontext by communication features

Using communication buffers even if no communication framework is selected.

I would actually be fine with some breaking changes if it helps to keep interfaces much cleaner. To my knowledge the library is currently only used in Libre Solar firmware and by three other companies / organizations I've been working with. In all cases the libraries are included via git submodules or similar mechanisms which make sure that the module is not updated accidentally.

So far I have not actively promoted the ThingSet library + protocol because we still identified several things to improve while using it in different environments. Now as the specification is becoming more stable and more people are using it, we could think about introducing SemVer for the library. But before we release v1.0 we can still take some more freedom regarding breaking changes in my opinion.

Extending struct ts_context with communication features would not even necessarily break the interface, it might just slightly increase memory footprint e.g. because of an unused pointer to an array of ports, correct?

b0661 commented 3 years ago

Do you know if the FOR_EACH stuff works well with other compilers than GCC?

I know it works in Zephyr. I is still a beast and I currently have problems with my tests working but my application is complaining at compile time. I will try to get rid of it again. I think I will keep the subset info in ROM (as it was before) and copy it over to RAM at context initialization. All processing will be done on the RAM copy. This brings a slight waste of ROM memory, but will most probably allow to get rid of FOR_EACH and complex, error prone, hard to understand, hard to debug macro usage !-)

Could you please state the reason behind this change more clearly in the commit message?

Shure.

Could this not be achieved by introducing a pointer to the ts_context in each port (currently ignoring potential implications in multi-threaded systems)?

Yes, the context could call the ports init() function and provide the context pointer. The port in turn could access the context afterwards.

In contrary to that I think a port's only interface to the context should be ThingSet messages! More concrete message buffers that contain ThingSet messages.

For example one might be tempted to provide direct context access to a complex MQTT gateway port. By this you restrict the MQTT gateway functionality to just this context. Whereas if the gateway works on ThingSet messages, the messages can originate from and go to also other ThingSet contexts. This is a much more versatile interface (and it fits much better to my mesh concept !-) Also the split of responsibility is quite clear - ports do not have to mangle with ThingSet context internals.

A CAN port would just be forwarding messages between the context and the physical CAN interface. I has to do all the low level CAN stuff but does not have to access the context for ThingSet objects.

Extending struct ts_context with communication features would not even necessarily break the interface, it might just slightly increase memory footprint e.g. because of an unused pointer to an array of ports, correct?

struct ts_context works on memory buffers whereas "struct tsc_context" is designed to work on communication buffers. Also "struct tsc_context" is ROMable. The direct extension of struct ts_context would also require to shift core functions from memory buffers to communication buffers. This seemed a too big step for me.

martinjaeger commented 2 years ago

Hi @b0661. I've just realized that you added quite a few more commits here :)

Can you give an overview of the status of your work? Wow, +29k lines of code is a lot to look into!

As you may have guessed from my other PR, I've started working on ThingSet again as well. I'm also working on a Rust implementation for a cloud backend/agent which would also include some routing functions between different high-level protocols like MQTT and WebSocket. Maybe we should have a chat to deep-dive a bit into the ideas and future visions for ThingSet?

b0661 commented 2 years ago

Hi @martinjaeger

Can you give an overview of the status of your work?

Good question - depends on what part of the communication support framework you are looking at:

OSAL Operating system abstraction layer. As I want to use the library in different environments (Native (Linux), Zephyr, ESP-IDF) I had to introduce some abstractions for basic functionality and build process. Principle OSAL is posix. -> Zephyr: complete, tested, working on native-posix board -> Native: complete, tested, working with some hickups at the shell application - terminal shows escape sequences -> ESP-IDF: not started
Supporting libraries:
- TinyCBOR added for CBOR de-/encoding in messages - this is a submodule -> just works
- JSMN Json decoder - adaptation of JSMN with lower memory footprint -> complete, tested, working
- COBS en/decoder - may be used for serial communication, would enable TXT and BIN protocol on serial lines adaptation of nanocobs -> complete, partially tested, working
- JSON encoder - integrated into message class- complete, tested, working
Main building blocks of the framework
- Message class that encapsulates all data and operations (allocation/ protocol parsing/ protocol build up/ encoding/ decoding/ processing hints) on ThingSet messages -> complete, tested, working
- Things database - the database comes in two flavours: as a local database and as a partial mirror of a remote database. local database -> complete, tested, working partial mirror of remote database -> mostly designed, partially implemented
- Thingset context - comes as core context and as com context - encapsulates all ThingSet related message processing incl. access to ThingSet data objects -> complete, tested, working
- Ports that encapsulate access to communication links
- internal: context<->context. application<->context -> complete, tested, working
- external: context<->communication device -> loopback port is working, hardware communication (e.g. SPI) is in preparation
- Application support to link apllications to a context and enable message exchange -> complete, tested, working
Applications
- ThingSet shell -> some commands added, tested, working in Native and Zephyr environment
Examples Adapted to new com framework -> complete, tested, working
Testing
- Test support -> complete, tested, working in Native and Zephyr environment
- Tests -> incomplete, ~60% coverage, depending on area
Documentation
- Doxygen documentation of functions improved a lot -> still not complete
- Guidance, Introductions, ... -> missing

Maybe we should have a chat to deep-dive a bit into the ideas and future visions for ThingSet?

Good Idea. Just send an email what channel you prefer.

martinjaeger commented 2 years ago

@b0661 That's really impressive! Thanks for the summary.

I've sent you an email to schedule a call. If there is anyone else reading this who would like to join as well, let me know.

jalinei commented 2 years ago

@b0661 That's really impressive! Thanks for the summary.

I've sent you an email to schedule a call. If there is anyone else reading this who would like to join as well, let me know.

Hey, I was lurking around and found this message, I'd love to join, mostly for listening and trying to understand what's going on.

martinjaeger commented 2 years ago

@b0661 That's really impressive! Thanks for the summary. I've sent you an email to schedule a call. If there is anyone else reading this who would like to join as well, let me know.

Hey, I was lurking around and found this message, I'd love to join, mostly for listening and trying to understand what's going on.

We've had the call already last week. But great to hear you are also interested in ThingSet. How are you using it or planning to use it?

jalinei commented 2 years ago

Too bad ! Next time :) We're using it on Zephyr RTOS, for digital power converters applications. So far together with a CAN peripheral, but also through serial. We will explore how we could define a suitable protocol for this specific application. The idea is also to envision deploying TS on a TI micro-controller with a different power module but we're not sure how to do that yet as it doesn't have Zephyr onboard.

b0661 commented 2 years ago

The idea is also to envision deploying TS on a TI micro-controller with a different power module but we're not sure how to do that yet as it doesn't have Zephyr onboard.

@Light411, this branch introduces an operating system abstraction layer (OSAL) for TS. You may have a look at this to adapt to e.g. FreeRTOS or TI-RTOS. FreeRTOS (the ESP32 variant) is on my todo list. OSAL already suppors native Linux (native/thingset/..) and Zephyr (zephyr/thingset/..).