New StdIO strategy - Githubissues

fredilarsen commented 5 years ago

I have started implementing this strategy. I am opening an issue for it here to have a place to describe what I plan, to exchange opinions and see what other users think, and then to revise the design if smart ideas pop up.

Functional requirements:

Allow two processes to communicate using stdin+stdout in a simple manner using PJON with this strategy in both processes.
Allow one process in any language, not using PJON, to communicate using stdin+stdout with a process running PJON with this strategy.
Support a set of interchangeable "dialects" to encapsulate the packets to be sent and received. One dialect can be simply writing "START_PACKET <to id> <from id> <payload>STOP_PACKET", easy to implement in any language. Others can include CRC and more robust and complicated methods (COBS/SFSP) that still can be implemented in any language. Then one version can use native PJON binary packet format.
Supply a set of small send+receive functions for multiple languages, including Java, Python, C, C++ and so on, to have a set of tested functions and reduce the amount of code (and bugs) users have to write.
Have one "child process" version of the strategy where a command line is specified to the strategy. The strategy will then start the specified program as a subprocess and redirect its stdin+stdout so that the process that is started can simply use stdin+stdout, while the strategy restarts the task if it stops. This allows a non-PJON program in any language to be used by a PJON program, for example to communicate on another protocol.

Preliminary classes:

ThroughStdioBase (base class used by all the others to make them interchangeable)
ThroughStdioCOBS
ThroughStdioPJONNative
ThroughStdioStartStop
ThroughStdioSubProcess

Use cases:

Terminology:

PROGRAM is the command line for starting a non-PJON program in any language, using stdin+stdout commands. This program can be a Java program being an MQTT client, a Python program being a gRPC client, or whatever you like.
SWITCH is the command line for starting a PJON program routing between the StdIO and LocalUDP strategies, connected to PJON devices on the LocalUDP side (LocalUDP can of course be replaced with any strategy available on the given OS).
SWITCH_SUBPROCESS is the command line for starting a program like the switch but running the subprocess version of the StdIO strategy

Started from command line:

Unidirectional transfer to PJON devices: "PROGRAM | SWITCH"

Unidirectional transfer from PJON devices: "SWITCH | PROGRAM"

Bidirectional transfer on Linux (somewhat complex; there are alternatives like socat, coproc, dpipe, etc.): "mkfifo fifo0 fifo1", then "PROGRAM > fifo0 < fifo1 &", then "SWITCH < fifo0 > fifo1"

or using socat: "socat EXEC:PROGRAM EXEC:SWITCH"

Bidirectional using two unidirectional PJON switches: "SWITCH1 | PROGRAM | SWITCH2"

Bidirectional letting the PJON switch run a subprocess: "SWITCH_SUBPROCESS PROGRAM"
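
To make the subprocess use case more concrete, here is a rough Python sketch of the behaviour described for the "child process" version: start PROGRAM with its stdin+stdout redirected to pipes, and start it again if it has stopped. The class name SupervisedChild, its methods and the ECHO_UPPER_CHILD stand-in for PROGRAM are all hypothetical and only illustrate the idea; this is not the planned ThroughStdioSubProcess code.

```python
import subprocess
import sys
from typing import List

# Hypothetical stand-in for PROGRAM: echoes every input line in upper case.
ECHO_UPPER_CHILD = [
    sys.executable, "-c",
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n",
]


class SupervisedChild:
    """Runs a command with piped stdin+stdout and restarts it if it has exited."""

    def __init__(self, command: List[str]):
        self.command = command
        self.restarts = 0
        self.proc = self._spawn()

    def _spawn(self) -> subprocess.Popen:
        # Redirect the child's stdin and stdout to pipes owned by this process.
        return subprocess.Popen(
            self.command, stdin=subprocess.PIPE, stdout=subprocess.PIPE
        )

    def ensure_running(self) -> None:
        # poll() returns None while the child is alive, its exit code otherwise.
        if self.proc.poll() is not None:
            self.proc.stdin.close()
            self.proc.stdout.close()
            self.proc = self._spawn()
            self.restarts += 1

    def send_line(self, line: bytes) -> None:
        self.ensure_running()
        self.proc.stdin.write(line + b"\n")
        self.proc.stdin.flush()

    def read_line(self) -> bytes:
        # Blocks until the child has written a full line (or closed its stdout).
        return self.proc.stdout.readline()

    def close(self) -> None:
        self.proc.stdin.close()  # EOF on stdin lets a well-behaved child exit
        try:
            self.proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            self.proc.kill()
            self.proc.wait()
        self.proc.stdout.close()
```

A real strategy would also need non-blocking reads and some policy for restart storms, which this sketch leaves out.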

Please add your thoughts. You can still influence the design.

gioblu commented 5 years ago

Ciao @fredilarsen, thank you very much for your support, and compliments: the stdIO strategy is extremely useful. I took a little time to think about point 3:

Start and stop may, I think, be flawed if occurrences of the start or stop symbols in the data are not handled.
Not sure how the strategy could be PJON native, since PJON is layer 3 and does not handle frame separation, which is implemented in the strategies.
Maybe CRC32 could be enabled optionally.

fredilarsen commented 5 years ago

In the simplistic start/stop packet example, "START_PACKET <to id> <from id> <payload>STOP_PACKET", I did include both receiver and sender PJON ids, which would allow the process to decide its own id, or even act as multiple devices or be a router to a collection of other ids on another type of bus/protocol. The StdIO strategy could act as if connected to a bus and not a single device-as-a-process, allowing a switch to detect which ids are present and not present through it.

We can discuss whether to implement ThroughStdioStartStop at all, but the beauty of it is the minimal effort needed to use it from any language. The START_PACKET/STOP_PACKET magic words could be changed to something more unlikely to appear in data (like PJON_PACKET_START), and it should be prohibited to use these in data to be sent. The length of the data could be added near the start keyword to provide an elementary integrity check which I think would be sufficient. Transfer through anonymous pipes / local sockets is not likely to get bit errors.
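
A minimal Python sketch of the start/stop dialect with the length added, to show how little code it needs. The exact layout (space-separated decimal ids and length after the start keyword) and the function names are assumptions made for this illustration; nothing here is a settled wire format.

```python
START = b"PJON_PACKET_START"  # magic words as suggested above, not final
STOP = b"PJON_PACKET_STOP"


def encode_frame(to_id: int, from_id: int, payload: bytes) -> bytes:
    """Builds 'START <to id> <from id> <length> <payload>STOP'."""
    if START in payload or STOP in payload:
        # The magic words are prohibited inside the data to be sent.
        raise ValueError("payload must not contain the magic words")
    header = b"%s %d %d %d " % (START, to_id, from_id, len(payload))
    return header + payload + STOP


def decode_frame(frame: bytes):
    """Returns (to_id, from_id, payload) or raises ValueError."""
    if not frame.startswith(START + b" ") or not frame.endswith(STOP):
        raise ValueError("missing start or stop keyword")
    body = frame[len(START) + 1 : len(frame) - len(STOP)]
    # Only the first three spaces are separators; the payload may contain spaces.
    to_id, from_id, length, payload = body.split(b" ", 3)
    if int(length) != len(payload):
        # Elementary integrity check provided by the length field.
        raise ValueError("length field does not match payload size")
    return int(to_id), int(from_id), payload
```

For example, encode_frame(44, 45, b"hello world") gives b"PJON_PACKET_START 44 45 11 hello worldPJON_PACKET_STOP", and a frame that lost its tail or has a wrong length is rejected by decode_frame.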

Anyhow I agree about the added robustness of COBS, so having COBS-based send/receive functions for a lot of languages would probably remove the need for the start/stop based strategy.

The problem with CRC32 is whether the exact same number can be calculated in any language without having to spend time porting and verifying the algorithm repeatedly. This, combined with the minimal chance of bit errors on anonymous pipe transfer, makes me think it should be avoided, except in the ThroughStdioPJONNative strategy, which could let two PJON processes talk together if there is a need for this. PJON being single-threaded could perhaps gain parallelism by having one process utilize the pipe buffers and multiple PJON processes to increase routing bandwidth.
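
To illustrate the porting concern, here is a bit-by-bit CRC-32 in Python that uses only shifts and XORs, so it translates almost line by line to other languages. It implements the common IEEE 802.3 / zlib variant (reflected polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF); whether that is the variant PJON would need is an assumption here and would have to be checked against the PJON sources.

```python
import zlib


def crc32_bitwise(data: bytes) -> int:
    """CRC-32 (IEEE 802.3 / zlib variant), computed one bit at a time."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


# The standard check string makes it easy to verify a port in any language.
CHECK_VALUE = crc32_bitwise(b"123456789")
MATCHES_ZLIB = CHECK_VALUE == zlib.crc32(b"123456789")
```

The check string "123456789" yields 0xCBF43926 for this variant, which gives every port a one-line self-test.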

A preliminary decision could be to start with COBS without CRC support, in ThroughStdioCOBS first and ThroughStdioSubProcess later (process start+stop detection+restart+pipe redirection takes a little work to get robust).
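
For reference, a sketch of plain COBS without CRC in Python, as a candidate for the small per-language send+receive functions. The function names and the choice of a single trailing 0x00 as frame delimiter are assumptions for this illustration, not the ThroughStdioCOBS implementation.

```python
def cobs_encode(data: bytes) -> bytes:
    """Encodes data so that the result contains no 0x00 byte."""
    out = bytearray()
    block = bytearray()
    need_final = True  # False only right after a full 254-byte block
    for byte in data:
        if byte == 0:
            # Close the current block; the code byte stands in for the zero.
            out.append(len(block) + 1)
            out += block
            block.clear()
            need_final = True
        else:
            block.append(byte)
            need_final = True
            if len(block) == 254:
                # Full block: code 0xFF means "254 bytes, no implied zero".
                out.append(0xFF)
                out += block
                block.clear()
                need_final = False
    if need_final:
        out.append(len(block) + 1)
        out += block
    return bytes(out)


def cobs_decode(encoded: bytes) -> bytes:
    """Reverses cobs_encode; raises ValueError on a malformed frame."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        code = encoded[i]
        block = encoded[i + 1 : i + code]
        if code == 0 or len(block) != code - 1 or 0 in block:
            raise ValueError("truncated or corrupt COBS data")
        out += block
        i += code
        # Every block except a full one (0xFF) or the last one implies a zero.
        if code != 0xFF and i < len(encoded):
            out.append(0)
    return bytes(out)


def frame_packet(payload: bytes) -> bytes:
    """COBS-encodes the payload and appends the 0x00 frame delimiter."""
    return cobs_encode(payload) + b"\x00"
```

Because the encoded body never contains 0x00, a receiver can resynchronise after any error simply by skipping to the next zero byte.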

gioblu commented 5 years ago

Ciao @fredilarsen, about SFSP: will stdIO handle, on average, bigger packets than other strategies do in simpler embedded systems? If so, since segmentation is still not provided, SFSP may really be handier than COBS, as it does not impose a limit of 255 - (stdIO + PJON overhead) bytes per packet.

I agree that CRC is probably overkill.

About start and stop: adding the length is certainly a way to be completely safe when sending arbitrary binary data, as ETCP and TS already do.

Girgitt commented 5 years ago

This strategy sounds like a substitute for PJON-piper, which is a simple stdin/stdout proxy to PJON compiled with the ThroughSerial strategy. BTW, it has been working without a glitch for about a year in an embedded system as a subprocess maintained by PJON-python, linked in an upper layer with pub/sub over Redis. PJON-piper has some primitive examples of an extremely simple two-threaded implementation (one thread dedicated to listening on the bus) that works on Windows and Raspbian.

gioblu commented 5 years ago

Ciao @fredilarsen this link is extremely interesting: https://ocaml.github.io/ocamlunix/pipes.html

In section 5.6 Input/output multiplexing:

Two repeaters may try to write a message at the same time and the Unix kernel does not guarantee the atomicity of writes, i.e. that they are performed in a single uninterruptible operation. Thus the kernel may choose to write only a part of a message from a repeater to /dev/ttya, then write a full message from another repeater and finally write the remaining part of the first message. This will utterly confuse the demultiplexer on the client: it will interpret the second message as part of the data of the first and then interpret the rest of the data as a new message header.

If there is any interest in supporting multiplexing, this may be taken into consideration when choosing the encoding or designing the strategy. If the corruption described above happens, both ThroughStdioPJONNative (no framing at all, if I understood correctly) and ThroughStdioStartStop might be unable to detect the error and would pass on a broken frame. SFSP and COBS should be able to detect the error and abort.

It also describes how to solve it the other way round, by implementing access arbitration, really cool. Obviously this is not meant in any way as criticism of code that is not yet available, just as a pointer to a limitation I would never have expected to be there.

fredilarsen commented 5 years ago

In the discussed strategy there would be one pipe in each direction between two processes, and such mixing of messages will not be possible unless multiple threads are allowed to write from one or both of the processes. The atomicity of message transfer will not be guaranteed. This is the way ETCP is implemented today: it reads a variable number of bytes repeatedly until a complete message has been received or the connection is closed, which gives an error.
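
The read loop described above can be sketched in Python as follows. The 2-byte big-endian length header is an assumption chosen for the illustration and is not ETCP's actual wire format; ChunkedStream is a hypothetical test double that returns only a few bytes per read, as a pipe is allowed to do.

```python
import io
import struct


def read_exact(stream, count: int) -> bytes:
    """Keeps reading until exactly `count` bytes have arrived."""
    buf = bytearray()
    while len(buf) < count:
        chunk = stream.read(count - len(buf))
        if not chunk:
            # An empty read means the other end closed the pipe mid-message.
            raise EOFError("connection closed before message was complete")
        buf += chunk
    return bytes(buf)


def pack_message(payload: bytes) -> bytes:
    """Prefixes the payload with its length as 2 bytes, big-endian."""
    return struct.pack(">H", len(payload)) + payload


def read_message(stream) -> bytes:
    """Reads one length-prefixed message, however the bytes are chunked."""
    (length,) = struct.unpack(">H", read_exact(stream, 2))
    return read_exact(stream, length)


class ChunkedStream:
    """Test double: hands out at most `step` bytes per read call."""

    def __init__(self, data: bytes, step: int = 3):
        self._inner = io.BytesIO(data)
        self._step = step

    def read(self, count: int) -> bytes:
        return self._inner.read(min(count, self._step))
```

With one writer per pipe, the receiver does not need atomic writes: it reassembles each message from however many partial reads it takes, and a closed pipe shows up as an EOFError.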

I think the simple START+STOP strategy would actually work sufficiently well, as the risk of data corruption errors in a pipe transfer is extremely small. The highest risk is of the connection being broken, and this would be detected by a missing STOP.

Not saying that the extra safety of SFSP or COBS is wasted, but that the START+STOP could actually do the job with extremely simple encoding/decoding requirements.

gioblu commented 5 years ago

Ciao @fredilarsen, I totally agree with you: if the point-to-point constraint is applied, there is no need to use complex encoding, and start and stop is for sure enough.

gioblu / PJON

New StdIO strategy #249