bilderbuchi opened 1 year ago
This is a high-level issue; I expect we will farm out details to separate issues later.
A few comments regarding zmq's working:
frames
parameter is just a list of bytes objects (in Python terms).
Consequences for us:
SET prop1 15.1
) in different frames.
Some more ideas regarding content:
b"[['GET', ['property1', 'another property']], ['SET', {'property1': 7}], ['GET', ['property1']]]"
. In this example I request the values of two properties, I set one of these properties to a specific value and I request the property again.
Ideas regarding the command type:
"0x02"
) defeats the purpose of a smaller footprint. In that case, I'd prefer few-letter abbreviations (enums are helpful) like "S", "G"...
Regarding logging:
- yaq uses Apache Avro RPC to serialize data. That could be another possibility.
I read a bit on the subway today, it really looks like a good option; this would also take some of the handshaking, capability listing, reply tracking, message verification (with schema), RPC burden from us. zmq would probably mostly be the transport layer (so one frame per avro message; plus maybe the topic frame). JSON or binary option. Implementations for several languages. Specification here.
Yaq notes on why/how they use avro: https://yeps.yaq.fyi/107/
Here is another description of some message format (of a protocol a colleague recommended, but probably not that good a fit for our use case), could inform us regarding the message structure: https://en.wikipedia.org/wiki/Constrained_Application_Protocol#Message_formats
To answer one question here:
H1: How does a coordinator get the message if the recipient is another Node? Do we also need to give the Coordinator's ID?
In zmq, at least two endpoints of communication need to be defined in a fixed way, i.e. you define a certain communication channel, similar to TCP/IP (on which it builds), which is an IP address and a port, i.e. a socket. In that sense, for our case, all the non-Coordinator nodes talk to the Coordinator in a more or less hardcoded way: their messages always go to/through the Coordinator by default, so we do not need to give a Coordinator's ID in this message. The matter changes (slightly) once we need/want to route messages through a number of Coordinators. Then the Coordinators might need to start adding more "route-information" frames/data to get a reply back through the chain properly, but this is currently not our concern, I think. Currently, every Actor needs to be initialized with the socket info for the Coordinator, which can be at localhost or on a remote IP.
Here the "names" come into play:
With that protocol, the header does not change and each hop just knows how to reach the next hop.
Well, maybe we want to make a separate issue out of that, since it pops up again and again xD I have opened #22 to that end, and would guess it would be better to focus here on the message format.
Regarding the format, I guess we should first decide how precisely we want to control what happens ourselves. If we use PUB-SUB (with XPUB and XSUB sockets on the proxy) for a certain channel, the makeup of the frames which we control ourselves will be different than if we do it purely with ROUTER and DEALER sockets. The PUB-SUB sockets cannot be connected to ROUTER/DEALER sockets (without errors or undocumented/unpredictable results). So we have two options:
Currently I am not sure whether zmq will allow us to put more arbitrary frames in this type of socket channel.
zmq only considers the first frame for "topic filtering". The rest of the message is just a simple message consisting of 0 or more frames.
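To make the filtering rule concrete, here is a small pure-Python sketch of how I understand it: a subscription is a byte prefix that is compared against the first frame only, and an empty subscription accepts everything. The function name `would_deliver` and the topic strings are made up for illustration; this only mimics the rule and does not use zmq itself.

```python
def would_deliver(subscriptions, frames):
    """Mimic SUB-side topic filtering: prefix match on the first frame only."""
    if not frames:
        return False
    topic = frames[0]
    # An empty prefix (b"") matches every message.
    return any(topic.startswith(prefix) for prefix in subscriptions)


# Hypothetical data message: topic frame followed by arbitrary payload frames.
message = [b"data.temperature", b"21.5", b"K"]
print(would_deliver([b"data."], message))  # frames after the first are never inspected
```

So whatever we put into the later frames does not influence who receives the message.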
Ok, then we can actually make a common Header for both data and control protocols:
This will keep a bit of overhead in the data channel (recipients and conversation ID are unnecessary here), but we can keep it the same across the different channels. The same is true for the logging channel (if we now separate it with its own PUB-SUB proxy): recipients and reply-reference are unnecessary. In general, having a message ID could be beneficial for debugging purposes, I think. If we can include the timestamp in a UUID directly (and it can be extracted sensibly), this would make things easy too. Alternatively, we can also just insert a timestamp frame after the message ID for good measure.
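Regarding the timestamp inside the message ID: a rough sketch of how that could work with a UUID version 7, which puts a 48-bit Unix timestamp in milliseconds into the leading bits. The helper names `make_uuid7` and `uuid7_timestamp_ms` are hypothetical, and the bit layout is built by hand here, so treat this as an illustration rather than a vetted implementation.

```python
import os
import time
import uuid


def make_uuid7(unix_ms=None):
    """Build a UUID v7: 48-bit ms timestamp, version, 12 random bits, variant, 62 random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")
    rand_a = (rand >> 62) & 0xFFF          # 12 random bits
    rand_b = rand & ((1 << 62) - 1)        # 62 random bits
    value = (unix_ms & ((1 << 48) - 1)) << 80
    value |= 0x7 << 76                     # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                    # RFC variant
    value |= rand_b
    return uuid.UUID(int=value)


def uuid7_timestamp_ms(message_id):
    """Recover the millisecond timestamp from the top 48 bits."""
    return message_id.int >> 80


message_id = make_uuid7()
print(message_id, uuid7_timestamp_ms(message_id))
```

With something like this, a separate timestamp frame would only be needed if we want better than millisecond resolution.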
What I'm not clear on is how the Avro messages fit into the whole thing (e.g. this) -- are the Avro messages completely contained in the zmq payload? Where/how do we distinguish what goes in the zmq header, and what into the Avro message? Do we duplicate information, or is the distinction clear anyway (e.g. zmq header: only message/routing metadata)?
My idea is that we have a routing header (I prefer the first frame), which contains the addresses and message ID etc., and then the payload in all the other frames. As default payload we define some protocol, for example the Apache Avro protocol.
So yes, we have frame 0 for zmq routing stuff and frames 1 to n for the Apache Avro payload.
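As a minimal sketch of that split, assuming a message is handled as a list of bytes objects (which is what pyzmq's `send_multipart` takes): frame 0 is treated as an opaque header and everything after it as payload. The function names and the example byte strings are placeholders of mine, not a proposal for the actual header encoding.

```python
def join_frames(header, payload_frames):
    """Frame 0 carries the routing header, frames 1..n carry the payload."""
    return [header] + list(payload_frames)


def split_frames(frames):
    """Inverse of join_frames; a message without a header frame is rejected."""
    if not frames:
        raise ValueError("message has no header frame")
    return frames[0], frames[1:]


# Placeholder bytes: the real header format and the Avro encoding are still open.
frames = join_frames(b"<routing header>", [b"<avro message>"])
header, payload = split_frames(frames)
print(header, payload)
```

The routing code would then only ever look at `frames[0]` and could pass the rest on untouched.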
My notes/draft on the message format, so far. For alignment/discussion, this is a quick first shot.
Message format
We have different message types:
Control messages
Data messages
Logging messages
TBC more?
[ ] F1: Do we want a plain-text (SCPI-like) protocol or a binary encoded one? E.g. do we have plaintext command verbs like GET or opcodes like 0x02 that are defined in a table somewhere?
Message structure
All messages have the same base structure:
Header
Timestamp
Message ID (possibly UUID v7 including timestamp) #16
1 Sender
0 or more recipients (pub-sub messages don't have a recipient afaict)
0 or 1 reply-reference
Probably payload length (for knowing when the checksum starts) (zmq delivers the whole message in one)
Regarding the formatting of the header, see #33
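For discussion, the header fields listed above could be sketched as a small data class. JSON is used here only as a stand-in serialization (the actual header formatting is the topic of #33), and the class name, field names and example IDs are my assumptions.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Header:
    message_id: str                                       # e.g. a UUID string
    sender: str                                           # exactly 1 sender
    recipients: List[str] = field(default_factory=list)   # 0 or more
    reply_reference: Optional[str] = None                 # 0 or 1
    timestamp: Optional[float] = None                     # optional if the ID carries the time

    def to_frame(self) -> bytes:
        """Serialize the header into a single frame (JSON as a placeholder format)."""
        return json.dumps(asdict(self)).encode("utf-8")

    @classmethod
    def from_frame(cls, frame: bytes) -> "Header":
        return cls(**json.loads(frame.decode("utf-8")))


header = Header(message_id="id-1", sender="actorA", recipients=["actorB"])
print(header.to_frame())
```

A pub-sub message would simply leave `recipients` empty and `reply_reference` unset, which is the overhead we would accept for a common header.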
[ ] H1: How does a coordinator get the message if the recipient is another Node? Do we also need to give the Coordinator's ID?
[ ] H2: Can we manage a common header for data, control and logging messages?
Control Payload
<CMD> [<args>]+
<CMD> is from a command dictionary:
SET
GET
CALL
RESET
ERROR
LIST_PARAMS
LIST_ACTIONS
...
[ ] C1: Should an argument always be a key-value pair, or do we stay with a plain sequence of tokens? That is, is prop1 15.2 prop2 true 2 or 4 args? What about arguments without value?
[ ] C2: Is every token in the message a zeromq "frame"? Can we decide what goes in a frame?
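To have something concrete for C1, here is one possible reading of the plain-text variant: the command verb comes from an enum (the command dictionary), SET treats its arguments as key-value pairs, and every other command keeps a plain token list. This is just one of the options under discussion; the `parse_control` name and the whitespace tokenization are assumptions.

```python
from enum import Enum


class Command(Enum):
    SET = "SET"
    GET = "GET"
    CALL = "CALL"
    RESET = "RESET"
    ERROR = "ERROR"
    LIST_PARAMS = "LIST_PARAMS"
    LIST_ACTIONS = "LIST_ACTIONS"


def parse_control(line):
    """Split '<CMD> [<args>]' into a Command and its arguments."""
    tokens = line.split()
    if not tokens:
        raise ValueError("empty control message")
    command = Command(tokens[0])  # raises ValueError for unknown verbs
    args = tokens[1:]
    if command is Command.SET:
        # Key-value reading: 'prop1 15.2 prop2 true' is 2 args, not 4.
        if len(args) % 2:
            raise ValueError("SET needs key-value pairs")
        return command, dict(zip(args[0::2], args[1::2]))
    return command, args


print(parse_control("SET prop1 15.2 prop2 true"))
```

Values stay strings here; type conversion (15.2 as a float, true as a bool) would be a separate decision.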
Data Payload
TODO, didn't have time to define yet.
Logging Payload
TODO, didn't have time to define yet.
Checksum
TBD, some universally acceptable CRC check, I guess?
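If we do go for a CRC, one option would be CRC-32 from Python's standard `zlib` module, computed over all frames and appended as a final 4-byte frame. This is only a sketch of that idea; the function names and the choice of CRC-32 are assumptions, nothing here is decided.

```python
import zlib


def _crc32_of(frames):
    """Running CRC-32 over all frames, in order."""
    crc = 0
    for frame in frames:
        crc = zlib.crc32(frame, crc)
    return crc.to_bytes(4, "big")


def add_checksum(frames):
    """Append the checksum as an extra, final frame."""
    return list(frames) + [_crc32_of(frames)]


def verify_checksum(frames):
    """Check the final frame against the CRC-32 of all preceding frames."""
    if not frames:
        return False
    return frames[-1] == _crc32_of(frames[:-1])


message = add_checksum([b"header", b"SET prop1 15.2"])
print(verify_checksum(message))
```

Since the checksum sits in its own frame, no payload length field is needed to find where it starts.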