Signal packaging?
When one needs to configure a ProcessHost, or to transfer dozens of signals from peer A to peer B for any other reason, one faces the problem of signal routing overload. This can slow down the whole signal routing process and, if run in async mode, crowd out other transactions.
In that case, one can introduce signal packages: a series of signals from a single A to a single B, transferred as a whole.
There are two ways of packaging:
Serializing and deserializing packages inside nodes. The nodes are then responsible for understanding packages.
In fact, this could be a built-in feature of the Node base class, so that the base class deserializes a package itself.
Since there are cases where this is unwanted, it would be more flexible to allow nodes to choose whether or not to implement that feature.
At the moment, the Node class doesn't implement any exec() routines.
[ ] The Node can be written so that it deserializes input data automatically when its emit() function is invoked by the host. If developers want to process these signals in their own way, they can override the emit() function.
Makes deserialization transparent for the receiver (unless one needs to intervene in this process, which is also quite easy).
But it is not yet clear how to make serialization as transparent as deserialization would be.
[ ] One can add special functions to the Node base class and let end users (developers) decide whether or not to call them in their own Node implementations.
This may sound appealing, but it also overcomplicates the Node base class and the development of user-defined Node subclasses.
[ ] On the other hand, one can add deserialization tools to the signal itself and let developers invoke deserialization on a special SerializedSignal class.
Developers are already expected to parse input signals on their own (receiving the buffer and traversing it is already handled manually).
But if a user identifies a signal as a SerializedSignal (as mentioned, traversing input signals via "subclass tests" is already standard practice), they can call its deserialize() method and push the result to the end of the buffer.
Adding to the end of the buffer violates the signal income order. This may or may not cause trouble. Alternatively, one can process serialized signals "here and now", e.g. recursively.
This forces developers to do the same extra work in the exec() method of every Node inheritor.
[ ] Check: will these functions significantly enlarge the signals? (Since the number of signals will soon get really large, this may become an important problem.)
[ ] On the other hand, one can automate the process: developers who want to avoid the extra work can use NodeSerial (or something like that) as the base class for their nodes instead of plain Node. This "intermediate" class guarantees that all standard serialized signals are deserialized and passed back to the buffer in the expected order (see the sketch after this list). The best place to do this is the emit() function, which avoids resource-intensive "re-buffering", though it will slow down the ProcessHost (since emit() is executed there).
Deserialization is then transparent, but serialization isn't. One could implement signal serialization as the dual method inside the NodeSerial class, for instance by accepting a *msg argument instead of a single signal. Details are discussed below (next comment).
If developers want to process these signals on their own, they can use plain Node as the base class (or NodeNonSerial, another inheritor that guarantees the absence of deserialization mechanisms).
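To make the NodeSerial idea concrete, here is a minimal Python sketch. All names (Signal, SerializedSignal, Node, emit(), deserialize()) follow the discussion above but are assumptions, not the actual codebase: unpacking happens inside emit(), so a node opts in simply by inheriting from NodeSerial.

```python
import pickle


class Signal:
    """Hypothetical base signal: a payload routed from node A to node B."""
    def __init__(self, payload):
        self.payload = payload


class SerializedSignal(Signal):
    """A package: several signals from a single sender to a single receiver."""
    @classmethod
    def pack(cls, signals):
        # Store the payloads of the packed signals as one serialized blob.
        return cls(pickle.dumps([s.payload for s in signals]))

    def deserialize(self):
        # Unpack back into ordinary signals, preserving the income order.
        return [Signal(p) for p in pickle.loads(self.payload)]


class Node:
    """Hypothetical base node: the host pushes signals into it via emit()."""
    def __init__(self):
        self._buffer = []

    def emit(self, signal):
        self._buffer.append(signal)


class NodeSerial(Node):
    """Intermediate base class: packages are unpacked right in emit(), so
    subclasses only ever see plain signals in their buffer and no
    resource-intensive re-buffering is needed."""
    def emit(self, signal):
        if isinstance(signal, SerializedSignal):
            for unpacked in signal.deserialize():
                super().emit(unpacked)
        else:
            super().emit(signal)


# A NodeSerial receives a package but buffers plain signals, in order:
node = NodeSerial()
node.emit(SerializedSignal.pack([Signal("a"), Signal("b")]))
assert [s.payload for s in node._buffer] == ["a", "b"]
```

Note that in this sketch the unpacking cost is paid inside emit(), i.e. on the host's thread, which is exactly the slowdown mentioned above.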
One could implement all of the above, but that would break the standard: the developer doesn't know which kind of serialization arrives at the node, and this ambiguity can only be patched with overlapping code.
The core problem is that the sender node is not really supposed to know whether the receiving node is ready to deserialize or not.
Host signal packaging
An essentially different way of packaging is to handle it at the host-to-host level, so that signal serialization is hidden from the node. If the task is to process input signals as a block, it makes no difference whether the Host sends n separate signals to the node or the Node deserializes a package and processes the same number of signals.
Deserialization is doubly transparent for a Node (compared to the previously proposed solutions): there is no need to complicate the Node class or anything else. Deserialized messages arrive at the destination strictly consecutively, so they are also processed almost simultaneously.
The serialization problem remains, and there are two solutions (both involve a special SerializedSignal class):
[ ] Manual serialization: adding an extra method for sending messages (or simply building the package manually inside the node logic).
[ ] Automated serialization: if a ProcessHost is about to send a series of messages from the same source to the same destination, it unites them into a package (see the sketch below).
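A sketch of the automated variant, reusing the Signal and SerializedSignal classes sketched above (the outbox layout and delivery function are assumptions, not the actual ProcessHost API): the sending host coalesces consecutive messages that share a source and destination, and the receiving host unpacks packages before delivery, so the whole mechanism stays invisible to the nodes.

```python
from itertools import groupby


def pack_outbox(outbox):
    """Sending side: coalesce consecutive (source, dest, signal) entries
    that share the same source and destination into a single package."""
    packed = []
    for (src, dst), group in groupby(outbox, key=lambda e: (e[0], e[1])):
        signals = [sig for _, _, sig in group]
        if len(signals) == 1:
            packed.append((src, dst, signals[0]))  # nothing to package
        else:
            packed.append((src, dst, SerializedSignal.pack(signals)))
    return packed


def deliver(nodes, dst, signal):
    """Receiving side: unpack before delivery, so packaging is hidden from
    the node and the income order inside a package is preserved."""
    if isinstance(signal, SerializedSignal):
        for unpacked in signal.deserialize():
            nodes[dst].emit(unpacked)
    else:
        nodes[dst].emit(signal)
```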
[SOLUTION] But do we really need serialization?
The idea of serialization arose from the problem of adding multiple nodes to a remote host, which requires transmitting a lot of data messages (imagine transferring a large number of nodes).
But for this particular case, why not transfer the whole database instead of dozens of data signals?
Well, actually, this is the solution. One can push a big database update:
[ ] Transferring the whole _data with overwriting: just a SupersetSignal (like the SetSignal), sketched below.
This decreases the number of messages sent down to the number of nodes spawned.
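A sketch of such a SupersetSignal, reusing the Signal and Node classes from the first sketch and assuming that a node's configurable state lives in a _data dict (the apply() protocol is an assumption for illustration): one signal overwrites the receiver's whole _data, replacing what would otherwise be one SetSignal per key.

```python
class SupersetSignal(Signal):
    """Like a SetSignal, but carries a node's whole _data dict at once."""
    def apply(self, node):
        # Overwrite the receiver's state wholesale: one message replaces
        # what would otherwise be one SetSignal per key.
        node._data = dict(self.payload)


# Spawning and configuring a remote node now costs one signal, no matter
# how many keys its _data holds:
node = Node()
SupersetSignal({"x": 1, "y": 2, "capacity": 10}).apply(node)
assert node._data == {"x": 1, "y": 2, "capacity": 10}
```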
Maybe this is it...
Originally posted by @wolf-null in https://github.com/wolf-null/resource-network-sim-v2/issues/1#issuecomment-923957226