wolf-null / resource-network-sim-v2

Simulating wealth distribution in logistic networks. Second version: more versatile
GNU General Public License v3.0

[Architecture]: Check: Can join() actually join a new host? #1

Open wolf-null opened 3 years ago

wolf-null commented 3 years ago

The problem

Straightforwardly joining Nodes to the ProcessingHost during its execution is problematic because

wolf-null commented 3 years ago

One can kill two birds with one stone by adding specialized control signals to the ProcHost and MasterHost classes, like

"Add the nodes" (of a specified type, with a specified database and, maybe, a pre-cooked input stack) to the specified host.

This "add the node" affects the following problems (in implementation order):
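Independently of the list of affected problems, the "add the nodes" control signal itself could look roughly like the sketch below. This is only an illustration under assumptions: `AddNodesSignal`, `ToyHost`, and `handle_control` are hypothetical names, not the actual ProcHost or MasterHost API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class AddNodesSignal:
    """Hypothetical control signal: add nodes of one type to one host."""
    host_id: str                               # the specified host
    node_type: str                             # the specified node type
    database: Dict[str, Dict[str, Any]]        # node id -> initial node data
    input_stack: List[Any] = field(default_factory=list)  # optional pre-cooked inputs


class ToyHost:
    """Stand-in for a host (assumption); it only keeps nodes in a dict."""

    def __init__(self, host_id: str) -> None:
        self.host_id = host_id
        self.nodes: Dict[str, Dict[str, Any]] = {}

    def handle_control(self, signal: AddNodesSignal) -> bool:
        # Ignore signals addressed to another host.
        if signal.host_id != self.host_id:
            return False
        # Create one node record per database entry.
        for node_id, data in signal.database.items():
            self.nodes[node_id] = {
                "type": signal.node_type,
                "data": dict(data),
                "inputs": list(signal.input_stack),
            }
        return True
```

A host that is already running would handle such a signal in its normal control loop, so adding nodes would not need a special code path outside signal processing.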

wolf-null commented 3 years ago

It looks like one can create a ProcessHost with some initial nodes and then add or delete nodes.

wolf-null commented 3 years ago

join() organizes the initialization of ProcessHosts and attaches nodes to them.

join() currently includes the following operations:

This function can operate on local MasterHosts only. It will behave unpredictably for virtual hosts (network hosts, for instance). It can be logically decomposed into two functions:

The problem is that the way of initializing different subvariants of the Host class might differ. Either way, it is recommended to create a separate process for each host (whether it is local, virtual, or something else), because of the universal synchronization interface the multiprocessing module provides. So, actually, not much of a problem:

Also, it is architecturally preferable to make Nodes and Hosts interact in exactly the same way: by receiving control signals and data signals, and by holding all configuration in the _data field of the class in question. So all configuration of the Host (like routing) is also stored in the _data field. The problem is that some of these fields (like process handles) are not serializable or reconstructable, and cannot be cached or passed to another process in the ordinary way.

Need to think on the latter.

This may lead us to consider nodes as hosts or hosts as nodes. But is it worth it?

One can keep the join() function as a user-friendly way to quickly build host infrastructure.

wolf-null commented 3 years ago

Signal packaging?

Once one needs to configure a ProcessHost, or to transfer dozens of signals from peer A to peer B for any other reason, one will face the problem of signal routing overload. This can slow down the whole signal routing process and, if run in async mode, wipe out other transactions.

In that case, one can propose signal packages: a series of signals from a single A to a single B to be wholly transferred.
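A signal package could be sketched as follows. `Signal`, `SignalPackage`, `pack`, and `unpack` are hypothetical names chosen for illustration; the real signal classes in the project may look different.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass(frozen=True)
class Signal:
    sender: str
    receiver: str
    payload: Any


@dataclass(frozen=True)
class SignalPackage:
    """All payloads going from one sender to one receiver, kept in order."""
    sender: str
    receiver: str
    payloads: Tuple[Any, ...]


def pack(signals: List[Signal]) -> List[SignalPackage]:
    # Group payloads by (sender, receiver); order inside each group is preserved.
    groups = defaultdict(list)
    for s in signals:
        groups[(s.sender, s.receiver)].append(s.payload)
    return [SignalPackage(a, b, tuple(p)) for (a, b), p in groups.items()]


def unpack(package: SignalPackage) -> List[Signal]:
    # Restore the individual signals on the receiving side.
    return [Signal(package.sender, package.receiver, p) for p in package.payloads]
```

With this shape the router handles one object per (A, B) pair instead of one per signal, which is the point of packaging.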

There are two ways of packaging:

Since there are different cases, it would be more flexible to let nodes choose whether or not to implement that feature.

At the moment, Node class doesn't implement any exec() routines.

One can implement all of these, but this would ruin the standard: since the developer doesn't know which serialization method is passed to the node, there is an ambiguity that can only be fixed by overlapping code.

The core problem is that the sender node is not really supposed to know whether the receiving node is ready to deserialize or not.

Host signal packaging

An essentially different way of packaging is to do it at the host-host level, so that signal serialization is hidden from the node. If the task is to process input signals in a block, it makes no real difference whether the Host sends n signals to the node or the Node deserializes the package itself and processes the same number of signals.

Deserialization is doubly transparent for a Node (compared to the previously proposed solutions): there is no need to complicate the Node class or anything else. Deserialized messages arrive at the destination consecutively, so they are also processed almost simultaneously.
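A minimal sketch of host-level packaging, assuming `pickle` as the serialization method (the thread does not fix one). `PlainNode` and `PackagingHost` are hypothetical stand-ins, not the project's Node and Host classes.

```python
import pickle
from typing import Any, Dict, List, Tuple


class PlainNode:
    """A node that knows nothing about packaging; it just receives signals."""

    def __init__(self) -> None:
        self.inbox: List[Any] = []

    def receive(self, payload: Any) -> None:
        self.inbox.append(payload)


class PackagingHost:
    """Hypothetical host that packs outgoing and unpacks incoming signals."""

    def __init__(self) -> None:
        self.nodes: Dict[str, PlainNode] = {}

    def pack_for_peer(self, signals: List[Tuple[str, Any]]) -> bytes:
        # signals: (destination node id, payload) pairs bound for one peer host.
        return pickle.dumps(list(signals))

    def deliver_package(self, blob: bytes) -> None:
        # Unpack and hand the signals to the nodes one by one, in original order.
        for node_id, payload in pickle.loads(blob):
            self.nodes[node_id].receive(payload)
```

The node sees an ordinary sequence of `receive` calls, so the Node class needs no changes for packaging to work.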

The serialization problem remains. There are two solutions (both involve a special SerializedSignal class):

[SOLUTION] But do we really need serialization?

The idea of serialization originated from the problem of adding multiple nodes to a remote host. This requires transmitting a lot of data messages (imagine transferring a large number of nodes).

But, for this particular case, why not transfer the whole database instead of transferring dozens of data signals?

Well, actually, this is the solution. One can push a big database update:

This decreases the number of messages sent down to the number of nodes spawned.
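The effect on message count can be illustrated with a toy comparison. It assumes, purely for illustration, that the naive approach sends one data signal per node field; both function names and the database layout are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Database = Dict[str, Dict[str, Any]]  # node id -> node fields (assumed layout)


def per_field_messages(database: Database) -> List[Tuple[str, str, Any]]:
    # Naive approach (assumption): one data signal per (node, field) pair.
    return [
        (node_id, key, value)
        for node_id, fields in database.items()
        for key, value in fields.items()
    ]


def per_node_messages(database: Database) -> List[Tuple[str, Dict[str, Any]]]:
    # Database update: one message per spawned node, carrying its whole record.
    return [(node_id, dict(fields)) for node_id, fields in database.items()]
```

For a database of two nodes with two fields each, the first function yields four messages and the second yields two, one per spawned node.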

Maybe this is it...