Tracking issue for remote actor ref

Thomasdezeeuw commented 5 years ago

Current idea is to make a quick (and bad) implementation of remote message passing based on UDP and JSON (not encrypted). I know this is bad, but it gives me a starting point to get an idea of how the API should look.

For this the following would be needed:

Port to listen for incoming messages on each node

To start lets make this a fixed port. Later this can be dynamic, maybe per process.

A process that handles incoming messages

A shared process (to ensure progress) that handles these incoming messages. This would likely be an actor with which other actors can register themselves using there ActorRef. This actor would then receive a message, attempts to de-serialise it for the correct registered actor and send the message using the ActorRef.

This process/actor can run on the main thread, but that is not a great solution. What would be better is a work stealing scheduler (#254).

Mechanism to deliver messages to the correct actor.

Do we want explicit or implicit registering of actors?
One actor per node? Per (distributed) system?
Allow registering a group of actors using the same name?

Mechanism to serialise/de-serialise messages

Need a way to describe the format of a message to create bytes from it and reverse that operation. Serde can handle this for us, would make it a public dependency.

Before sending a message we can agree with the peer on a data definition, i.e. what to wire format to use for the message. To start we'll use JSON as that is self-describing.

Mechanism to create an `ActorRef` to a remote actor.

Expose ProcessId/WakerData? -- Don't like this as it not user facing and I don't care to expose it.

New concept;ActorUrl? Could look something like "$node/$actor", e.g. "node_01/value_store".

Cons:

Require explicit registering actors. Maybe want to be able to send ActorRef to actors on a remote machine so they can communicate. But that won't be possible with actors that are not explicitly registered first.
Requires a unique name for registered actors.
A string-ly typed API.
Creates a hidden/unclear dependency between two actors, would like to be more explicit about it.

Do we want version? E.g. "node_01/value_store/v1". This would require user defined versioning.

We would send something like:

{
    "to": "value_store",
    "version": 1,
    "data": {}
}

Where data is the actual message send by the actor.

Paper about general, fast rpc: https://blog.acolyer.org/2019/03/18/datacenter-rpcs-can-be-general-and-fast/, could be of interest.

Thomasdezeeuw commented 5 years ago

Needed:

[ ] (de)serialisation of messages #214.
[x] Machine actors #213.
[ ] How to connect to peers?
[ ] How to connect to a remote actor?
[ ] failure detection of some kind.

Thomasdezeeuw commented 5 years ago

I want to be able to link multiple actors together over a network, where the (unix) processes aren't running the same binary. So different versions of the users' code and of Heph (post v1) should be able to communicate. For this a protocol needs to written, including a way to share definitions of the messages and how to determine whether or not both sides of the communication see them as the same, where it should be possible to extend the data structure in newer versions of the code, e.g. adding fields. I intend to support and encourage rolling updates of the users' code (including new versions of Heph).

We'll also likely need to some kind of crash detection, even though we provide no guarantees for message delivery.

As for the protocol I think TCP might have too much overhead and UDP should be preferred, as again the message delivery is not guaranteed. Alternative we can let the user decide on this, providing both TCP and UDP based connections and maybe even something based on QUIC.

Thomasdezeeuw commented 4 years ago

Consider using compression when sending messages: zstd with level 3, seems appropriate.

Thomasdezeeuw commented 4 years ago

Messages need some kind of id so that we reply to messages to support ActorRef::rpc.

Thomasdezeeuw commented 3 years ago

Commit 05d0ff1f8fff356535077788573ffbc43b864745 removed the old, unused RemoteRegistry. Replace it with a proper version (currently in the net-relay branch).

Thomasdezeeuw commented 3 years ago

Related #307.

Thomasdezeeuw commented 3 years ago

Idea for determining the topology:

Create a UUID for each runtime and for each actor spawned (also see ActorURL). Using the UUID for the runtime we can create a topology of all connected peers.

type Peer = (SocketAddr, UUID);

This way even if peers are connected using different IP/ports we can still match them using the UUID. E.g. peer A knows peer B at (123, ABC) and peer C knows a peer D (456, ABC) we can see that the UUID match up and see that B and D are the same peer.

To create a topology we can then send pings (messages) to all peers known to a certain runtime. From that we can create the following connections.

struct Connection {
    src: Peer,
    dst: Peer,
    latency: Duration,
}

And collecting all of these will create a topology for us.

Thomasdezeeuw commented 2 years ago

For encryption we could maybe use something like https://github.com/RustCrypto/traits, e.g. https://docs.rs/aead.

Thomasdezeeuw / heph