threefoldtech / rmb-rs

RMB implementation in rust
Apache License 2.0

RMB redundancy proposal #141

Closed muhamadazmy closed 8 months ago

muhamadazmy commented 11 months ago

Related to #140

Redundancy is implemented for rmb by means of federation. This simply means that when you connect to the rmb relay of your choice, you make sure your public information on tfchain states which rmb server you are connected to.

Right now, a client includes federation information in each message it sends. This helps the rmb relay decide whether it needs to deliver the message to one of its directly connected clients or forward it to a remote relay.
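The relay-side routing decision described above can be sketched roughly as follows. The `Envelope` and `Route` types and their field names are illustrative assumptions, not the actual rmb-rs types:

```rust
// Hypothetical sketch of the relay-side routing decision: deliver locally
// if the envelope's federation info is absent or names this relay,
// otherwise forward to the remote relay it names.

#[derive(Debug, Clone, PartialEq)]
struct Envelope {
    destination_twin: u32,
    // Federation info set by the sending client: the domain of the
    // relay the destination twin is connected to (per tfchain).
    federation: Option<String>,
}

/// Where the relay should push an incoming message.
#[derive(Debug, PartialEq)]
enum Route {
    /// Destination twin is connected to this relay: deliver locally.
    Local,
    /// Destination twin lives behind another relay: forward there.
    Federation(String),
}

fn route(env: &Envelope, own_domain: &str) -> Route {
    match &env.federation {
        // No federation info, or it names this relay: local delivery.
        None => Route::Local,
        Some(domain) if domain == own_domain => Route::Local,
        // Otherwise push to the remote relay named in the envelope.
        Some(domain) => Route::Federation(domain.clone()),
    }
}
```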

This design has two main flaws:

In summary, the problem is that changes in peer information do not propagate fast enough to allow a seamless connection with remote peers when a relay changes.

Proposal:

This way, clients can send messages without needing to know where the remote peer lives. Relays have fresh information about the twins, updated in almost real time, and can then decide where to push each message.

IMPORTANT NOTE:

The local peer will still need to get the remote peer's information to use its public key (PK) for end-to-end encryption. This is why we thought that, since we fetch this information anyway, we could also include the federation information in the message, so the relay has to do nothing at all regarding where to push the message.

The client will still have to fetch and cache twin information locally, but the PK almost never changes after generation, which means clients can cache it for longer, and the relay will then figure out where to send the message as explained above.
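A minimal sketch of such a client-side cache, assuming a simple TTL policy with a long TTL since the PK rarely changes (the types and names here are hypothetical, not the actual rmb-peer implementation):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Hypothetical client-side cache: since a twin's public key almost never
// changes after generation, it can be cached for a long time. Under this
// proposal the client no longer needs the twin's relay info at all.
struct CachedTwin {
    public_key: Vec<u8>,
    fetched_at: Instant,
}

struct TwinCache {
    ttl: Duration,
    entries: HashMap<u32, CachedTwin>,
}

impl TwinCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Return the cached public key if it is still fresh.
    fn get(&self, twin_id: u32) -> Option<&[u8]> {
        self.entries
            .get(&twin_id)
            .filter(|e| e.fetched_at.elapsed() < self.ttl)
            .map(|e| e.public_key.as_slice())
    }

    /// Store a key fetched from tfchain.
    fn put(&mut self, twin_id: u32, public_key: Vec<u8>) {
        self.entries
            .insert(twin_id, CachedTwin { public_key, fetched_at: Instant::now() });
    }
}
```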

despiegk commented 11 months ago

found some issues, good idea, but I am afraid we're not there yet

muhamadazmy commented 11 months ago

Thabet told me about JSON vs protobuf. I wanted to state that clients that rely on rmb-peer as their agent use the full JSON protocol (which is local, over local Redis). But for several other reasons, rmb-peers talk to each other and to the relay with protobuf (as does any client that talks to the relay directly):

sameh-farouk commented 11 months ago

I read the original issue and the proposal here, but I don’t see how they are related. I believe the proposal here aims to solve the third point in the issue, which is:

make sure its always redundant, so even when rmb server falls away it still works,

I’m not clear on what "rmb server" means, but I guess it’s the rmb relay, right?

My understanding of the issue is that we want to have redundancy for the relays, so that if peer 1 is connected to relay A, and peer 2 is connected to relay B, they can still communicate even if one or both relays fail at some point.

However, the proposal here (RMB redundancy proposal) seems to focus on avoiding cache staleness by scanning new blocks for twin info changes and updating the cache accordingly in near real time. This might improve RMB availability a bit, but it won’t prevent downtime when a relay fails and the on-chain relay info is not yet updated (which could actually take far longer). Also, I don’t see how this proposal adds any redundancy. It is a good one, but it looks like it’s trying to solve a different problem than the one in the issue (at least to my understanding).

Why don’t we use a traditional failover mechanism instead? We could allow the on-chain relay field to store more than one IP address. The first IP address would point to the default relay, and the other IP addresses would point to identical redundant relays.

Then the client or rmb-peer could query and cache the list of IP addresses for each twin. If the first IP address times out, it would try the next IP in the list, and reorder the cached twin info to put the working IP address first.

This way, by enabling failover, the communication between twins can continue even if a relay fails, as the client will eventually find another operational relay.
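The failover loop described above could look roughly like this, with `try_connect` standing in for a real connection attempt; the function and names are illustrative, not part of rmb-rs:

```rust
// Sketch of the proposed failover: try each cached relay address in order,
// and on success move the working address to the front of the cached list
// so it is tried first next time.
fn first_working(
    relays: &mut Vec<String>,
    try_connect: impl Fn(&str) -> bool,
) -> Option<String> {
    // Find the first relay that answers.
    let idx = relays.iter().position(|r| try_connect(r))?;
    // Reorder the cached twin info: working relay goes first.
    let working = relays.remove(idx);
    relays.insert(0, working.clone());
    Some(working)
}
```

If every address in the list times out, the caller would fall back to re-querying the twin info from the chain.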

As for the cache staleness issue, I wonder how often twin info actually changes. Maybe we could simply use a shorter cache period, or skip the cache and re-query the twin info if a request to a relay times out (we would need some kind of circuit breaker as well). Scanning every new block from every relay would require a separate connection to a chain node for each relay just to keep twin info updated, which might be costly despite the data rarely changing. I think one reason we have GraphQL is to offload unnecessary read operations from the blockchain network, and to access on-chain data without having to interact directly with the blockchain nodes, which can be costly and sometimes unreliable.
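The "skip the cache and re-query on timeout, guarded by a circuit breaker" idea could be sketched like this; the thresholds, names, and policy are illustrative assumptions, not an rmb-rs API:

```rust
use std::time::{Duration, Instant};

// Minimal circuit-breaker sketch: after `threshold` consecutive failures
// against the cached relay, stop trying it (and fall back to a fresh
// twin-info query) until `cooldown` has passed.
struct CircuitBreaker {
    threshold: u32,
    cooldown: Duration,
    failures: u32,
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    fn new(threshold: u32, cooldown: Duration) -> Self {
        Self { threshold, cooldown, failures: 0, opened_at: None }
    }

    /// May we attempt the cached relay, or should we bypass it?
    fn allow(&mut self) -> bool {
        match self.opened_at {
            // Circuit is open and still cooling down: bypass the cache.
            Some(t) if t.elapsed() < self.cooldown => false,
            // Cooldown over: half-open, allow one try again.
            Some(_) => {
                self.opened_at = None;
                self.failures = 0;
                true
            }
            None => true,
        }
    }

    fn record_failure(&mut self) {
        self.failures += 1;
        if self.failures >= self.threshold {
            self.opened_at = Some(Instant::now());
        }
    }

    fn record_success(&mut self) {
        self.failures = 0;
        self.opened_at = None;
    }
}
```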

Also, if we use the redundancy method described above, cache staleness would be less relevant. Again, how often would the twin info change? Probably not more often than a DNS A record. And we still can’t solve the problem of DNS cache staleness (cache is everywhere 🤷🏻‍♂️).

Again, I'm not saying that the proposal is not good. I'm just trying to better understand why we are proposing/implementing this, and to make sure we all share the same understanding of the issue we are trying to solve, before going any further or writing any code.

Please @muhamadazmy let me know if any of this makes sense.

sameh-farouk commented 11 months ago

We have decided to implement redundancy and failover in the system to achieve high availability, after discussing the topic with Azamy. The required changes are summarized as follows:

sameh-farouk commented 10 months ago

Update:

sameh-farouk commented 9 months ago

Update: Azamy suggested an interesting idea of having a relay ranking system that is stored in-memory to optimize the federation routing. I will plan and implement this next.
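One possible shape for such an in-memory ranking, assuming a simple failure count per relay domain (purely illustrative; the actual design may differ):

```rust
use std::collections::HashMap;

// Hypothetical in-memory ranking: count recent delivery failures per
// relay domain and prefer relays with the fewest failures when routing
// a federated message.
#[derive(Default)]
struct RelayRanker {
    failures: HashMap<String, u32>,
}

impl RelayRanker {
    fn report_failure(&mut self, domain: &str) {
        *self.failures.entry(domain.to_string()).or_insert(0) += 1;
    }

    fn report_success(&mut self, domain: &str) {
        // A successful delivery clears the relay's bad record.
        self.failures.remove(domain);
    }

    /// Order candidate relay domains best-first (fewest recorded failures).
    fn rank(&self, candidates: &mut Vec<&str>) {
        candidates.sort_by_key(|d| self.failures.get(*d).copied().unwrap_or(0));
    }
}
```

Being in-memory only, such a ranking costs nothing on-chain and resets harmlessly when the relay restarts.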

sameh-farouk commented 9 months ago

Update:

A-Harby commented 8 months ago

Verified. I used the latest rmb release v1.1.1 and updated my twin relays from the gridproxy. I tried sending a message over those relays, and even closed two relays with the help of @sameh-farouk, and both messages were sent successfully.

I also tried again later to add some more relays, and they were added with no errors.