Adding TTL to proxied messages

nash8114 commented 2 months ago

A message proxied to another node by the DynamicSupervisor can be proxied again. This is clearly evident when using a large number of nodes and picking Horde.UniformRandomDistribution as a distribution strategy. Pushing your luck with that strategy 🍀

This commit contains an implementation which limits the amount of times a message can be proxied before expiring, by adding a TTL which works similar to the TTL on IP packets.

The default TTL is :infinity, which means the implementation is backwards compatible. Message with TTL :infinity can bounce around between nodes forever. The max TTL can be set to any integer via the new :proxy_message_ttl option. Each "hop" decreases the TTL for a message by 1 one. When a message with a TTL of zero needs to be proxied, an error will be returned to the reply_to process.

With a distribution strategy of Horde.UniformDistribution the issue of passing messages around forever is unlikely, as nodes tend to agree on the outcome of choose_node. However, a recent incident whilst upgrading to OTP 27 brought this issue to light, as the underlying algorithm for choosing nodes was broken on our already upgraded nodes. This causes message to be proxied between our nodes infinitely.

Please feel free to pass any form of judgement on the implementation. This is just a jumping-off platform and it can only go uphill from here 👍

With TTL of 2:

graph TD;
p[process A] -- call start_child() --> A[node A];
A -- proxy start_child(ttl: 2) --> B[node B];
B -- proxy start_child(ttl: 1) --> C[node C];
C -- reply {:error, :proxy_operation_ttl_expired, ...} --> p;

nash8114 commented 2 months ago

I've pushed a new commit. The pushed changes should help fix an issue upgrading to this new version. The implementation now supports both tuple size 3 and 4 for :proxy_operation. When :proxy_message_ttl is set to :infinity (the default), proxy_to_node will use tuple size 3. This means nodes running the new version and the old version can still proxy messages to each other. If during the upgrade :proxy_message_ttl would be set to an integer value, the upgrade will cause :proxy_operation messeages from upgraded nodes to old nodes to fail. The CHANGELOG would need to reflect this as an upgrade risk.

derekkraan commented 2 months ago

Yes, please do add an entry to the changelog for this.

derekkraan commented 2 months ago

The approach you are taking in this PR is to send an error when TTL goes to 0.

But could we also just start the process when TTL is 0?

What do you think?

nash8114 commented 2 months ago

That sounds like a great idea.

I've avoided it so far as it means that proxy_to_node would need to be able to make a distinction between different messages. Currently it is completely message agnostic. The only two message types it currently handles are :start_child and :terminate_child. Performing :start_child locally when TTL runs out makes a lot of sense. The same is not true for terminate_child.

So if you're OK with proxy_to_node being message specific, I can build in an exception for :start_child. But how to handle an expired TTL for :terminate_child? This should not occur of course, except when the TTL is set to 0 or there is some corruption in the state (where two nodes disagree on who owns a process)

nash8114 commented 2 months ago

Pushed an update which changes the start_child logic to perform add_child locally if the TTL has expired. Tests have been updated accordingly.

On a another note, I am considering returning an error when starting a Horde.DynamicSupervisor with a :proxy_message_ttl set to zero, as obviously no one should ever set it to zero, and it would cause issues for terminate_child.

There is also still the issue that :proxy_message_ttl defaults to :infinity, which allows for upgrading seamlessly from a previous version to this version. But at what point would the :proxy_message_ttl default to something else? Or should the README indicate that setting the proxy TTL (after upgrade) is recommended?

With TTL of 2:

terminate_child is not likely to loop, as the node on which a process is running is known. If it does loop for some odd reason, the message would expire.

graph TD;
pa[process A] -- call start_child() --> A[node A];
A -- proxy start_child(ttl: 2) --> B[node B];
B -- proxy start_child(ttl: 1) --> C[node C];
C -- TTL expired, perform add_child and reply {:ok,&nbsp;pid} --> pa;

pb[process B] -- call terminate_child() --> Ab[node A];
Ab -- proxy terminiate_child(ttl:&nbsp;2) --> Bb[node B];
Bb -- reply :ok --> pb;

derekkraan commented 2 months ago

If we document it in the documentation and in the changelog, that should be enough. I'll emphasise it when doing the usual tweets so that hopefully more people get the message. But maybe it's also something most people will only discover if they also have the issue in question.

nash8114 commented 2 months ago

I'll write something about the option in the readme.

In normal cases the TTL should not be necessary, but there are some cases which are improved by this. Most notably:

TTL allows a seamless migration to a different distribution algorithm. Today migrating to a different algorithm would likely cause loops. We are considering changing algorithm, and that is difficult to do unless you're really strategic about it.
TTL prevents the UniformRandomDistribution algorithm from passing messages too often (image 100 nodes with UniformRandomDistribution, each having a 1% chance of actually handling a start_child, and 99% change of playing hot potato)
TTL prevents issues when the algorithm fails, like we experienced in our application due to an Erlang upgrade.

derekkraan / horde

Adding TTL to proxied messages #276

With TTL of 2:

With TTL of 2: