Add support for gossip entry timestamps which are set when an entry is created and propagated to the rest of the cluster. Nodes can then use these timestamps to calculate how long it takes for entries to be propagated around the cluster.
Those times can then be exposed as metrics and used to tune the gossip configuration or to detect when the cluster is overloaded.
Versioning
Adding a new field will require a new gossip protocol version, so nodes must support both the existing version (0) and the new version (1).
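One way to support both versions is sketched below. The wire format here (a leading version byte, followed in version 1 by an 8-byte big-endian creation timestamp in Unix milliseconds) is purely an assumption for illustration and is not the existing gossip encoding. All type, constant, and function names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Hypothetical protocol versions for this sketch.
const (
	VersionLegacy    uint8 = 0 // existing format, no timestamp
	VersionTimestamp uint8 = 1 // adds an 8-byte creation timestamp
)

// Entry is a hypothetical decoded gossip entry.
type Entry struct {
	Payload       []byte
	CreatedUnixMs int64
	HasTimestamp  bool // false when the sender used version 0
}

// Encode writes the entry using the requested protocol version.
func Encode(e Entry, version uint8) ([]byte, error) {
	switch version {
	case VersionLegacy:
		return append([]byte{VersionLegacy}, e.Payload...), nil
	case VersionTimestamp:
		buf := make([]byte, 9, 9+len(e.Payload))
		buf[0] = VersionTimestamp
		binary.BigEndian.PutUint64(buf[1:9], uint64(e.CreatedUnixMs))
		return append(buf, e.Payload...), nil
	default:
		return nil, fmt.Errorf("unsupported gossip version %d", version)
	}
}

// Decode accepts both versions; version 0 entries have no timestamp.
func Decode(b []byte) (Entry, error) {
	if len(b) < 1 {
		return Entry{}, errors.New("empty message")
	}
	switch b[0] {
	case VersionLegacy:
		return Entry{Payload: b[1:]}, nil
	case VersionTimestamp:
		if len(b) < 9 {
			return Entry{}, errors.New("truncated timestamp")
		}
		ts := int64(binary.BigEndian.Uint64(b[1:9]))
		return Entry{Payload: b[9:], CreatedUnixMs: ts, HasTimestamp: true}, nil
	default:
		return Entry{}, fmt.Errorf("unsupported gossip version %d", b[0])
	}
}

func main() {
	msg, err := Encode(Entry{Payload: []byte("state"), CreatedUnixMs: 1700000000000}, VersionTimestamp)
	if err != nil {
		panic(err)
	}
	e, err := Decode(msg)
	if err != nil {
		panic(err)
	}
	fmt.Println(e.HasTimestamp, e.CreatedUnixMs, string(e.Payload))
}
```

In this sketch a node that receives a version 0 entry simply skips the latency measurement for it, which would let old and new nodes coexist during a rolling upgrade.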
Evaluation
These metrics can also be used to evaluate the scaling limits of gossip. For example, piko test workload upstreams could be extended with a --churn flag indicating how often each upstream should reconnect.
That can then be used to understand how much churn a cluster with default gossip settings can support before latency exceeds some threshold (say 10 seconds).
(Each gossip node could also be placed behind a proxy to inject latency and dropped messages.)