In the current implementation of the latency-based routing, the probability of choosing a node as a routing URL is proportional to the node's score. The score is calculated as an inverse value of the moving average of latency measurements over some sliding window. Hence, nodes with smaller latencies produce higher score values and are more likely to be selected. However, in the current approach when a node was unavailable (within the sliding window), it was completely removed from routing. This was a "harsh" punishment, which is not always optimal, as latency can experience sporadic spikes.
In this PR we introduce a more sophisticated and customizable way of calculating node's score, where the combination of node's latency and availability are used to produce the final score. Node's score is now penalized, if it was unavailable within the sliding window measurements and not set to zero as before. A node is only removed from routing if it was unavailable in the last measurement.
Approach:
We store two separate sliding arrays: availabilities and latencies for each node
Latency score is computed as an unnormalized value:
$availability_i=0$, if the node is unavailable and $1$ if it is available.
Weights $w_i$ are computed via an exponential decay formula:
w_i=\exp(−\lambda * i),~~i = 0, ..., win\_size-1
This allows to take recency factor into account, i.e. more recent observations contribute to the score with higher weights.
Note, users can also provide their custom weights. For example, to exclude recency factor, all $w_i=1$ could be provided as input.
The combined score is computed via a simple product:
This allows to emphasize the importance of both metrics in the final score.
Example:
let latencies_secs = [None,Some(0.190),Some(0.195),Some(0.180),Some(0.195),Some(0.180),Some(0.185),Some(0.2),Some(0.192),Some(0.215),Some(0.175),None,Some(0.190),None,Some(0.187),None,Some(0.165),Some(0.189),None,Some(0.192),Some(0.200),Some(0.187),Some(0.178),Some(0.185)]];
How Has This Been Tested?
Additional tests in latency_based_routing.rs have been introduced.
Description
In the current implementation of the latency-based routing, the probability of choosing a node as a routing
URL
is proportional to the node's score. The score is calculated as an inverse value of the moving average of latency measurements over some sliding window. Hence, nodes with smaller latencies produce higher score values and are more likely to be selected. However, in the current approach when a node was unavailable (within the sliding window), it was completely removed from routing. This was a "harsh" punishment, which is not always optimal, as latency can experience sporadic spikes.In this PR we introduce a more sophisticated and customizable way of calculating node's score, where the combination of node's latency and availability are used to produce the final score. Node's score is now penalized, if it was unavailable within the sliding window measurements and not set to zero as before. A node is only removed from routing if it was unavailable in the last measurement.
Approach:
availabilities
andlatencies
for each nodeNote that inverse of latency is used to produce higher scores for smaller latencies.
$availability_i=0$, if the node is unavailable and $1$ if it is available.
This allows to take recency factor into account, i.e. more recent observations contribute to the score with higher weights. Note, users can also provide their custom
weights
. For example, to exclude recency factor, all $w_i=1$ could be provided as input.This allows to emphasize the importance of both metrics in the final score.
Example:
How Has This Been Tested?
Additional tests in
latency_based_routing.rs
have been introduced.Checklist: