apache / cassandra-gocql-driver

GoCQL Driver for Apache Cassandra®
https://cassandra.apache.org/
Apache License 2.0

New Feature: Latency Aware Host Selection Policy #1078

Open robusto opened 6 years ago

robusto commented 6 years ago

This is another host selection ("load-balancing") policy that exists in the DataStax drivers and would help with selecting an appropriate coordinator at query time. Having it available would help with operational and performance concerns when chained with DCAware and/or TokenAware in larger clusters.

Taken from the DataStax Java driver's Javadoc:

When used, this policy will collect the latencies of the queries to each Cassandra node and maintain a per-node latency score (an average). Based on these scores, the policy will penalize (technically, it will ignore them unless no other nodes are up) the nodes that are slower than the best performing node by more than some configurable amount (the exclusion threshold).

The latency score for a given node is based on a form of exponential moving average. In other words, the latency score of a node is the average of its previously measured latencies, but where older measurements get an exponentially decreasing weight. The exact weight applied to a newly received latency is based on the time elapsed since the previous measure (to account for the fact that latencies are not necessarily reported with equal regularity, neither over time nor between different nodes).

Once a node is excluded from query plans (because its averaged latency grew over the exclusion threshold), its latency score will not be updated anymore (since it is not queried). To give a chance to this node to recover, the policy has a configurable retry period. The policy will not penalize a host for which no measurement has been collected for more than this retry period.

Java implementation: LatencyAwarePolicy.java

It seems like we would start a single, dedicated goroutine in the policy's Init() method to update latency scores for all hosts in the pool on a configurable interval.

martin-sucha commented 2 years ago

For the record, there is a related host selection policy in Kiwi.com's fork, although that one is somewhat specialized, so it cannot be used directly.