shotover / shotover-proxy

L7 data-layer proxy
https://docs.shotover.io
Apache License 2.0
84 stars 16 forks source link

Cassandra Improved routing #962

Open conorbros opened 1 year ago

conorbros commented 1 year ago

Improved routing

Opening this issue to discuss the design for improved routing using the best of two choices algorithm.

Power of two random choices

The power of two random choices algorithm is: instead of trying to select the absolute candidate using a signal, pick two candidates and choose the better option of the two, avoiding the worst choice.

The signal used will be the number of in-flight requests to each node from the Shotover instance (per connection is also an option).

Healthy vs. Unhealthy nodes

This can be further improved by measuring the time that the node last responded and using this value with the number of in-flight requests to determine if the node is "healthy", meaning processing requests at a reasonable pace. This will prevent us from routing requests to a node that is unhealthy.

However this indicator is not included in the signal for the two random choices algorithm, instead only used to select the pool of candidates for that algorithm. This prevents the scenario where a majority of nodes are "unhealthy" meaning that this metric is not suitable to use given the current workload or expected latencies.

References: https://www.datastax.com/blog/improved-client-request-routing-apache-cassandratm

Shotover implementation

Since we also want to prioritise choosing a node in the same rack as the Shotover instance we will have to modify this.

What this doesn't account for is what happens when the first replica is a local rack one and the other is a remote rack. But the remote rack one has fewer in-flight requests, meaning it is the better choice according to that metric. Our choice here is to either:

What one to choose depends on whether the same rack or less in-flight requests is the best one to choose based on. I think we should go with in-flight requests first and then decide if same rack might be a better metric.

As long as we make sure we are always including same rack replicas in the choice whenever possible they will still be more prioritised than remote rack replicas.

Shotover changes

rukai commented 1 year ago

Nice writeup!

CassandraSinkCluster is currently designed to route to nodes on a single configured rack only. I dont think an outside rack node having less in-flight requests is a good reason to route to it. We want to keep routing to outside racks to an absolute minimum, so outside racks having lower in-flight requests is actually a good thing! That said we do still want to make use of outside racks when all our internal replicas are down.

I think that if our internal rack is performing poorly then its up to the client to route to a different shotover node to access another rack. Assuming they are running the java driver or similar they should be running the same best of 2 choices algorithm.