scylladb / scylla-go-driver

Experimental, high performance Scylla Driver, University of Warsaw students' project

Squash all HostSelection policies #255

Closed: mmatczuk closed this issue 2 years ago

mmatczuk commented 2 years ago

We have several host selection policies.

The RoundRobinPolicy is a special case of DCAwareRoundRobinPolicy when there is a single DC, and it should not be used in a multi-DC setting.

With the new API, the round-robin behavior can be moved into the TokenAwarePolicy.

The new API can be specified as

type HostSelectionPolicy interface {
    Node(QueryInfo, int) *Node
}

The Node function returns the n-th node for a query, typically the 0th. The position information can be kept in the Query and used in the pickConn function. This API has many benefits: it does not allocate heap memory for slices or filter functions, and it is lazy, avoiding evaluation of all replicas when only the first one is needed.
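
As an illustration, here is a minimal sketch of a policy implementing this interface. The Node struct and the topology field of QueryInfo are hypothetical stand-ins for whatever the driver actually exposes, and are here only to make the example self-contained.

type Node struct{ addr string }

type QueryInfo struct {
    // topology is a snapshot of the known nodes, taken when the QueryInfo is created.
    topology []*Node
}

// simplePolicy walks the topology snapshot in order.
type simplePolicy struct{}

// Node returns the idx-th candidate, or nil when there are no more nodes to try.
// No per-call slice is allocated, and only as many nodes are evaluated as requested.
func (simplePolicy) Node(qi QueryInfo, idx int) *Node {
    if idx >= len(qi.topology) {
        return nil
    }
    return qi.topology[idx]
}

The driver's pickConn would then call Node(qi, 0) for the first attempt and pass the position stored in the Query on subsequent calls, stopping when nil is returned.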

mmatczuk commented 2 years ago

cc @piodul

martin-sucha commented 2 years ago

Should it be optionally possible to carry some state for the given query?

Initially I thought some state would be needed for an option like gocql.ShuffleReplicas(), but that could be implemented by having a random number in QueryInfo and shifting the replica index by it, modulo the number of replicas in the topology.
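
For example, a token-aware implementation could get ShuffleReplicas-like behavior from just such an offset. The replicas and rand fields below are hypothetical additions to QueryInfo, used only to illustrate the idea.

// tokenAwarePolicy is a sketch: qi.replicas would hold the replicas for the
// query's token (in ring order) and qi.rand a per-query random number.
type tokenAwarePolicy struct{}

func (tokenAwarePolicy) Node(qi QueryInfo, idx int) *Node {
    n := len(qi.replicas)
    if idx >= n {
        return nil
    }
    // Shifting the index by the per-query random number modulo the replica count
    // spreads load over the replicas without allocating a shuffled slice.
    return qi.replicas[(qi.rand+idx)%n]
}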

However, for a policy like kiwicom/gocql/TokenAwareLatencyHostPolicy it seems that some state would need to be carried between the initial attempt and the retry: such a policy can't rely just on the topology pointer being the same for the initial attempt and the retry, since it also depends on the measured latencies of queries. This state does not necessarily need to be allocated for every query; it could be recomputed every few seconds or so.

I'm thinking about an interface like

type HostSelectionPolicy interface {
    // Node determines the node used to execute the query.
    // The attempt number specifies how many times the query was tried before, i.e. attempt is >= 1 for retries.
    // Node also receives the state from the previous invocation, or nil for the initial attempt.
    Node(queryInfo QueryInfo, attempt int, state any) (node *Node, newState any)
}

For the built-in policies the state would always be nil, and any new implementations could use the state if necessary.
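
A built-in policy adapted to this signature would simply ignore the incoming state and return nil, e.g. (continuing the earlier hypothetical sketch):

// statelessPolicy shows the stateful signature used by a policy that needs no state.
type statelessPolicy struct{}

func (statelessPolicy) Node(qi QueryInfo, attempt int, _ any) (*Node, any) {
    if attempt >= len(qi.topology) {
        return nil, nil
    }
    return qi.topology[attempt], nil
}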

Note that it is not clear whether I really need a policy that measures latency; I haven't re-evaluated that yet, so of course we don't need to add the state now. I just wanted to leave a note here for future reference.

mmatczuk commented 2 years ago

IMO the latency measurement is an external component that can be plugged into a new implementation at construction time, and no additional function parameter is required.

martin-sucha commented 2 years ago

IMO the latency measurement is an external component that can be plugged into a new implementation

That is true.

What I meant is that retries should go to a different node than the original attempt. The built-in policy always guarantees that just by iterating over the topology. The topology is "snapshotted" at the time the QueryInfo is created, because the topology uses read-copy-update. So even if the topology changes between the original attempt and the retry, the built-in policy still uses the same topology for the retry as for the original attempt.
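
A minimal sketch of that read-copy-update snapshot, reusing the hypothetical QueryInfo from the earlier example (the real driver plumbing will of course differ):

import "sync/atomic"

// currentTopology holds the current node list; writers replace the whole slice
// (read-copy-update) instead of mutating it in place.
var currentTopology atomic.Value // stores []*Node

// newQueryInfo copies the current slice into QueryInfo, so the initial attempt
// and any retries iterate over exactly the same node list.
func newQueryInfo() QueryInfo {
    nodes, _ := currentTopology.Load().([]*Node)
    return QueryInfo{topology: nodes}
}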

A latency-aware policy would select the node with the lowest latency for the initial attempt, the second-lowest for the retry, and so on. However, the latency measurements can change over time, and there isn't a similar read-copy-update mechanism that could be used to snapshot the measurements, so the ordering of nodes by latency can change between the initial attempt and the retry. This could lead to the retry being sent to the same node as the initial attempt. If the policy could snapshot the measurements at the initial attempt (and store a pointer to this snapshot in the state), it would be able to guarantee that retries are never made to the same hosts as before.
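
A rough sketch of that idea on top of the stateful interface proposed above; latencyPolicy and its ordering source are hypothetical, only the snapshot-in-state pattern matters:

// latencyPolicy orders nodes by measured latency; current returns the
// up-to-date ordering and may change between calls.
type latencyPolicy struct {
    current func() []*Node
}

func (p *latencyPolicy) Node(qi QueryInfo, attempt int, state any) (*Node, any) {
    ordered, ok := state.([]*Node)
    if !ok {
        // Initial attempt: snapshot the ordering so that retries see the same
        // ranking even if the measurements change in the meantime.
        ordered = p.current()
    }
    if attempt >= len(ordered) {
        return nil, ordered // no more distinct nodes to try
    }
    return ordered[attempt], ordered
}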

I don't know how important it is in practice to guarantee that retries always go to different nodes than before.

We should probably also explicitly document in the interface how the caller of Node() determines how many attempts to make, or rather how the policy indicates that there are no more hosts to try. I'd like to avoid a situation like https://github.com/gocql/gocql/issues/1259#issuecomment-526130900 with this new interface.
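
One way to make that explicit would be a doc comment on the interface itself; the wording below is only a suggestion:

// Node returns the node to use for the attempt-th execution of the query
// (attempt starts at 0 and is >= 1 for retries).
// It returns nil when the policy has no more nodes to offer, and the caller
// must stop retrying once it receives nil.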

mmatczuk commented 2 years ago

Any additional context info can go into QueryInfo if needed.

martin-sucha commented 2 years ago

That is a good point; extending the QueryInfo would work in that case. :+1: