Open herkulessi opened 4 months ago
I have a usecase, where the selected backend/upstream is dependent on the order of some values in a config file among other things.
Do you mind expanding a bit more on your use case for needing to preserve order?
I wanted to use pingora as a reverse proxy in front of Synapse (the matrix server). Synapse does support sharding, however to work efficiently the requests have to be sent to the correct shard, otherwise Synapse will forward the request itself, with a penalty on performance. Synapse determines the "correct" shard/worker for a request by hashing (with sha256) a part of the request URI, converting parts of the hash to an integer, and using that modulo number of available workers as an offset into an array, whose ordering depends on the order the workers were configured in in the config file. That ordering can be arbitrary. Especially in containerized deployments, the IP address (which is used for the ordering in the BTreeSet
) of containers can't really be known in advance, meaning rewriting the synapse config to suit the proxies needs is not feasible either.
hey @herkulessi, it seems like this data structure would do the trick?
What is the problem your feature solves, or the need it fulfills?
I have a usecase, where the selected backend/upstream is dependent on the order of some values in a config file among other things. Currently, the backends are given to the
BackendSelection
as a&BTreeSet<Backend>
, which changes the order of the backends.Describe the solution you'd like
My first instinct was to change said
&BTreeSet<Backend>
to&Vec<Backend>
in the entirety of the load balancing code, however that might get expensive for the insertions and deletions in theStatic
ServiceDiscovery
.Describe alternatives you've considered
I don't know any data structure of the top of my head that ticks all the boxes (fast insert, fast remove, fast lookup and order preserving), but a hybrid approach, where only parts are changed to use
Vec
instead ofBTreeSet
might be possible. That should only incur a one-time penalty during startup. However, sinceBackendSelection
is nested fairly deeply within the load balancing code, that might not be possible as almost everything else in the load balancing crate appears to be handlingBTreeSet
s with the sole purpose of changing theBackendSelection
. Especially thetry_from_iter
should be able to preserve the upstream order, but that would imply a change of theBackends
struct, because aBackends
struct does hold theBTreeSet
of backends.Currently, the only way around that issue I can see is to basically recreate most of the load balancing crate, but with
Vec
instead ofBTreeSet