cloudflare / pingora

A library for building fast, reliable and evolvable network services.
Apache License 2.0
20.35k stars 1.1k forks source link

Consider Vec<Backend> over BTreeSet<Backend> in pingora::lb::selection::BackendSelection to preserve upstream order #109

Open herkulessi opened 4 months ago

herkulessi commented 4 months ago

What is the problem your feature solves, or the need it fulfills?

I have a usecase, where the selected backend/upstream is dependent on the order of some values in a config file among other things. Currently, the backends are given to the BackendSelection as a &BTreeSet<Backend>, which changes the order of the backends.

Describe the solution you'd like

My first instinct was to change said &BTreeSet<Backend> to &Vec<Backend> in the entirety of the load balancing code, however that might get expensive for the insertions and deletions in the Static ServiceDiscovery.

Describe alternatives you've considered

I don't know any data structure of the top of my head that ticks all the boxes (fast insert, fast remove, fast lookup and order preserving), but a hybrid approach, where only parts are changed to use Vec instead of BTreeSet might be possible. That should only incur a one-time penalty during startup. However, since BackendSelection is nested fairly deeply within the load balancing code, that might not be possible as almost everything else in the load balancing crate appears to be handling BTreeSets with the sole purpose of changing the BackendSelection. Especially the try_from_iter should be able to preserve the upstream order, but that would imply a change of the Backends struct, because a Backends struct does hold the BTreeSet of backends.
Currently, the only way around that issue I can see is to basically recreate most of the load balancing crate, but with Vec instead of BTreeSet

andrewhavck commented 4 months ago

I have a usecase, where the selected backend/upstream is dependent on the order of some values in a config file among other things.

Do you mind expanding a bit more on your use case for needing to preserve order?

herkulessi commented 4 months ago

I wanted to use pingora as a reverse proxy in front of Synapse (the matrix server). Synapse does support sharding, however to work efficiently the requests have to be sent to the correct shard, otherwise Synapse will forward the request itself, with a penalty on performance. Synapse determines the "correct" shard/worker for a request by hashing (with sha256) a part of the request URI, converting parts of the hash to an integer, and using that modulo number of available workers as an offset into an array, whose ordering depends on the order the workers were configured in in the config file. That ordering can be arbitrary. Especially in containerized deployments, the IP address (which is used for the ordering in the BTreeSet) of containers can't really be known in advance, meaning rewriting the synapse config to suit the proxies needs is not feasible either.

ahmedtadde commented 4 months ago

hey @herkulessi, it seems like this data structure would do the trick?