We run several sets of Fabio instances with the Consul backend that register 20k to 60k routes. On the low end, Fabio takes around a minute to process changes to the route table; on the high end, it takes 5 to 6 minutes while Fabio is handling load.
I've been experimenting with and profiling Fabio's code, and I believe I have identified a large time sink. Each time a new route is added to the route table, the routes for that host get sorted again. We almost never do host-based routing, so nearly all of our routes fall into the blank-host route list, which means that one very large list is re-sorted on every add.
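To illustrate why this pattern gets expensive, here is a minimal sketch (not Fabio's code) that counts comparisons when a list is re-sorted after every append. The `sortPerInsert` helper, the `/svc-…` paths, and the descending-by-path ordering are all assumptions made up for the example. Any sort of a k-element list has to make at least k-1 comparisons, so the total work grows roughly with the square of the number of routes.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// sortPerInsert appends n distinct, hypothetical route paths one at a time
// and re-sorts the whole slice after every append. It returns the total
// number of comparisons made across all of those sorts.
func sortPerInsert(n int) int {
	r := rand.New(rand.NewSource(1)) // fixed seed: same insertion order every run
	var paths []string
	comparisons := 0
	for _, p := range r.Perm(n) {
		paths = append(paths, fmt.Sprintf("/svc-%06d", p))
		// Assumed ordering for the sketch: paths in descending order.
		sort.Slice(paths, func(i, j int) bool {
			comparisons++
			return paths[i] > paths[j]
		})
	}
	return comparisons
}

func main() {
	// Doubling the number of routes should roughly quadruple the work.
	for _, n := range []int{1000, 2000, 4000} {
		fmt.Printf("%d routes: %d comparisons\n", n, sortPerInsert(n))
	}
}
```

The exact counts depend on the Go version's sort implementation, so none are quoted here; the point is only the growth rate.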
I dumped one of our route tables into a file and wrote a test to load that file and then create a route table from those commands. Something like this:
```go
import (
	"bytes"
	"os"
	"testing"
)

func TestNewTable(t *testing.T) {
	// Read a file with approximately 20k `route add` commands.
	f, err := os.ReadFile("/path/to/some/fabio_routes.txt")
	if err != nil {
		t.Fatal(err)
	}
	if _, err := NewTable(bytes.NewBuffer(f)); err != nil {
		t.Fatal(err)
	}
}
```
On an unmodified release 1.5.15, this test takes around 13 seconds on average.
```
=== RUN TestNewTable
--- PASS: TestNewTable (13.42s)
PASS
```
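CPU profiling flame graph:

![image](https://user-images.githubusercontent.com/82290/161848495-2aaffa5b-a990-4fa2-a60d-2575f1c26146.png)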
When I remove the `Sort` call from `Table.addRoute` and sort once at the end of `NewTable` instead, the test completes in less than a second on average.
```
=== RUN TestNewTable
--- PASS: TestNewTable (0.62s)
PASS
```
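CPU profiling flame graph:

![image](https://user-images.githubusercontent.com/82290/161848550-e5a509ed-16df-484a-aa41-94c9009ae30f.png)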
So that's a pretty significant improvement: roughly 13 seconds down to well under one. I'm working on a more production-like test setup to measure the difference in update speed while Fabio is actually under load.
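For anyone who wants to see the shape of the change before the pull request lands, here is a rough sketch of the idea: append without sorting, then sort each host's list once after everything is loaded. The `table` type, its methods, and the descending-by-path ordering are hypothetical stand-ins for illustration, not Fabio's actual `Table` or its real sort order.

```go
package main

import (
	"fmt"
	"sort"
)

// table maps a host to its route paths; "" is the blank host.
// Hypothetical stand-in for illustration, not Fabio's Table type.
type table map[string][]string

// sortPaths applies an assumed ordering (paths descending) for this sketch.
func sortPaths(ps []string) {
	sort.Slice(ps, func(i, j int) bool { return ps[i] > ps[j] })
}

// addSorted mirrors the slow pattern: re-sort the host's list on every add.
func (t table) addSorted(host, path string) {
	t[host] = append(t[host], path)
	sortPaths(t[host])
}

// add only appends; ordering is deferred to sortAll.
func (t table) add(host, path string) {
	t[host] = append(t[host], path)
}

// sortAll sorts every host's list once, after all routes are loaded.
func (t table) sortAll() {
	for _, ps := range t {
		sortPaths(ps) // sorts in place; ps shares t[host]'s backing array
	}
}

func main() {
	paths := []string{"/b", "/a/x", "/c", "/a"}
	eager, lazy := table{}, table{}
	for _, p := range paths {
		eager.addSorted("", p)
		lazy.add("", p)
	}
	lazy.sortAll()
	fmt.Println(eager[""]) // same final order either way
	fmt.Println(lazy[""])
}
```

Both variants end with the same ordering; the deferred version just pays for one sort per host instead of one sort per route. The trade-off is that the list is not in order until `sortAll` has run, so nothing should read it before then.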
I'll have a pull request soon, and would really appreciate any feedback as well as consideration in merging and cutting a new release. Thanks!