gforcada / haproxy_log_analysis

HAProxy log analyzer
https://pypi.org/project/haproxy_log_analysis
GNU General Public License v3.0

Module to help select a value for maxconn based on correlation between Tr and beconn #6

Open timbunce opened 10 years ago

timbunce commented 10 years ago

A given backend server can typically only handle a certain number of requests before it becomes overloaded. The haproxy maxconn setting can be used to limit the number of connections to just before that point. The trick is finding the right value.

It ought to be possible to at least suggest a value based on the correlation between the request response time (Tr) and the number of connections to the backend (beconn).

See this article for a nice graph showing what I mean.

It seems that this would be a natural fit for haproxy_log_analysis.

Sadly my maths/stats isn't up to the job of recommending how to do a good analysis of the (very noisy) correlation, but even a poor analysis would probably be very useful.

Simply generating a table like the one in the article would probably be sufficient. That would require bucketing the value of beconn in, say, units of 10, and collecting the Tr of each log record into the matching bucket so that a median Tr can be calculated per bucket. After crunching the logs for a period that covers both high and low traffic, you'd print a table of beconn buckets and their median Tr. I'd expect the table for a host that's getting overloaded to show a clear 'knee' in the median Tr when it reaches the max safe load, and that's the value to use for maxconn.
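A minimal sketch of that bucketing, assuming the log lines have already been parsed into `(beconn, Tr)` pairs. The function name `median_tr_by_beconn`, the sample records, and the decision to skip negative Tr values (HAProxy logs -1 for a timer that never completed) are illustrative assumptions, not part of haproxy_log_analysis:

```python
from collections import defaultdict
from statistics import median


def median_tr_by_beconn(records, bucket_size=10):
    """Group (beconn, tr) pairs into beconn buckets and return the median Tr per bucket."""
    buckets = defaultdict(list)
    for beconn, tr in records:
        if tr < 0:
            # Assumption: a negative Tr means the request was aborted, so skip it.
            continue
        bucket_start = (beconn // bucket_size) * bucket_size
        buckets[bucket_start].append(tr)
    return {start: median(values) for start, values in sorted(buckets.items())}


# Invented sample data: (beconn, Tr in milliseconds).
sample_records = [(3, 100), (7, 120), (12, 200), (15, -1), (18, 400), (25, 900)]
table = median_tr_by_beconn(sample_records)

print("beconn    median Tr")
for start, median_tr in table.items():
    print(f"{start:>3}-{start + 9:<3}   {median_tr:>8.1f}")
```

With real logs, the knee would show up as the first bucket where the median Tr jumps well above the buckets before it.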

If the median queue size (srv_queue) were also calculated per beconn bucket, it would be easy to spot whether maxconn has already been set but set too low. In that case srv_queue would start rising without a corresponding rise in Tr.
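The same idea can be sketched with the queue size added, assuming each parsed record is a `(beconn, Tr, queue size)` tuple. The name `medians_by_beconn` and the sample data are hypothetical and only illustrate the calculation:

```python
from collections import defaultdict
from statistics import median


def medians_by_beconn(records, bucket_size=10):
    """Return {bucket_start: (median Tr, median queue size)} per beconn bucket."""
    buckets = defaultdict(lambda: ([], []))
    for beconn, tr, srv_queue in records:
        if tr < 0:
            # Assumption: skip requests whose Tr timer never completed.
            continue
        trs, queues = buckets[(beconn // bucket_size) * bucket_size]
        trs.append(tr)
        queues.append(srv_queue)
    return {
        start: (median(trs), median(queues))
        for start, (trs, queues) in sorted(buckets.items())
    }


# Invented sample: Tr stays flat while the queue grows, which would
# hint that maxconn is already set, and set too low.
sample_records = [
    (5, 100, 0), (8, 110, 0),
    (14, 105, 0), (16, 115, 2),
    (22, 110, 8), (27, 112, 12),
]
summary = medians_by_beconn(sample_records)

for start, (median_tr, median_queue) in summary.items():
    print(start, median_tr, median_queue)
```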

The ideal setting for maxconn would be one that shows a small but harmless increase in Tr.
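One crude way to turn such a table into a suggestion, offered only as a heuristic sketch: treat the lowest beconn bucket as the baseline and pick the last bucket whose median Tr stays within a tolerance of it. The function `suggest_maxconn`, the 25% tolerance, and the sample table are all assumptions made for illustration, not a statistically sound analysis:

```python
def suggest_maxconn(table, tolerance=1.25):
    """Return the start of the last beconn bucket before median Tr exceeds baseline * tolerance.

    `table` maps bucket start -> median Tr. Returns None for an empty table.
    """
    bucket_starts = sorted(table)
    if not bucket_starts:
        return None
    baseline = table[bucket_starts[0]]
    suggestion = bucket_starts[0]
    for start in bucket_starts:
        if table[start] > baseline * tolerance:
            # First bucket past the knee: stop and keep the previous one.
            break
        suggestion = start
    return suggestion


# Invented table: median Tr (ms) per beconn bucket, with a knee at 30.
sample_table = {0: 100, 10: 105, 20: 120, 30: 400, 40: 900}
suggested = suggest_maxconn(sample_table)
print("suggested starting point for maxconn:", suggested)
```

The result is only a starting point to check against the full table, since noisy logs can easily push a single bucket over the threshold.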

gforcada commented 10 years ago

That's a really nice feature to add for sure.

Thanks for taking the time to properly explain it.

Regretfully, I don't have time to work on it right now. As you have probably seen, I haven't worked on this project for a long time, and I don't expect to work on it much more either.

That said, I welcome any kind of contribution, be it rough or polished, big or small, and I will try to help bring everything together.

I created a release on PyPI so that, if you want to use it (I hope you do :), you do not need a git checkout: https://pypi.python.org/pypi/haproxy_log_analysis

norcis commented 8 years ago

Great idea! Any news about it?

gforcada commented 4 years ago

I just finished rewriting haproxy_log_analysis completely and released it as version 4.0.0. It still does not have such a command, but it is now easier than ever to write new commands. If anyone is still interested in having one, now is the perfect time :+1: