braintree / litmus_paper

Backend health tester for HA Services
MIT License
32 stars 15 forks source link

Add active and queued requests to text output of unix socket metric #43

Closed mikesphar closed 5 years ago

mikesphar commented 5 years ago

The UNIX socket metric makes use of active and queued socket connections as part of calculating the final weight. Those active and queued metrics can be valuable as they provide insight into the number of received but not-yet processed requests. By exposing those metrics in the text output detail, other utilities can extract that information from outside the host running the service. This can be especially useful when the service is running in a container, it allows services outside the container such as monitoring/metrics agents to collect this information without needing direct access to the host or container itself.

These metrics are also valuable for application of some queuing theory formulas, such as Little's Law. Active + Queued requests provide the L in Little's formula. And since the metric also exposes the maximum number of connections, this allows a third-party service such as a monitoring agent to use (active + queued) / max to calculate how close to maximum utilization the service running in the container is.

mikesphar commented 5 years ago

@lollipopman , @ssgelm !!!

mikesphar commented 5 years ago

Something that occurred to me while I was looking at this code, the way it's written now (even before this change) it appears the _stats function is being called multiple times, getting a different "snapshot" of the stats each time. And if I merge this as-is it's going to call it two more times to get the active and queued stats every time the .to_s is called.

I suspect the correct behavior here is to define instance variables for @active and @queued, set those in the current_health method like @weight, then change the weight calculation and the to_s method to all use the instance variable rather than calling the _stats method multiple times.

bjhaid commented 5 years ago

There's a variable here:

https://github.com/braintree/litmus_paper/blob/16c4d92400c2aed31243361e2f579a41960cb89b/lib/litmus_paper/metric/unix_socket_utilization.rb#L13

that should be used in place of all the various _stat calls here:

https://github.com/braintree/litmus_paper/blob/16c4d92400c2aed31243361e2f579a41960cb89b/lib/litmus_paper/metric/unix_socket_utilization.rb#L19-L20

mikesphar commented 5 years ago

@bjhaid Ha! I had not seen your comment yet but I had already noticed that reuse of the _stats method and was refactoring to reduce that.

bjhaid commented 5 years ago

The change LGTM as it is