buchgr / bazel-remote

A remote cache for Bazel
https://bazel.build
Apache License 2.0
595 stars 154 forks source link

Unbounded number of connections to the HTTP proxy #678

Open gabrielrussoc opened 1 year ago

gabrielrussoc commented 1 year ago

Hi all,

I started exploring a two-layer deployment of a bazel remote where the first layer points to a second deployment of the bazel remote via the --http_proxy.url flag. I noticed a very high CPU usage on the first layer coming from dialling TCP connections:

Screenshot 2023-07-19 at 15 54 58

It turns out the proxy uses the default http client, which under the hood uses the default transport that does not limit the number of connections: https://github.com/buchgr/bazel-remote/blob/46ab6a35b660fbfc8869970d805f81eaebec8e10/config/proxy.go#L29

From https://pkg.go.dev/net/http:

// MaxConnsPerHost optionally limits the total number of
// connections per host, including connections in the dialing,
// active, and idle states. On limit violation, dials will block.
//
// Zero means no limit.
MaxConnsPerHost int

Should we expose a knob to control this value and limit the amount of connections we open? In practice this ended up hogging all the CPU on the machine which causes all sorts of weirdness. It sounds reasonable to optionally block new connections instead, and perhaps even to introduce a metric to show the amount of active requests to the proxy

Happy to send a PR if this all makes sense

mostynb commented 1 year ago

Sounds like a nice feature to have. Is there a reasonable heuristic that could set this value automatically, or would this only be useful with a new configuration setting?

gabrielrussoc commented 1 year ago

We could try for some multiple of the number of available CPUs, but this could cause unexpected regressions to folks already relying on the unbounded behaviour. I'd rather keep this obvious behind a flag to avoid any surprises when upgrading

mostynb commented 1 year ago

OK, sounds reasonable.