Open ulrfa opened 3 years ago
We could implement this naively, but that would mean adding more mutexes around every use of a setting that may change at any time (or using sync/atomic to access a struct).
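To make the sync/atomic option concrete, here is a minimal sketch of swapping an immutable settings snapshot behind an atomic pointer. The `Settings` struct, its fields, and the `Load`/`Replace` helpers are hypothetical names for illustration, not bazel-remote's actual code, and `atomic.Pointer` assumes Go 1.19 or later.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Settings is a hypothetical, immutable snapshot of reloadable options.
// It is never modified after being published; a reload builds a new one.
type Settings struct {
	AccessLogEnabled bool
	ProxyURL         string
}

// current holds the active snapshot. Readers do not take a lock.
var current atomic.Pointer[Settings]

// Load returns the active snapshot. Callers must treat it as read-only.
func Load() *Settings { return current.Load() }

// Replace atomically publishes a new snapshot.
func Replace(s *Settings) { current.Store(s) }

func main() {
	Replace(&Settings{AccessLogEnabled: false, ProxyURL: "http://parent.example:8080"})
	fmt.Println("access log:", Load().AccessLogEnabled)

	// A reload swaps in a whole new snapshot instead of mutating fields.
	Replace(&Settings{AccessLogEnabled: true, ProxyURL: "http://parent.example:8080"})
	fmt.Println("access log:", Load().AccessLogEnabled)
}
```

With this shape, the per-request cost is a single atomic load, and a request that loads the snapshot once sees a consistent set of values even if a reload happens while it is being served.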
I wonder if there's a way to avoid that overhead at the cost of some slight downtime when reloading, for example by restarting the http/grpc servers? If that's too difficult then maybe we can implement a fast-reload option: stop accepting requests, dump the index to disk, restart bazel-remote and import the index.
Thanks Mostyn,
I'm thinking of, for example, allowing cache.Proxy instances to be created and replaced at runtime, with the reference to the current proxy instance in disk.go protected by a mutex.
And perhaps, in a similar way, creating and replacing instances of the metrics.Metrics interface (https://github.com/buchgr/bazel-remote/blob/25e244e035a7364a4022187bb7a131e8c4b41c6f/utils/metrics/metrics.go#L75-L78).
As long as the replaced parts have well-defined interfaces and not too many dependencies, I think they could be replaced at runtime without too much added complexity.
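A rough sketch of what the mutex-protected reference could look like follows. The `Proxy` interface here is a deliberately simplified stand-in (the real cache.Proxy has different methods), and `proxyHolder` and `namedProxy` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Proxy is a simplified stand-in for the real cache.Proxy interface.
type Proxy interface {
	Name() string
}

// namedProxy is a toy implementation used only for this example.
type namedProxy string

func (p namedProxy) Name() string { return string(p) }

// proxyHolder guards the reference to the current proxy instance.
type proxyHolder struct {
	mu    sync.RWMutex
	proxy Proxy // nil means no proxy is configured
}

// Get returns the current proxy. The lock is held only while the
// reference is copied, not while the proxy is being used.
func (h *proxyHolder) Get() Proxy {
	h.mu.RLock()
	defer h.mu.RUnlock()
	return h.proxy
}

// Set replaces the current proxy. Passing nil unsets it.
func (h *proxyHolder) Set(p Proxy) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.proxy = p
}

func main() {
	h := &proxyHolder{}
	h.Set(namedProxy("old-parent"))

	inFlight := h.Get() // a request that started before the reload
	h.Set(namedProxy("new-parent"))

	// The in-flight request finishes against the old instance,
	// while new requests pick up the replacement.
	fmt.Println(inFlight.Name(), h.Get().Name())
}
```

One open question this sketch does not address is when the old instance can be shut down, since in-flight requests may still hold a reference to it.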
For me, even a slight downtime when reloading would not be OK, since it could cause ongoing remote execution builds to fail. (Downtime could also cause failed builds in pure cache scenarios for those using “builds-without-the-bytes”, unless https://github.com/bazelbuild/bazel/issues/10880 is resolved.)
I will not have time to implement any of the above now, but I wanted to raise this as background to the discussion in https://github.com/buchgr/bazel-remote/pull/350 about whether the Prometheus label configuration should be in a separate configuration file.
I haven't thought this through, but if you want to avoid downtime, could a specialized proxy work? I.e., it would receive requests from clients and forward them on to bazel-remote, with retries if bazel-remote temporarily stops accepting requests.
Sometimes I would like to change bazel-remote’s configuration, but restarting bazel-remote would interfere with ongoing builds.
It would be nice if a configuration file could be re-loaded by a SIGHUP signal.
Allowing any configuration parameter to change at any time might be too complicated, but perhaps we could support re-loading a few specific parameters, and log a warning if other parameters have also changed in the configuration file?
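As a sketch of that idea: on SIGHUP, apply only a whitelist of reloadable parameters and warn about the rest. The `Config` struct and its field names are made up for this example and do not match bazel-remote's real configuration; the re-read of the config file is replaced by a hard-coded value, and the demo sends SIGHUP to itself (Unix only) so that it terminates.

```go
package main

import (
	"log"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// Config is a hypothetical configuration with two kinds of fields.
type Config struct {
	AccessLog bool   // reloadable
	ProxyURL  string // reloadable
	Dir       string // requires a restart
	Port      int    // requires a restart
}

// applyReload copies the reloadable fields from next into cur and returns
// the names of non-reloadable fields that differ (and were left unchanged).
func applyReload(cur *Config, next Config) []string {
	var ignored []string
	if next.Dir != cur.Dir {
		ignored = append(ignored, "dir")
	}
	if next.Port != cur.Port {
		ignored = append(ignored, "port")
	}
	cur.AccessLog = next.AccessLog
	cur.ProxyURL = next.ProxyURL
	return ignored
}

func main() {
	cur := Config{Dir: "/data", Port: 8080}
	// Stand-in for re-parsing the configuration file after it was edited.
	next := Config{AccessLog: true, Dir: "/data", Port: 9090}

	hup := make(chan os.Signal, 1)
	signal.Notify(hup, syscall.SIGHUP)

	// Demo only: signal ourselves instead of waiting for an operator.
	if err := syscall.Kill(os.Getpid(), syscall.SIGHUP); err != nil {
		log.Fatal(err)
	}

	select {
	case <-hup:
		for _, name := range applyReload(&cur, next) {
			log.Printf("warning: %q changed in the config file but needs a restart; ignoring", name)
		}
		log.Printf("reloaded: access_log=%v proxy_url=%q", cur.AccessLog, cur.ProxyURL)
	case <-time.After(2 * time.Second):
		log.Print("no SIGHUP received")
	}
}
```

In a real server the signal loop would run for the lifetime of the process, and access to the live configuration from request handlers would still need synchronization of the kind discussed elsewhere in this thread.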
Use cases:
Enable/disable the access log. Today I redirect stderr to a file, and the access log on stdout to /dev/null, since the latter produces too much data. But occasionally I would like to temporarily enable the access log as well for a short period of time, without restarting bazel-remote.
Change the Prometheus label configuration without restarting bazel-remote (see discussion in https://github.com/buchgr/bazel-remote/pull/350).
Set, unset and change the HTTP proxy URL without restarting bazel-remote, e.g. if the parent cache goes down or is moved to another URL. Or when migrating users from one bazel-remote instance to another, by temporarily configuring the instances as proxies for each other, for a smooth migration. (Ongoing remote execution builds fail if input files uploaded by the client to the old instance are not available to the remote executor on the new instance.)
Change configuration dynamically to reduce proxy load on the parent cache instance if the parent cache becomes overloaded. (I'm not sure whether that makes sense; the other use cases are more important. See https://github.com/buchgr/bazel-remote/issues/351)