Open L3tum opened 3 years ago
There has not been any activity to this issue in the last 14 days. It will automatically be closed after 7 more days. Remove the stale label to prevent this.
Sorry for the late reaction! That's a great suggestion and would certainly be useful in highly dynamic environments. :+1: I cannot make any promises as to when (or even if) any of us might get on this, but in the meantime I'd happily review any PR coming my way concerning this feature.
Hey, that's good to hear! I'm not really a Go expert, but I'll see if I can take a stab at it. Does Go offer any niceties for this behaviour? In C#, for example, I'd start a background thread that sleeps and checks a shared object every ~100ms (more or less often, depending on how well this works). The endpoint listener would then update that shared object with the new endpoint configuration (and maybe signal the background thread with a boolean that it was updated).
I think the best point to start would probably be the VarnishController.watchConfigUpdated and/or VarnishController.rebuildConfig functions. The former contains the main loop, which joins the various configuration updates from multiple sources in a single thread and would probably be a reasonable place to introduce rate limiting of config updates, whereas the latter contains the actual VCL update logic.
For the actual rate limiting, a structure like rate.Limiter looks like it might do the job. However, any implemented solution should assert that the main loop never blocks: a time.Sleep in the main loop won't work, as that will clog the channels that feed into that function, and also the goroutines/threads that write into those. The latest state should also stay in the VarnishController itself (since all updated structures are copied there within the main loop, which should thus remain unthrottled), meaning that even when a rebuildConfig call is delayed, it will have a current config when eventually executed.
Is your feature request related to a problem? Please describe.
Aggressive autoscaling can trigger many config updates in a short time, rendering Varnish unable to respond.
Describe the solution you'd like
Have a configurable grace period before issuing a config update, and discard pending updates if another change is registered in the meantime.
Describe alternatives you've considered
Less aggressive autoscaling or just living with it, but neither seems that hot.
Additional context
We have pretty aggressive autoscaling since we have to get from 0 to 10,000 requests per second in 2 minutes. However, because that autoscaling is so aggressive, new Varnish instances keep coming up, which results in config updates. Each config update adds a bit of latency, and if a lot of Varnish instances go online in a short period, the latency can shoot up to ~1s, while Varnish itself can serve the request in ~3ms.