honeycombio / refinery

Refinery is a trace-aware tail-based sampling proxy. It examines whole traces and intelligently applies sampling decisions (whether to keep or discard) to each trace.

Vertical Pod Autoscaling for Refinery #605

Open fujin opened 1 year ago

fujin commented 1 year ago

When I set up my Refinery clusters (8 environments, one cluster per environment), I followed the advice in the documentation to size my cache, and I haven't re-sized the caches since.

I noticed (albeit much later) that my per-environment Refineries have drastically different CPU/memory profiles, which is not entirely unreasonable, but I want to free myself from managing the resource requests by hand, as I try to do for everything I run. As I have done in many other cases (especially for Go workloads), I looked into vertical pod autoscaling (VPA).

If you are unfamiliar with VPA, the premise is that a histogram of CPU and memory utilisation (covering about 8 days, iirc) is used to calculate a recommendation, plus lower and upper bounds, for each container's resource requests; it also takes OOM-kill events into account. It can be configured so that recommendations are only applied when a pod is scheduled, or so that it automatically evicts the workload to apply a change (a voluntary disruption). I know the Refinery docs recommend against scaling the HPAs down because of trace loss, so that would have to be thought through before using the Auto mode. The other mode, Off, only calculates recommendations, which is useful for feeding into other tools.

It looks like, if Refinery were to infer MaxAlloc from the container memory limit and calculate CacheCapacity from that, then whenever the resource request/limit is changed, e.g. by VPA (or even manually), both MaxAlloc and CacheCapacity would adjust themselves automatically. It's fairly straightforward to take this information from the cgroups, and you may already be doing so for GOMAXPROCS; a rough sketch follows below. I accept that it wouldn't be one-size-fits-all.
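
To illustrate the idea, here is a minimal sketch of reading the container memory limit from the cgroup filesystem and deriving the two settings from it. The cgroup paths are the standard v2/v1 locations; the 20% headroom and the bytes-per-trace divisor are made-up numbers purely for illustration, not Refinery's actual behaviour.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// containerMemoryLimit returns the memory limit imposed on this container,
// reading cgroup v2 first and falling back to cgroup v1.
// A return value of 0 means "no limit detected".
func containerMemoryLimit() (uint64, error) {
	// cgroup v2: the literal string "max" means unlimited.
	if b, err := os.ReadFile("/sys/fs/cgroup/memory.max"); err == nil {
		s := strings.TrimSpace(string(b))
		if s == "max" {
			return 0, nil
		}
		return strconv.ParseUint(s, 10, 64)
	}
	// cgroup v1 fallback.
	b, err := os.ReadFile("/sys/fs/cgroup/memory/memory.limit_in_bytes")
	if err != nil {
		return 0, err
	}
	return strconv.ParseUint(strings.TrimSpace(string(b)), 10, 64)
}

func main() {
	limit, err := containerMemoryLimit()
	if err != nil || limit == 0 {
		fmt.Println("no container memory limit found; fall back to configured values")
		return
	}

	// Illustrative derivation only: leave ~20% headroom for MaxAlloc, and
	// assume an average in-memory footprint per cached trace to size
	// CacheCapacity. Neither number comes from Refinery.
	maxAlloc := limit * 80 / 100
	const assumedBytesPerTrace = 100_000
	cacheCapacity := maxAlloc / assumedBytesPerTrace

	fmt.Printf("inferred MaxAlloc=%d CacheCapacity=%d\n", maxAlloc, cacheCapacity)
}
```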

automaxprocs is also relevant here: it sets GOMAXPROCS based on the container's CPU quota (which VPA would adjust).
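
For reference, adopting automaxprocs is just a blank import in main; this is the library's standard usage rather than anything Refinery-specific:

```go
package main

import (
	"fmt"
	"runtime"

	// On init, adjusts GOMAXPROCS to match the container's CPU quota
	// (and logs what it changed).
	_ "go.uber.org/automaxprocs"
)

func main() {
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
}
```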

kentquirk commented 1 year ago

Hi, @fujin. Apologies for the delay; we had a company offsite last week.

Your proposal is interesting. It's also a little tricky, requiring specific features of k8s (afaict). I spent about an hour on it but wasn't able to find a Go library that I thought would be a relatively easy drop-in. If you know of one, please let me know. It's not a must-have, but it would make things easier.

In any case, given the sequence of things we're currently planning, this doesn't really fit in immediately, but I can see us getting to it sometime later in the spring.