neondatabase / autoscaling

Postgres vertical autoscaling in k8s
Apache License 2.0
144 stars 18 forks source link

Metrics and clearer logs for denied scaling #940

Open sharnoff opened 2 months ago

sharnoff commented 2 months ago

Problem description / Motivation

We don't have clear signals for when the scheduler denies upscaling, or the vm-monitor denies downscaling.

This came up here: https://neondb.slack.com/archives/C03TN5G758R/p1716167732610919?thread_ts=1716166712.233079

Feature idea(s) / DoD

The autoscaler-agent can:

  1. Expose new instance-level metrics for amount of scaling requests that were not satisfied
  2. Change the log level of scheduler responses that didn't grant the full request to "warn", rather than just "info"

Implementation ideas

Should be simple enough to inject these alongside the existing logs & metrics, wherever they're defined (IIRC exec_bridge / executor ?)