neondatabase / autoscaling

Postgres vertical autoscaling in k8s
Apache License 2.0

agent: Refactor metrics definitions #892

Closed: sharnoff closed 4 months ago

sharnoff commented 5 months ago

Mostly just extracted from #737. I thought it would be useful for https://github.com/neondatabase/autoscaling/pull/878#issuecomment-2033387609, and it's nice to have in general.

We can also rebase #750 on top of this now.


Changes are separated into three commits, best reviewed separately. I think they're probably best merged via rebase, but I also don't want to spam the repo history :sweat_smile:

LMK what you think.

sharnoff commented 5 months ago

> Do I understand correctly, api.Metrics will be a subset of core.Metrics -> e.g. LFC metrics will be collected by agent, but won't be sent over to scheduler?

That's exactly it, yeah.

We may end up with a separate metrics type for LFC metrics if we collect them at a different interval. Either way, the system metrics that the agent uses (1-minute load average + memory usage) are a superset of what the scheduler wants (just 1-minute load average).

sharnoff commented 5 months ago

Updated the message for the second commit; it now includes:

> This commit keeps api.Metrics for the second purpose, but makes a copy of the same type as core.Metrics so they're no longer linked. So, going forward, api.Metrics will be a subset of core.Metrics, because the scheduler only uses 1-minute load average, whereas the autoscaler-agent uses both 1-minute load average and memory usage.

LMK what you think
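
For readers skimming the thread, here is a minimal sketch of the subset relationship described in the commit message. It is not the code from this PR: `CoreMetrics` and `APIMetrics` are stand-ins for `core.Metrics` and `api.Metrics` (flattened into one package so the example runs on its own), and the field names and the `ToAPI` helper are assumptions made for illustration.

```go
package main

import "fmt"

// APIMetrics stands in for api.Metrics: the subset sent to the scheduler.
// The field name is an assumption for illustration.
type APIMetrics struct {
	LoadAverage1Min float64
}

// CoreMetrics stands in for core.Metrics: everything the autoscaler-agent
// uses for its own decisions. Field names are assumptions for illustration.
type CoreMetrics struct {
	LoadAverage1Min  float64
	MemoryUsageBytes float64
}

// ToAPI is a hypothetical helper that keeps only the fields the scheduler
// uses, dropping memory usage.
func (m CoreMetrics) ToAPI() APIMetrics {
	return APIMetrics{LoadAverage1Min: m.LoadAverage1Min}
}

func main() {
	collected := CoreMetrics{
		LoadAverage1Min:  0.75,
		MemoryUsageBytes: 512 * 1024 * 1024,
	}
	fmt.Printf("agent uses:     %+v\n", collected)
	fmt.Printf("scheduler gets: %+v\n", collected.ToAPI())
}
```

Because the two types are separate copies, a field added to the agent-side type later (LFC metrics, say) would not change what goes over the wire unless the conversion is updated explicitly.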

sharnoff commented 4 months ago

Merging, discussed via DM w/ @Omrigan