In short: In addition to scaling when there's a lot of memory used by postgres, we should also scale up to make sure that enough of the LFC is able to fit into the page cache alongside it.
To answer "how much is enough of the LFC", this PR takes the minimum of the estimated LFC working set size (from window size) and the cached memory (from the Cached field of /proc/meminfo, via vector's host metrics).
This PR also adds the memoryTotalFractionTarget field to the scaling config, serving a similar purpose to memoryUsageFractionTarget, but applying to the total of memory usage and cached data.
This PR is part of #1030 and must be deployed before the vm-monitor changes in order to make sure we don't regress performance for workloads that are both memory-heavy and rely on LFC being in the VM's page cache.
This PR is broken into two commits -- the first is a refactor to make the second one easier to read.
Planning to test on staging with neondatabase/neon#8668 using some familiar workloads (in particular: the LFC-aware scaling tests and a pgvector index build).
For more info, see: https://www.notion.so/neondatabase/0f75b15d47ad479094861302a99114af.