Open stobias123 opened 1 year ago
So I looked into this a little bit. Currently the VMMemoryOverhead percent lives in the AWS portion of the ConfigMap settings. This means that only the aws/karpenter code is aware of this concept, whereas this log line is emitted from the karpenter-core code. I could see us eventually moving the setting out to the core settings once we're sure this is a common mechanism across cloud providers, but we'd need to do that before we can include it in the log line.
Description
What problem are you trying to solve? I've been troubleshooting a provisioning issue for days. Instances would not get scheduled within my requirements, and it was not clear why.
I checked my daemonset overhead + pod requests and I would be under the limit for a 4xlarge type instance, but always be provisioned on an 8xlarge. If I had tight constraints on a provisioner, I would get no instance at all, and pods would be stuck pending.
It turns out the vmMemoryOverheadPercent was bumping me up just barely over the 64GB memory barrier, but this information is completely invisible through event streams and provisioner logs.
Example:
How important is this feature to you? Very - my end users have visibility into event streams to understand provisioning, and seeing this info would have made it clear why pods were not scheduling.
Ideal solution...
Include the VM Overhead in this log / event.
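For anyone hitting the same thing, here is a rough sketch of the arithmetic described above. It is not Karpenter's actual code: the helper names are hypothetical, the 0.075 overhead value and the 61 GiB request total are assumed numbers for illustration, and the formula (advertised memory reduced by the overhead fraction) is a simplification of whatever the provisioner really computes.

```python
GIB = 1024 ** 3


def effective_memory_bytes(advertised_bytes: int, vm_memory_overhead_percent: float) -> int:
    """Memory assumed to be schedulable after subtracting the VM overhead fraction.

    Assumption: overhead is modelled as a flat fraction of advertised memory.
    """
    return int(advertised_bytes * (1 - vm_memory_overhead_percent))


def fits(requested_bytes: int, advertised_bytes: int, vm_memory_overhead_percent: float) -> bool:
    """True if the total requests fit within the overhead-adjusted memory."""
    return requested_bytes <= effective_memory_bytes(advertised_bytes, vm_memory_overhead_percent)


advertised = 64 * GIB   # memory advertised for a 64 GiB instance size
requested = 61 * GIB    # hypothetical daemonset overhead + pod requests
overhead = 0.075        # assumed vmMemoryOverheadPercent value

# Without the overhead the requests look like they fit on the 64 GiB instance.
print("fits ignoring overhead:", fits(requested, advertised, 0.0))
# With the overhead applied they no longer fit, so a larger size is needed.
print("fits with overhead:", fits(requested, advertised, overhead))
print("effective GiB:", effective_memory_bytes(advertised, overhead) / GIB)
```

Under these assumed numbers the requests fit the advertised 64 GiB but not the overhead-adjusted figure, which is the kind of gap that would be obvious if the overhead appeared in the log line or event.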