Include Overhead Pct in provisioning logs.

Description

What problem are you trying to solve? I've been troubleshooting a provisioning issue for days. Instances would not get scheduled within my requirements, and it was not clear why.

My daemonset overhead plus pod requests came in under the memory limit of a 4xlarge instance, yet the pods were always provisioned on an 8xlarge. With tight constraints on a provisioner, I got no instance at all and the pods stayed stuck in Pending.

It turns out the vmMemoryOverheadPercent setting was pushing my requirement just over the 64GB memory boundary, but this is not visible anywhere in the event stream or the provisioner logs.

Example:

Expected Capacity Requirement -> expects 4xlarge:
DS Overhead (2GB) + Pod(user) Request (60GB) = 62GB
Actual Capacity Requirement -> requires 8xlarge:
DS Overhead (2GB) + Pod(user) Request (60GB) + vmMemoryOverheadPercent (62*0.075) = 66.65GB
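
For anyone who wants to reproduce the arithmetic, here is a minimal sketch that evaluates the two sums above exactly as written. It follows the simplified model in this example, where the overhead percentage is added on top of the requested memory; Karpenter itself may account for the overhead differently (for instance by reducing the memory an instance type advertises), so treat this as an illustration and not as the provisioner's actual logic. The function and constant names are hypothetical.

```python
# Figures taken from the example above. The 64GB value is the memory
# boundary mentioned in this issue, assumed here to be the 4xlarge size.
DS_OVERHEAD_GB = 2.0
POD_REQUEST_GB = 60.0
VM_MEMORY_OVERHEAD_PERCENT = 0.075
INSTANCE_MEMORY_GB = 64.0


def required_memory_gb(ds_overhead_gb: float, pod_request_gb: float,
                       overhead_fraction: float) -> float:
    """Requested memory plus a percentage overhead on top of it.

    Hypothetical helper mirroring the sum in the example, not Karpenter code.
    """
    requested = ds_overhead_gb + pod_request_gb
    return requested + requested * overhead_fraction


# What the user expects to need: daemonset overhead plus pod requests only.
expected_gb = DS_OVERHEAD_GB + POD_REQUEST_GB

# What is needed once the VM memory overhead is included.
actual_gb = required_memory_gb(
    DS_OVERHEAD_GB, POD_REQUEST_GB, VM_MEMORY_OVERHEAD_PERCENT
)

for label, value in (("expected", expected_gb), ("actual", actual_gb)):
    fits = value <= INSTANCE_MEMORY_GB
    print(f"{label}: {value:.2f}GB, fits in {INSTANCE_MEMORY_GB:.0f}GB: {fits}")
```

The point is that the expected figure fits under the boundary while the figure with overhead does not, and nothing in the current log output reveals which term made the difference.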

How important is this feature to you? Very. My end users rely on event streams to understand provisioning, and seeing this information would have made it clear why pods were not scheduling.

Ideal solution...

Include the VM memory overhead in this log / event:

incompatible with provisioner "gpu", daemonset overhead={"cpu":"725m","memory":"1308Mi","pods":"13"}, no instance type satisfied resources {"cpu":"1735m","memory":"58692Mi","nvidia.com/gpu":"1","pods":"14"}

aws / karpenter-provider-aws

Include Overhead Pct in provisioning logs. #4456
