Open mythi opened 1 month ago
peerpod-ctrl is meant to reap orphan VMs. In addition, using node extended resources for capacity management, together with a webhook that mutates the pod spec to add the peerpod node extended resource, will avoid resource misuse. Ref: https://github.com/confidential-containers/cloud-api-adaptor/tree/main/src/webhook
This needs to be added to the website instructions. cc @surajssd
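To illustrate the extended-resource idea, here is a minimal sketch of the kind of mutation such a webhook could apply. It is not the webhook's actual code: the resource name `kata.peerpods.io/vm`, the runtime class `kata-remote`, and the function name `add_peerpod_resource` are assumptions for illustration, so check the webhook README for the real values.

```python
import copy

# Assumed extended resource name and runtime class; verify against the webhook docs.
PEERPOD_RESOURCE = "kata.peerpods.io/vm"
PEERPOD_RUNTIME_CLASS = "kata-remote"


def add_peerpod_resource(pod, resource=PEERPOD_RESOURCE,
                         runtime_class=PEERPOD_RUNTIME_CLASS):
    """Return a pod manifest (plain dict) that requests one peer-pod VM.

    Pods that do not use the peer-pod runtime class are returned unchanged.
    The input manifest is never modified in place.
    """
    if pod.get("spec", {}).get("runtimeClassName") != runtime_class:
        return pod

    mutated = copy.deepcopy(pod)
    containers = mutated["spec"].get("containers", [])
    if containers:
        # One VM per pod, so the resource is added to the first container only.
        resources = containers[0].setdefault("resources", {})
        # Extended resources must have requests equal to limits.
        resources.setdefault("requests", {})[resource] = "1"
        resources.setdefault("limits", {})[resource] = "1"
    return mutated


pod = {
    "spec": {
        "runtimeClassName": "kata-remote",
        "containers": [{"name": "nginx", "image": "nginx"}],
    }
}
mutated_pod = add_peerpod_resource(pod)
```

With the node advertising a fixed capacity for that resource, the scheduler would leave further pods `Pending` once the capacity is used up, instead of letting them request more VMs than the cloud quota allows.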
Even without the peerpod-ctrl, the VMs should be garbage collected (unless the CAA daemonset is in a crash-loop too and loses its state). It would be interesting to know how CAA ends up in a state where this happens. If peerpod-ctrl is the only way to get VMs and resources removed reliably, we need to make it part of the normal installation routine, IMO.
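For reference, the reconciliation a reaper needs boils down to a set difference between the VMs that exist in the cloud and the VMs still backing a live pod. The sketch below is only that idea in isolation. It is not how peerpod-ctrl or CAA is implemented, and the function name and VM names are hypothetical.

```python
def find_orphan_vms(cloud_vms, active_pod_vms):
    """Return, sorted, the VM names that exist in the cloud but back no live pod."""
    return sorted(set(cloud_vms) - set(active_pod_vms))


# Hypothetical VM names, for illustration only.
cloud_vms = ["podvm-a", "podvm-b", "podvm-c"]
active_pod_vms = ["podvm-b"]
orphans = find_orphan_vms(cloud_vms, active_pod_vms)
# orphans == ["podvm-a", "podvm-c"]
```

If the component that tracks `active_pod_vms` loses its state on restart, it can no longer tell which VMs it created, which is why an independent source of truth for the VM-to-pod mapping matters.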
I had set up CAA on Azure following the website instructions. The nginx deployment worked fine for some time, but something causes the pod to crash/restart, and eventually it gets stuck in `ContainerCreating`.
Taking a closer look, I can see
Current Limit: 350, Current Usage: 350, Additional Required
The Azure VM resource quota limit is exceeded and lots of peer-pods VMs are running. Checking the peer-pods daemonset logs, I can see