We run GitHub Actions self-hosted runners on spot instances. When a spot interruption occurs, Karpenter evicts the runner Pod, and it is rescheduled onto a new Node. This wastes EC2 cost, because the runner Pod is not re-runnable.
That is,
1. The controller (actions-runner-controller) creates a runner Pod.
2. A spot interruption occurs in AWS.
3. Karpenter evicts the runner Pod, and it is rescheduled onto a new Node. This may launch a new EC2 instance. 💰
4. The new runner Pod starts but eventually exits with an error; the job is not re-runnable.
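For context, this is a minimal sketch of the kind of spot-only NodePool involved (the name and values are illustrative, not our exact configuration):

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: github-runners   # illustrative name
spec:
  template:
    spec:
      requirements:
        # Restrict this pool to spot capacity, which is what makes
        # the runner Pods subject to spot interruptions.
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
```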
I'm not entirely against this, but we'd need an RFC to know how this might be implemented/configured. Do you have any thoughts on how you'd best want to do that? Are you willing to write an RFC for this?
Description
What problem are you trying to solve?
It would be nice if a NodePool supported a cordon-only mode instead of eviction. I found a related issue: https://github.com/aws/karpenter-provider-aws/issues/3604.
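To seed the RFC discussion, here is one hypothetical shape the configuration could take. The `interruptionPolicy` field and its `CordonOnly` value are invented for this sketch; Karpenter does not support them today:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: github-runners   # illustrative name
spec:
  disruption:
    # Hypothetical field for this proposal: on a spot interruption,
    # only cordon the Node so no replacement capacity is launched,
    # instead of evicting (draining) the running runner Pods.
    interruptionPolicy: CordonOnly   # invented value; the default would be today's drain behavior
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
```

The idea is that the interrupted instance is reclaimed by AWS anyway, so cordoning (rather than draining) lets the in-flight job fail in place without Karpenter launching a replacement Node for Pods that cannot be re-run.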
How important is this feature to you?
This feature would reduce our EC2 cost, because no new instance would be launched upon a spot interruption.
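As a partial mitigation in the meantime, runner Pods can carry Karpenter's `karpenter.sh/do-not-disrupt` annotation so that voluntary disruption (consolidation, drift) leaves running jobs alone. To my understanding this does not stop the drain triggered by a spot interruption, which is why a NodePool-level option is still needed. A sketch using the legacy actions-runner-controller `RunnerDeployment` CRD (names are illustrative):

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runner-deployment   # illustrative name
spec:
  replicas: 2
  template:
    metadata:
      annotations:
        # Asks Karpenter not to voluntarily disrupt the Node while this
        # Pod is running. It does not, as far as I know, prevent the
        # drain on a spot interruption warning.
        karpenter.sh/do-not-disrupt: "true"
    spec:
      repository: example-org/example-repo   # illustrative repository
```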