fix: default to multizone karpenter deployment

Name	Link
Latest commit	5525628c4ac860dcfba53f91b63712d5b9d5d1a8
Latest deploy log	https://app.netlify.com/sites/karpenter-docs-prod/deploys/6669ed525494310008768b23

Pull Request Test Coverage Report for Build 9488069911

Details

0 of 0 changed or added relevant lines in 0 files are covered.
1 unchanged line in 1 file lost coverage.
Overall coverage decreased (-0.01%) to 82.467%

Files with Coverage Reduction	New Missed Lines	%
pkg/providers/amifamily/ami.go	1	90.56%

Totals
Change from base Build 9475149132:	-0.01%
Covered Lines:	5550
Relevant Lines:	6730

I'm on board with changing this to be the default, but should we surface a configuration option in the helm chart to override this default? If we had done this from the start, I'd defer to user demand, but this would now be a breaking change for customers who may already have infra set up that relies on this being a ScheduleAnyway TSC.

rschalo commented 3 months ago

So, my thinking is that if we leave this as is and don't surface an option to override, there are two kinds of cx that would break. The first are those that would already break given a zonal outage, which would not see different behavior with DoNotSchedule as the default. The other case is that Karpenter is running two replicas on the same node, the leader pod goes unhealthy and loses leadership, and then there is not a second node because of the DoNotSchedule TSC.

This PR changes the value, which can be changed by cx. Setting this by default requires a chart default update that we can revisit later, but I think changing the values and making this recommendation is sufficient for now.
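
For reference, a minimal sketch of what a values override could look like for cx who want to keep the previous soft-spread behavior. It assumes the chart reads a top-level `topologySpreadConstraints` list and that the chart template supplies the label selector; neither is confirmed in this thread, and the file name is hypothetical. The field names (`maxSkew`, `topologyKey`, `whenUnsatisfiable`) are the standard Kubernetes topology spread constraint fields.

```yaml
# my-values.yaml (hypothetical file name)
# Assumption: the chart exposes a top-level `topologySpreadConstraints` list
# and adds the pod label selector in its own template.
topologySpreadConstraints:
  - maxSkew: 1
    # Spread the Karpenter replicas across availability zones.
    topologyKey: topology.kubernetes.io/zone
    # ScheduleAnyway: soft constraint, replicas may share a zone if needed.
    # DoNotSchedule: hard constraint, a replica stays Pending rather than
    # violate the zone spread.
    whenUnsatisfiable: ScheduleAnyway
```

Passing a file like this with helm's `-f` flag would opt a cluster back into ScheduleAnyway, at the cost of possibly running both replicas in a single zone.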

aws / karpenter-provider-aws