Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

[Feature] AKS Auto upgrade - support for allowed or not_allowed version list for auto upgrade #4031

Open tsivachi opened 11 months ago

tsivachi commented 11 months ago

Is your feature request related to a problem? Please describe.

We are currently utilizing several multi-tenant Azure Kubernetes Service (AKS) clusters, which deploy various application-specific microservices. Each microservice has its own set of Kubernetes component dependencies, and some are reliant on Commercial Off-The-Shelf (COTS) software products that are only certified for specific Kubernetes versions. Our challenge lies in ensuring system stability and compatibility during automatic AKS upgrades, given these version dependencies. To address this, we're seeking a feature that offers more control over the Kubernetes versions or timelines taken into account during the auto-upgrade process.

Describe the solution you'd like

Our specific request is for a feature that lets us develop an 'allow list' and an 'exclude list' of Kubernetes versions for the auto upgrade. This would give us the flexibility to steer the auto-upgrade process and ensure it aligns with our specific needs and restrictions. For instance, as AKS platform owners, we may identify certain Kubernetes versions as upgradeable and others as non-upgradeable.

AutoUpgrade configurations:
Cluster auto-upgrade channel: "stable"
Planned maintenance schedule: Set at Relative Monthly, 3 months interval, Last Sunday of the month.

AutoUpgrade feature requested:
Upgradeable versions: 1.25.x - 1.29.x and 1.31.x - 1.32.x (the reason being that these versions are compatible with our COTS product)
Non-upgradeable version: 1.30.x (this version impacts our COTS product and would require an additional development cycle to address)

Different Kubernetes version upgrades could lead to varying development timeline expectations to resolve API compatibility issues. The ability to adjust the timeline based on the complexity involved with a version upgrade, or to skip a specific version auto upgrade, could be advantageous. Ideally, the AKS Auto-upgrade feature should include an attribute that allows us to skip upgrading a cluster version (e.g., from 1.29.x to 1.30.x), generate an event for the AKS cluster owner to process, and let the AKS Cluster owner plan their upgrades in line with the dependency complexities associated with that version.
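To make the request concrete, the allow/exclude behavior described above could be sketched roughly as follows. This is only an illustration of the decision logic, not an existing AKS API: the `UpgradePolicy` class, its `allowed` and `excluded` fields, and the `decide` helper are all hypothetical names invented for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def minor_of(version: str) -> tuple[int, int]:
    """Return (major, minor) for a version string such as '1.30.2' or '1.30'."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


@dataclass
class UpgradePolicy:
    """Hypothetical policy object; AKS does not expose this today."""

    # Inclusive (low, high) ranges of minor versions that auto-upgrade may target.
    allowed: list[tuple[str, str]] = field(default_factory=list)
    # Minor versions that auto-upgrade must never target.
    excluded: list[str] = field(default_factory=list)

    def permits(self, version: str) -> bool:
        target = minor_of(version)
        # The exclude list wins over the allow list.
        if any(target == minor_of(v) for v in self.excluded):
            return False
        return any(minor_of(lo) <= target <= minor_of(hi) for lo, hi in self.allowed)


def decide(policy: UpgradePolicy, current: str, target: str) -> str:
    """Describe what auto-upgrade would do for a proposed target version."""
    if policy.permits(target):
        return f"upgrade {current} -> {target}"
    # In the requested feature, this is where an event would be raised.
    return f"skip {target}; notify cluster owner"


# The version lists from the request above.
policy = UpgradePolicy(
    allowed=[("1.25", "1.29"), ("1.31", "1.32")],
    excluded=["1.30"],
)

if __name__ == "__main__":
    print(decide(policy, "1.28.9", "1.29.4"))
    print(decide(policy, "1.29.4", "1.30.1"))
```

With these lists, a proposed move to any 1.30.x patch is skipped and reported, while targets in 1.25.x through 1.29.x or 1.31.x through 1.32.x are allowed.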

Describe alternatives you've considered

An alternative would be to monitor API compatibility issues in advance and delay the upcoming auto-upgrade maintenance window accordingly. However, we may not always have a confirmed date for when dependency fixes will be available.

We believe this feature would benefit not only us but also other AKS users facing similar challenges.

Thanks, Thaniga

serbrech commented 9 months ago

Not addressing your feature request, but just this section:

generate an event for the AKS cluster owner to process

https://learn.microsoft.com/en-us/azure/event-grid/event-schema-aks?tabs=event-grid-event-schema#example-events

kaarthis commented 7 months ago

This is an interesting idea, though it is not on the immediate roadmap for now. We are focusing on unsupported clusters and how best to bring them onto a supported version.