konstructio / gitops-template

upstream template for your open source gitops repository
MIT License

fix: remove hard coded Kubernetes version for management cluster #779

Open muse-sisay opened 2 months ago

muse-sisay commented 2 months ago

DigitalOcean ended support for Kubernetes v1.27 on 28 June 2024. As a result, Kubefirst (via Terraform) can no longer provision a cluster. See: DigitalOcean Kubernetes Supported Releases

Description

To fix this issue, this PR removes the hard coded Kubernetes version and instead uses the digitalocean_kubernetes_versions data source to fetch the latest version that DigitalOcean supports. The downside of this change is that a new Kubernetes release might introduce breaking changes.
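A minimal sketch of what such a lookup could look like, not the actual diff of this PR. The resource labels, cluster name, region, and node pool values are placeholders chosen for illustration; only the digitalocean_kubernetes_versions data source and its latest_version attribute come from the DigitalOcean Terraform provider itself.

```hcl
terraform {
  required_providers {
    digitalocean = {
      source = "digitalocean/digitalocean"
    }
  }
}

# Look up the Kubernetes versions DigitalOcean currently offers.
# Uncommenting version_prefix would restrict the lookup to a single
# minor release line instead of tracking the newest release overall.
data "digitalocean_kubernetes_versions" "current" {
  # version_prefix = "1.28."
}

# Hypothetical cluster: name, region, and node pool values are placeholders.
resource "digitalocean_kubernetes_cluster" "mgmt" {
  name    = "example-mgmt"
  region  = "nyc3"
  version = data.digitalocean_kubernetes_versions.current.latest_version

  node_pool {
    name       = "default"
    size       = "s-2vcpu-4gb"
    node_count = 1
  }
}
```

With no version_prefix, the cluster follows whatever DigitalOcean lists as its latest release, which is where the breaking-change risk comes from. Setting a prefix keeps the minor version fixed while still letting patch releases float.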

fharper commented 2 months ago

Thanks for the PR @muse-sisay 🎉

As for the change itself, we tend to stay away from using latest of anything instead of a pinned version for the reason you mention: by pinning the version we are sure that it's working properly with our stack.

I think the fix would be more like updating the version number to 1.30.2 (latest available on DigitalOcean) instead of removing the version requirement. Let me do some testing on that Kubernetes version today to ensure it would work, so we can pin it.

muse-sisay commented 2 months ago

Happy to make my first contribution to Kubefirst @fharper! Let me know how your testing goes and I can pin the version to v1.30.

I was skimming through the repository and noticed that different clouds have different versions pinned. For example, the AWS directory is pinned to 1.26, while Google Cloud uses the "stable" release. Theoretically, Kubefirst should function similarly across the different Kubernetes distributions. If we had to define the Kubernetes version in a central location, where would the ideal place be?

fharper commented 2 months ago

I was skimming through the repository and noticed that different clouds have different versions pinned.

That is normal: for one thing, not all clouds provide the same Kubernetes versions in the same timeframe.

Theoretically, Kubefirst should function similarly across the different Kubernetes distributions.

Yes, and no. A Kubernetes version is rarely just a flavorless one, and to create a cluster, different clouds need different resources, so we upgrade the version per cloud when needed. You are right that you end up with a similar experience on all clouds, but how it works on the cloud underneath may differ.

If we had to define the Kubernetes version in a central location, where would the ideal place be?

We won't, for the reasons mentioned above.

So the next step is to define which version we want to pin DigitalOcean to. @jarededwards is currently testing the latest available, but stumbled upon some issues he needs to validate before we can confirm how to pin it.

@jarededwards: once you are done and have decided on the next pinned version, please share it here so @muse-sisay can update his PR and get his fix into kubefirst while becoming a new contributor 🎉

muse-sisay commented 2 months ago

Greetings @fharper,

I too ran into issues with Kubernetes v1.30 on DigitalOcean. I will copy and paste my response from Slack, as it also applies to other Kubernetes distributions.

Kubernetes v1.28 introduced improved failure handling for Jobs, which added JobPodReplacementPolicy and two fields: spec.podReplacementPolicy and status.terminating. The feature graduated to beta in 1.29. The schema parser module in ArgoCD v2.6.4, the version used by Kubefirst, is not aware of the new fields, so it throws the following error:

error calculating structured merge diff: error building typed value from live resource: 
errors: .spec.podReplacementPolicy: field not declared in schema .status.terminating: 
field not declared in schema

To fix the above issue, you can do one of the following:

  1. Disable server-side apply for argocd-kustomized-app
  2. Upgrade ArgoCD (server, repo server, controller) to v2.12.0-rc3 or v2.12.0-rc4, which uses the updated schema
  3. Stay on Kubernetes v1.28, where JobPodReplacementPolicy is behind a feature flag

Note: During testing make sure to disable auto sync if you're doing click-ops!

The oldest release DigitalOcean supports is v1.28, which reaches EOL on 28 October 2024. I will test out 1.28 and report back.


So the next step is to define which version we want to pin DigitalOcean to. @jarededwards is currently testing the latest available, but stumbled upon some issues he needs to validate before we can confirm how to pin it.

Can we have a single version across all cloud environments? Preferably defined in a single location that all Terraform modules can reference.

fharper commented 2 months ago

Can we have a single version across all cloud environments? Preferably defined in a single location that all Terraform modules can reference.

As explained previously, it's not possible, for several reasons. So let's just be sure we pin the right version for DigitalOcean right now. Let me check with @jarededwards where he is with his testing.

fharper commented 1 month ago

Was this the version you & @jarededwards tested? Just to be sure, because I won't test it if it's already done.

jarededwards commented 1 month ago

we had a chat in our community Slack with @muse-sisay, and we are going to start working on the changes to address his comment here