fluxcd / terraform-provider-flux

Terraform and OpenTofu provider for bootstrapping Flux
https://registry.terraform.io/providers/fluxcd/flux/latest
Apache License 2.0

[Bug]: Kustomization & Bucket Object API Version automatically getting changed from v1beta1 to v1 & v1beta2 resp. #716

Closed satyamsareen007 closed 2 months ago

satyamsareen007 commented 2 months ago

Describe the bug

Hi Team,

Background:

We are running Flux 2.3.0 on EKS. We have used the bootstrap resource to install Flux.

Recently we have upgraded Flux from 2.2.0 --> 2.2.3 --> 2.3.0.

When we were on Flux 2.2.0, we did not use the bootstrap resource; we used the flux_install data source instead. During our migration to the bootstrap resource, while upgrading Flux to 2.2.3, we couldn't import our existing Flux installation due to some config issues, so we had to clean up and bootstrap a fresh installation of Flux.

Issue: We have started to observe a very strange issue with our Kustomization & Bucket objects, where their API versions have automatically gotten changed from v1beta1 to v1 & v1beta2 resp.

We create our Kustomization & Bucket objects using the kubectl Terraform provider, where we specify the v1beta1 API version. The Terraform state also records v1beta1 as the API version of the live objects, but the actual Kustomization & Bucket objects in the cluster have v1 & v1beta2 resp.

This is happening for newly created Kustomization & Bucket objects as well.

This drift is also not being captured in terraform plan.

Can you please help us identify what could be causing this and how to fix it?

kubectl output for the object in the cluster:

ssareen@ssareen-mlt ~ % k get kustomization -n flux-cd  apps-ucp-service  -o yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"kustomize.toolkit.fluxcd.io/v1beta1","kind":"Kustomization","metadata":{"annotations":{},"name":"apps-ucp-service","namespace":"flux-cd"},"spec":{"interval":"1m0s","path":"./apps/stg/us-east-1/ucp-service","prune":true,"sourceRef":{"kind":"Bucket","name":"stg-us-east-1-ucp"},"validation":"client"}}
  creationTimestamp: "2024-08-23T10:57:21Z"
  finalizers:
  - finalizers.fluxcd.io
  generation: 1
  name: apps-ucp-service
  namespace: flux-cd
  resourceVersion: "785636986"
  uid: f8c1b8bf-b917-4e48-a41f-929f22d71a3d
spec:
  force: false
  interval: 1m0s
  path: ./apps/stg/us-east-1/ucp-service
  prune: true
  sourceRef:
    kind: Bucket
    name: stg-us-east-1-ucp
status:
  conditions:

resource config in terraform

apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: apps-ucp-service
  namespace: flux-cd
spec:
  interval: 1m0s
  sourceRef:
    kind: Bucket
    name: stg-us-east-1-ucp
  path: ./apps/stg/us-east-1/ucp-service
  prune: true
  validation: client

We can see in the last applied config that the API version was v1beta1. We are not able to figure out what is changing this API version to v1. The .spec.validation attribute also gets removed, as it is not supported in the v1 Kustomization.

Steps to reproduce

1) Have Flux 2.2.0 installed using the flux_install data source.
2) Have Kustomization & Bucket objects deployed using the kubectl tf provider with API version v1beta1.
3) Clean up Flux.
4) Bootstrap a fresh installation of Flux using version 2.2.3 and then upgrade to 2.3.0.
5) The API version of the Kustomization & Bucket objects changes automatically from v1beta1 to v1 & v1beta2 resp.

Expected behavior

Kustomization & Bucket objects API versions to not change from v1beta1 to v1 & v1beta2 resp automatically

Screenshots and recordings

No response

Terraform and provider versions

provider version 1.3.0
tf version 1.8.0

Terraform provider configurations

provider "flux" {
  kubernetes = {
    config_path = "${path.root}/../config.yaml"
  }
  git = {
    url         = "ssh://git@gitlab.com/${data.terraform_remote_state.gitlab_deploy_key.outputs.gitlab_project_path_with_namespace}.git"
    branch      = "main"
    author_name = "${var.project_name}-flux-${var.region}-${var.environment}"
    ssh = {
      username    = "git"
      private_key = data.terraform_remote_state.gitlab_deploy_key.outputs.gitlab_deploy_key_tls_private_key
    }
  }
}

flux_bootstrap_git resource

resource "flux_bootstrap_git" "bootstrap" {
  path                 = var.eks_cluster_name
  namespace            = "flux-cd"
  version              = var.flux_version
  watch_all_namespaces = false
  keep_namespace       = true
  depends_on = [kubernetes_namespace.flux_cd]
}

Flux version

v2.3.0

Additional context

No response

Would you like to implement a fix?

No

stefanprodan commented 2 months ago

> We can see in the last applied config that the API version was v1beta1. We are not able to figure out what is changing this API version to v1.

This is how Kubernetes works: once a CRD comes with a new version, that version is served by the Kubernetes API for all newly created and updated custom resources.
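
As a rough illustration of the reply above, here is a minimal Python sketch of how an API server treats a CRD with several versions: every write is converted to the single storage version before it is persisted, and every read is converted to the version the client asks for. `FakeApiServer`, the abbreviated `SCHEMAS` table, and the `DEFAULTS` table are hypothetical stand-ins, not the real Kustomization CRD. The sketch also assumes the CRD uses the `None` conversion strategy (only `apiVersion` is rewritten) and that fields unknown to the target schema are pruned.

```python
import copy

GROUP = "kustomize.toolkit.fluxcd.io"
STORAGE_VERSION = "v1"  # the one version marked storage: true

# Assumed, abbreviated list of spec fields known to each version (not the real schema).
SCHEMAS = {
    "v1beta1": {"interval", "path", "prune", "sourceRef", "validation"},
    "v1": {"force", "interval", "path", "prune", "sourceRef"},
}
# Assumed schema default, mirroring the `force: false` seen in the kubectl output.
DEFAULTS = {"v1": {"force": False}}


def convert(obj, version):
    """Rewrite apiVersion, prune fields the target schema lacks, apply defaults."""
    out = copy.deepcopy(obj)
    out["apiVersion"] = f"{GROUP}/{version}"
    spec = {k: v for k, v in out["spec"].items() if k in SCHEMAS[version]}
    for field, value in DEFAULTS.get(version, {}).items():
        spec.setdefault(field, value)
    out["spec"] = spec
    return out


class FakeApiServer:
    """Toy model: objects are only ever persisted at the storage version."""

    def __init__(self):
        self._etcd = {}

    def apply(self, obj):
        # Whatever version the client sent, the stored copy is the storage version.
        self._etcd[obj["metadata"]["name"]] = convert(obj, STORAGE_VERSION)

    def get(self, name, version=STORAGE_VERSION):
        # Simplification: default to the storage version when none is requested.
        return convert(self._etcd[name], version)


manifest = {
    "apiVersion": f"{GROUP}/v1beta1",
    "kind": "Kustomization",
    "metadata": {"name": "apps-ucp-service", "namespace": "flux-cd"},
    "spec": {
        "interval": "1m0s",
        "path": "./apps/stg/us-east-1/ucp-service",
        "prune": True,
        "sourceRef": {"kind": "Bucket", "name": "stg-us-east-1-ucp"},
        "validation": "client",
    },
}

server = FakeApiServer()
server.apply(manifest)
live = server.get("apps-ucp-service")
print(live["apiVersion"])
print(sorted(live["spec"]))
```

Under these assumptions, nothing "changes" the object after the fact: the v1beta1 manifest is accepted, but the persisted copy is a v1 object from the moment it is written, which is also why `validation` does not come back even when the object is read as v1beta1.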

satyamsareen007 commented 2 months ago

Hi Stefan,

Thanks for the response.

I am a CRD beginner so really not sure how API versioning works in CRDs, so couldn't completely understand your response.

I still have a couple of questions:

1) Why is the API version changing automatically, and who is changing it? This could break things, right? For example, if someone was using patchesStrategicMerge in their Kustomization object and the API version suddenly changed to v1, reconciliation would break, since patchesStrategicMerge is not supported in v1.

2) When we explicitly specify v1beta1, why does the API version change? We didn't request this API version anywhere. And as v1beta1 is still supported in Flux 2.3.0, why is Flux changing the API version? Is it mentioned in any upgrade guide that this will happen?

3) Why doesn't this change in the API version show up as a diff when we run a tf plan for our Kustomization and Bucket objects via the kubectl provider? The Terraform state still reports the API version as v1beta1, but the objects are actually at v1.

We are really concerned about this sudden API version change, can you please explain at a low level what is happening behind the scenes?

Would really appreciate the help, Thanks.

stefanprodan commented 2 months ago

> And as v1beta1 is still supported in Flux 2.3.0

It is in no way supported; it's there so you can upgrade to GA. The v1beta1 API was deprecated a long time ago. Kubernetes does not store more than one version of a custom resource in etcd.

satyamsareen007 commented 2 months ago

> Why is the API version changing automatically, and who is changing it? This could break things, right? For example, if someone was using patchesStrategicMerge in their Kustomization object and the API version suddenly changed to v1, reconciliation would break, since patchesStrategicMerge is not supported in v1.

How would this scenario be handled then?

stefanprodan commented 2 months ago

> How would this scenario be handled then?

Like with all CRD upgrades, before updating Flux, you need to migrate your current CRs and get rid of the deprecated fields. Before going to Flux v2.0.0, you would move the patches from .spec.patchesStrategicMerge to .spec.patches. Kustomization v1beta2 has both of them, so doing this with Flux v0.x would ensure a smooth upgrade to GA.
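
The field move described above can be sketched in Python as follows. This is a minimal sketch, not an official migration tool: `migrate_strategic_merge` is a hypothetical helper, the `podinfo` Deployment is made-up sample data, and it assumes each `.spec.patchesStrategicMerge` entry is a full object with `kind` and `metadata.name`. The patch body is emitted as JSON, relying on JSON being valid YAML.

```python
import copy
import json


def migrate_strategic_merge(spec):
    """Return a copy of a Kustomization spec with patchesStrategicMerge
    entries moved into the .spec.patches list."""
    out = copy.deepcopy(spec)
    legacy = out.pop("patchesStrategicMerge", None) or []
    if not legacy:
        return out
    patches = out.setdefault("patches", [])
    for doc in legacy:
        patches.append(
            {
                # Inline patch document as a string.
                "patch": json.dumps(doc, indent=2),
                # Target derived from the patch itself (an assumption of this sketch).
                "target": {"kind": doc["kind"], "name": doc["metadata"]["name"]},
            }
        )
    return out


# Hypothetical sample spec using the deprecated field.
old_spec = {
    "interval": "1m0s",
    "patchesStrategicMerge": [
        {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "metadata": {"name": "podinfo"},
            "spec": {"replicas": 2},
        }
    ],
}

new_spec = migrate_strategic_merge(old_spec)
print(json.dumps(new_spec, indent=2))
```

The helper leaves the input untouched, so the old and new specs can be diffed and reviewed before anything is applied.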

satyamsareen007 commented 2 months ago

Thanks, Stefan.

Just a couple of final questions to make my understanding of Flux and its CRD upgrades clearer:

How do we anticipate whether the API version of custom objects will automatically change after a Flux upgrade? Is there any existing documentation that tells us old API versions will be changed automatically? Until now I was under the impression that we change the API version of our custom objects explicitly and then apply them to reflect the changes. Is this documented somewhere? It might come as a surprise to a lot of us.

Should the deprecated fields be changed before or after the Flux upgrade?

stefanprodan commented 2 months ago

We document all changes in the release notes: https://github.com/fluxcd/flux2/releases

We also document the upgrade procedure, example here https://fluxcd.io/blog/2024/05/flux-v2.3.0/#installing-or-upgrading-flux

From the example:

> Before upgrading, ensure that the HelmRelease v2beta2 YAML manifests are not using deprecated fields.
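
The same kind of pre-upgrade check can be applied to the Kustomization manifests discussed in this thread. The sketch below is a hypothetical lint helper, not part of Flux: `lint_kustomization` and its tables are illustrative names, and the field list contains only the two fields this thread reports as unsupported in v1. The release notes remain the authoritative list.

```python
FLUX_KUSTOMIZE_GROUP = "kustomize.toolkit.fluxcd.io"

# Versions this thread shows as deprecated in the Kustomization CRD.
DEPRECATED_VERSIONS = {"v1beta1", "v1beta2"}

# Fields this thread reports as unsupported in v1 (assumed incomplete).
REMOVED_SPEC_FIELDS = ("validation", "patchesStrategicMerge")


def lint_kustomization(manifest):
    """Return human-readable findings for one parsed manifest (a dict)."""
    findings = []
    group, _, version = manifest.get("apiVersion", "").partition("/")
    if manifest.get("kind") != "Kustomization" or group != FLUX_KUSTOMIZE_GROUP:
        return findings  # not a Flux Kustomization, nothing to check
    if version in DEPRECATED_VERSIONS:
        findings.append(f"apiVersion {group}/{version} is deprecated, use {group}/v1")
    for field in REMOVED_SPEC_FIELDS:
        if field in manifest.get("spec", {}):
            findings.append(f".spec.{field} is not supported in v1")
    return findings


# The manifest from the issue description, as a dict.
manifest = {
    "apiVersion": "kustomize.toolkit.fluxcd.io/v1beta1",
    "kind": "Kustomization",
    "metadata": {"name": "apps-ucp-service", "namespace": "flux-cd"},
    "spec": {
        "interval": "1m0s",
        "sourceRef": {"kind": "Bucket", "name": "stg-us-east-1-ucp"},
        "path": "./apps/stg/us-east-1/ucp-service",
        "prune": True,
        "validation": "client",
    },
}

findings = lint_kustomization(manifest)
for finding in findings:
    print(finding)
```

Running a check like this against the Terraform-managed manifests before bumping `var.flux_version` would surface both the deprecated `apiVersion` and the `validation` field up front.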

satyamsareen007 commented 2 months ago

Thanks, Stefan, for providing the links.

Apologies for asking similar questions.

In the links above, I couldn't find any info about the API versions of custom resources being changed automatically when they are applied after the upgrade.

Can you please explain your reply in more detail, as I am new to CRDs? What does "a new version" mean in a CRD, given that there could be a lot of alpha, beta & GA versions present in a CRD?

> This is how Kubernetes works: once a CRD comes with a new version, that version is served by the Kubernetes API for all newly created and updated custom resources.

Please correct me if I am wrong. I can see 3 versions in the Kustomization CRD:

v1 (served: true, storage: true)
v1beta2 (served: true, storage: false, deprecated: true)
v1beta1 (served: true, storage: false, deprecated: true)

Before the upgrade, my Kustomization resources were at v1beta1.

When I recreated them after the upgrade, they went to v1.

So does this mean the CRD will choose the API version that has served: true and storage: true set and is not deprecated, over the API version that we have in our desired configuration?
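
One way to reason about the question above is to separate two things: which version is persisted (the single version with `storage: true`) and which version a client sees when it does not ask for a specific one (the first served version in API discovery, ordered by Kubernetes version priority: GA before beta before alpha, then higher numbers first). The `deprecated` flag only causes a warning to be returned and plays no part in either choice. The Python sketch below models this with the three versions listed above; the function names are hypothetical, and treating the first discovery entry as the version `kubectl get` displays is an assumption of the sketch.

```python
import re

_VERSION_RE = re.compile(r"^v([1-9][0-9]*)(?:(alpha|beta)([1-9][0-9]*))?$")
_STABILITY = {None: 2, "beta": 1, "alpha": 0}


def priority_key(name):
    """Sort key for Kubernetes-style version names, or None if not in that style."""
    match = _VERSION_RE.match(name)
    if match is None:
        return None
    major, stage, minor = match.groups()
    return (_STABILITY[stage], int(major), int(minor or 0))


def sort_by_priority(names):
    """Highest-priority version first; non-conforming names go last, alphabetically."""
    recognized = [n for n in names if priority_key(n) is not None]
    other = sorted(n for n in names if priority_key(n) is None)
    return sorted(recognized, key=priority_key, reverse=True) + other


def storage_version(versions):
    """The single version objects are persisted at."""
    names = [v["name"] for v in versions if v["storage"]]
    if len(names) != 1:
        raise ValueError("a CRD must have exactly one storage version")
    return names[0]


def preferred_version(versions):
    """The served version listed first in discovery under the priority ordering."""
    return sort_by_priority([v["name"] for v in versions if v["served"]])[0]


# The versions listed in the comment above.
crd_versions = [
    {"name": "v1beta1", "served": True, "storage": False, "deprecated": True},
    {"name": "v1beta2", "served": True, "storage": False, "deprecated": True},
    {"name": "v1", "served": True, "storage": True},
]

print(storage_version(crd_versions))
print(preferred_version(crd_versions))
```

In this model the desired configuration's `apiVersion` only selects which endpoint the manifest is submitted to; it does not control the version the object is stored at or displayed in.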

swade1987 commented 2 months ago

Hi @satyamsareen007, how CRDs work is outside the scope of the Flux documentation since CRDs are a core Kubernetes concept. There are plenty of great resources available on Kubernetes CRDs, and I’d recommend starting there for a deeper understanding. I’ll go ahead and close this issue as it doesn’t directly pertain to Flux itself. Thanks for raising it, and feel free to reach out if you have any Flux-specific questions.

satyamsareen007 commented 2 months ago

Hi @swade1987

I couldn't find any info/docs explaining this part of @stefanprodan's reply:

> This is how Kubernetes works: once a CRD comes with a new version, that version is served by the Kubernetes API for all newly created and updated custom resources.

Can you please elaborate more on this? This might help us understand why our API versions are changing.