nutanix-cloud-native / cluster-api-runtime-extensions-nutanix

https://nutanix-cloud-native.github.io/cluster-api-runtime-extensions-nutanix/
Apache License 2.0
7 stars 4 forks source link

fix: CRS generated CA Deployment has extra quotes #867

Closed dkoshkin closed 3 weeks ago

dkoshkin commented 3 weeks ago

What problem does this PR solve?: Fixes an issue with CRS genertated CA Deployment not working because of extra quotes.

I0815 20:12:58.376871       1 reflector.go:332] Listing and watching cluster.x-k8s.io/v1beta1, Resource=machines from k8s.io/client-go/dynamic/dynamicinformer/informer.go:108
W0815 20:12:58.379140       1 reflector.go:547] k8s.io/client-go/dynamic/dynamicinformer/informer.go:108: failed to list cluster.x-k8s.io/v1beta1, Resource=machines: machines.cluster.x-k8s.io is forbidden: User "system:serviceaccount:default:cluster-autoscaler-0191579a-2104-7ace-a5a2-ceae4590d7fe" cannot list resource "machines" in API group "cluster.x-k8s.io" in the namespace "'default'"
E0815 20:12:58.379170       1 reflector.go:150] k8s.io/client-go/dynamic/dynamicinformer/informer.go:108: Failed to watch cluster.x-k8s.io/v1beta1, Resource=machines: failed to list cluster.x-k8s.io/v1beta1, Resource=machines: machines.cluster.x-k8s.io is forbidden: User "system:serviceaccount:default:cluster-autoscaler-0191579a-2104-7ace-a5a2-ceae4590d7fe" cannot list resource "machines" in API group "cluster.x-k8s.io" in the namespace "'default'"

The extra quotes are only an issue in --node-group-auto-discovery=, but the same script replaces all occurrences. It should be safe to drop the single quotes everywhere though because the name and namespace will be strings and won't be interpreted as numbers by yaml.

Which issue(s) this PR fixes: Fixes #

How Has This Been Tested?:

Special notes for your reviewer:

~I think we can improve the e2e tests by checking that all Deployments are Ready on the self-managed clusters, but I did not do that as part of this PR.~ The tests already do that https://github.com/nutanix-cloud-native/cluster-api-runtime-extensions-nutanix/blob/f632224257cb159b04abc2c6c6eb6874c503bb1c/test/e2e/clusterautoscaler_helpers.go#L115

Instead we can also wait for the status ConfigMap to be Running This is what the data will contain for a working Deployment:

data:
  status: |+
    time: 2024-05-22 19:33:34.074058252 +0000 UTC
    autoscalerStatus: Running

And for non working:

data:
  status: |+
    time: 2024-08-15 20:07:37.204175116 +0000 UTC
    autoscalerStatus: Initializing
jimmidyson commented 3 weeks ago

@dkoshkin So with this error the deployment shows as ready but it's not working properly so we should check the status configmap too?

dkoshkin commented 3 weeks ago

@dkoshkin So with this error the deployment shows as ready but it's not working properly so we should check the status configmap too?

Yep! Attempt on the check here https://github.com/nutanix-cloud-native/cluster-api-runtime-extensions-nutanix/pull/867/commits/f203f8acf2cef15edefe5fcb098e4f81b3b954ea.

dkoshkin commented 3 weeks ago

The new ConfigMap test failed as expected without this fix https://github.com/nutanix-cloud-native/cluster-api-runtime-extensions-nutanix/pull/868#issuecomment-2292675203