cloudnative-pg / charts

CloudNativePG Helm Charts
Apache License 2.0
135 stars 61 forks source link

gkeEnvironment conflict? #311

Open rocket357 opened 3 weeks ago

rocket357 commented 3 weeks ago

https://github.com/cloudnative-pg/charts/blob/b378ad06aee0e9902eaaf8413784e5e3bb724861/charts/cluster/templates/_barman_object_store.tpl#L64-L67

I'm setting up a test environment in gke, and I've set the following in values:

backups:
  enabled: true
  provider: google
  google:
    gkeEnvironment: true
    bucket: my-test-bucket
    path: my-test-path

Via the linked template above, I see the following:

        googleCredentials:
        gkeEnvironment: false
        applicationCredentials:
          name: cnpg-database-cluster-backup-google-creds
          key: APPLICATION_CREDENTIALS

With this configuration, I'm seeing the following in helm output:

COMBINED OUTPUT:
  Error: UPGRADE FAILED: release cloudnative-pg-database-cluster failed, and has been rolled back due to atomic being set: cannot patch "cloudnative-pg-database-cluster" with kind Cluster: admission webhook "vcluster.cnpg.io" denied the request: Cluster.cluster.cnpg.io "cloudnative-pg-database-cluster" is invalid: spec.backupConfiguration.googleCredentials: Invalid value: v1.GoogleCredentials{ApplicationCredentials:(*v1.SecretKeySelector)(0xc001e0ada0), GKEEnvironment:true}: if gkeEnvironment is true, secret with credentials must not be provided

So it looks like the docs are not entirely accurate (specifically: "GoogleCredentials is the type for the Google Cloud Storage credentials. This needs to be specified even if we run inside a GKE environment.")

The operator I'm running is cloudnative-pg latest (0.21.4). Is there additional configuration I've overlooked?

jsilvela commented 3 weeks ago

Hi @rocket357 , I'm reviewing the documentation and the webhook code, and I think the documentation is not wrong (though it could stand extra detail). The problem reported by the webhook is bona fide:

if gkeEnvironment is true, secret with credentials must not be provided

Now, looking at your case, you've supplied gkeEnvironment: false plus credentials, so the warning message would seem out of place. But, in the tpl file I see gkeEnvironment: {{ .scope.google.gkeEnvironment }}.

And the value parsed shows the GKE is true.

v1.GoogleCredentials{ApplicationCredentials:(*v1.SecretKeySelector)(0xc001e0ada0), GKEEnvironment:true}

So, indeed, the webhook rightly nixed this. And the documentation,

This [section] needs to be specified even if we run inside a GKE environment.

is also right. This seems to be an issue with the helm chart, or with the helm chart documentation.

jsilvela commented 3 weeks ago

Other topic, but 1.21 operator is old, reaching End Of Life. Releases 1.22, or especially, 1.23, would be ideal.

rocket357 commented 3 weeks ago

Hello! Thanks for your response. I copy/pasted a few things while I was working on a few tests and unfortunately provided some incorrect information. First, the chart* version is 0.21.4, which translates to operator version 1.23.1.

% helm ls -n cnpg-system  
NAME    NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                   APP VERSION
cnpg    cnpg-system     2               2024-06-07 21:32:41.2069064 +0000 UTC   deployed        cloudnative-pg-0.21.4   1.23.1     

Apologies for the confusion.

Also, the testing I performed was with gkeEnvironment: true (i.e. the webhook error message was indeed while gkeEnvironment was set to true, but I copy/pasted later work, unfortunately). I'd changed it to false while trying to test workarounds for the admission webhook failure. Again, my apologies for the confusion.

So with gkeEnvironment: true, would a PR be necessary to remove the secret from being created? I'm more than happy to generate a PR if that is desired.

jsilvela commented 3 weeks ago

Ah, I understand better.

So with gkeEnvironment: true, would a PR be necessary to remove the secret from being created? I'm more than happy to generate a PR if that is desired.

Seems so. Either the secret should not be created or in any case the applicationCredentials field should be left blank, as that is what triggers the validation webhook error.

@itay-grudev

rocket357 commented 3 weeks ago

I updated the mentioned PR with a fix for this as well (the other portion is teaching the chart about walStorage so a separate wal disk can be created via the chart). As mentioned in the PR, if you'd prefer I break those out into two separate PRs, I'm more than happy to do so, however they're fairly trivial so I rolled them in to a single PR to reduce your workload reviewing and testing.