zalando / postgres-operator

Postgres operator creates and manages PostgreSQL clusters running in Kubernetes
https://postgres-operator.readthedocs.io/
MIT License

infrastructure_role_secrets field doesn't work as expected. #2644

Closed hemakshis closed 1 month ago

hemakshis commented 1 month ago


I am trying to create two infrastructure roles with different access levels: a Developer role (batman, read-only) and an On-call role (ironman, superuser). Everyone should be able to read the Developer k8s secret, fetch the password, and use it. Only the person who is on call should be able to read the On-call k8s secret (we can manage this with k8s RBAC).

The problem is that infrastructure_roles_secret_name accepts only one secret, so the other one has to go into infrastructure_roles_secrets. That second role is never created when I use the following configuration.

Please note that we are using the CRD approach.

# read only secret
apiVersion: v1
data:
  # batman
  user1: YmF0bWFu
  # justice
  password1: anVzdGljZQ== 
  # pg_read_all_data
  inrole1: cGdfcmVhZF9hbGxfZGF0YQ==
kind: Secret
metadata:
  name: postgresql-infrastructure-roles
  namespace: postgres-operator
type: Opaque
---
apiVersion: v1
data:
  # ironman: marvel
  ironman: bWFydmVs
kind: Secret
metadata:
  name: postgresql-infrastructure-roles-oncall
  namespace: postgres-operator
type: Opaque
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgresql-infrastructure-roles-oncall
  namespace: postgres-operator
data:
  ironman: |
    inrole: [pg_write_all_data]
    user_flags:
      - login
      - superuser
---
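
The values in the two Secrets above are base64-encoded, and the comments state what they are meant to contain. A minimal sketch for checking those values before suspecting the operator is shown here; `decodeSecretValue` is a hypothetical helper, not part of the operator.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeSecretValue decodes one base64 value from a Secret's data map.
// It returns an empty string if the input is not valid base64.
// Hypothetical helper for illustration only.
func decodeSecretValue(encoded string) string {
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return ""
	}
	return string(raw)
}

func main() {
	// Key/value pairs copied from the two Secrets above.
	entries := [][2]string{
		{"user1", "YmF0bWFu"},
		{"password1", "anVzdGljZQ=="},
		{"inrole1", "cGdfcmVhZF9hbGxfZGF0YQ=="},
		{"ironman", "bWFydmVs"},
	}
	for _, e := range entries {
		fmt.Printf("%s: %s\n", e[0], decodeSecretValue(e[1]))
	}
	// Prints:
	// user1: batman
	// password1: justice
	// inrole1: pg_read_all_data
	// ironman: marvel
}
```

The decoded values match the comments in the manifests, so the Secret contents themselves are not the cause of the problem.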

My postgres-operator Helm release, with the values inline:

apiVersion: v1
kind: Namespace
metadata:
  name: postgres-operator
---
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: HelmRepository
metadata:
  name: postgres-operator-charts
  namespace: postgres-operator
spec:
  interval: 30m
  url: https://opensource.zalando.com/postgres-operator/charts/postgres-operator
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: postgres-operator
  namespace: postgres-operator
spec:
  releaseName: postgres-operator
  chart:
    spec:
      chart: postgres-operator
      version: "1.11.0"
      sourceRef:
        kind: HelmRepository
        name: postgres-operator-charts
        namespace: postgres-operator
  interval: 10m
  values:
    installCRDs: true
    extraArgs: []
    image:
      registry: registry.opensource.zalan.do
      repository: acid/postgres-operator
      tag: v1.11.0
      pullPolicy: "IfNotPresent"
    imagePullSecrets:
    - name: docker-registry-credentials
    configTarget: "OperatorConfigurationCRD"
    ...
    ...
    configKubernetes:
      ...
      ...
      infrastructure_roles_secret_name: postgresql-infrastructure-roles-oncall
      infrastructure_roles_secrets:
        - secretname: "postgresql-infrastructure-roles"
          userkey: "user1"
          passwordkey: "password1"
          rolekey: "inrole1"
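
For reference, the intent of the `infrastructure_roles_secrets` entry above is that `userkey`, `passwordkey`, and `rolekey` name the keys inside the Secret that hold the role name, its password, and the role it should be a member of. The lookup can be sketched as follows; `roleDefinition`, `resolvedRole`, and `resolveRole` are hypothetical names and this is not the operator's code. The Secret data is shown already decoded.

```go
package main

import "fmt"

// roleDefinition mirrors one entry of infrastructure_roles_secrets.
// Hypothetical type for illustration only.
type roleDefinition struct {
	SecretName  string
	UserKey     string
	PasswordKey string
	RoleKey     string
}

// resolvedRole is the role described by a Secret plus a roleDefinition.
type resolvedRole struct {
	Name     string
	Password string
	MemberOf string
}

// resolveRole looks up the keys named in def inside the decoded Secret data.
// It reports false when the user or password key is missing.
func resolveRole(data map[string]string, def roleDefinition) (resolvedRole, bool) {
	name, okUser := data[def.UserKey]
	password, okPass := data[def.PasswordKey]
	if !okUser || !okPass {
		return resolvedRole{}, false
	}
	// The role key is optional in this sketch; a missing key yields "".
	return resolvedRole{Name: name, Password: password, MemberOf: data[def.RoleKey]}, true
}

func main() {
	// Decoded contents of the postgresql-infrastructure-roles Secret.
	data := map[string]string{
		"user1":     "batman",
		"password1": "justice",
		"inrole1":   "pg_read_all_data",
	}
	def := roleDefinition{
		SecretName:  "postgresql-infrastructure-roles",
		UserKey:     "user1",
		PasswordKey: "password1",
		RoleKey:     "inrole1",
	}
	role, ok := resolveRole(data, def)
	fmt.Println(role.Name, role.MemberOf, ok)

	// An empty definition, like the one that shows up in the logs, resolves nothing.
	_, ok = resolveRole(data, roleDefinition{})
	fmt.Println(ok)
}
```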

My on-call user, the one defined through infrastructure_roles_secret_name, gets created, but the one referenced in the infrastructure_roles_secrets array does not.

The postgres-operator logs show the following:

...
...
time="2024-05-30T10:57:24Z" level=info msg="   \"InfrastructureRolesSecretName\": \"postgres-operator/postgresql-infrastructure-roles-oncall\"," pkg=controller
time="2024-05-30T10:57:24Z" level=info msg="   \"InfrastructureRoles\": [" pkg=controller
time="2024-05-30T10:57:24Z" level=info msg="      {" pkg=controller
time="2024-05-30T10:57:24Z" level=info msg="         \"SecretName\": \"/\"," pkg=controller
time="2024-05-30T10:57:24Z" level=info msg="         \"UserKey\": \"\"," pkg=controller
time="2024-05-30T10:57:24Z" level=info msg="         \"PasswordKey\": \"\"," pkg=controller
time="2024-05-30T10:57:24Z" level=info msg="         \"RoleKey\": \"\"," pkg=controller
time="2024-05-30T10:57:24Z" level=info msg="         \"DefaultUserValue\": \"\"," pkg=controller
time="2024-05-30T10:57:24Z" level=info msg="         \"DefaultRoleValue\": \"\"," pkg=controller
time="2024-05-30T10:57:24Z" level=info msg="         \"Details\": \"\"," pkg=controller
time="2024-05-30T10:57:24Z" level=info msg="         \"Template\": false" pkg=controller
time="2024-05-30T10:57:24Z" level=info msg="      }" pkg=controller
time="2024-05-30T10:57:24Z" level=info msg="   ]," pkg=controller
...
...
...
time="2024-05-30T10:57:24Z" level=debug msg="cannot get infrastructure role: {SecretName:/ UserKey: PasswordKey: RoleKey: DefaultUserValue: DefaultRoleValue: Details: Template:false}" pkg=controller
time="2024-05-30T10:57:24Z" level=debug msg="found role description for role \"ironman\": &{Origin:unknown Name: Namespace: Password: Flags:[login superuser] MemberOf:[pg_write_all_data] Parameters:map[] AdminRole: IsDbOwner:false Deleted:false Rotated:false}" pkg=controller

After reading the operator code to understand how infrastructure roles are loaded, and after some debugging, I found the cause here:

https://github.com/zalando/postgres-operator/blob/1210ceca72fb017ea72eccd245f5190894ff9ecf/pkg/util/config/config.go#L70-L95

The CRD keys (secretname, userkey, passwordkey, rolekey) are all lowercase rather than camelCase, and the struct above has no json tags. As a result, my values were never populated from the OperatorConfigurationCRD into the struct, which is why the logs always show SecretName: /.
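
To illustrate what json tags change, here is a minimal sketch using simplified stand-ins for the struct. The field names are taken from the log output above, and the tag values are the lowercase CRD keys. `untaggedRole`, `taggedRole`, and `hasKey` are hypothetical, and all fields are plain strings here, which may differ from the real definition in config.go. Note that the standard library's `json.Unmarshal` also accepts case-insensitive key matches, so this sketch only shows the serialized key names; that the operator's decoding path requires an exact match is an assumption based on the behaviour reported in this issue.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// untaggedRole has no json tags, so its JSON keys are the Go field names.
type untaggedRole struct {
	SecretName  string
	UserKey     string
	PasswordKey string
	RoleKey     string
}

// taggedRole maps each field to the lowercase key used in the CRD.
type taggedRole struct {
	SecretName  string `json:"secretname"`
	UserKey     string `json:"userkey"`
	PasswordKey string `json:"passwordkey"`
	RoleKey     string `json:"rolekey"`
}

// hasKey reports whether marshalling v produces a JSON object
// containing exactly the given key (case-sensitive).
func hasKey(v interface{}, key string) bool {
	raw, err := json.Marshal(v)
	if err != nil {
		return false
	}
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(raw, &fields); err != nil {
		return false
	}
	_, ok := fields[key]
	return ok
}

func main() {
	fmt.Println(hasKey(untaggedRole{}, "secretname")) // false: the key is "SecretName"
	fmt.Println(hasKey(taggedRole{}, "secretname"))   // true: the tag sets the key
}
```

This matches the logs: the dumped configuration shows "SecretName" before the change and "secretname" after it.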

I can raise a PR if adding json tags to the struct sounds like a good fix. I tested the change locally on a kind cluster and it seems to work. After my changes, the operator logs finally show:

time="2024-05-30T14:19:09Z" level=info msg="   \"InfrastructureRolesSecretName\": \"postgres-operator/postgresql-infrastructure-roles-oncall\"," pkg=controller
time="2024-05-30T14:19:09Z" level=info msg="   \"InfrastructureRoles\": [" pkg=controller
time="2024-05-30T14:19:09Z" level=info msg="      {" pkg=controller
time="2024-05-30T14:19:09Z" level=info msg="         \"secretname\": \"postgres-operator/postgresql-infrastructure-roles\"," pkg=controller
time="2024-05-30T14:19:09Z" level=info msg="         \"userkey\": \"user1\"," pkg=controller
time="2024-05-30T14:19:09Z" level=info msg="         \"passwordkey\": \"password1\"," pkg=controller
time="2024-05-30T14:19:09Z" level=info msg="         \"rolekey\": \"inrole1\"," pkg=controller
time="2024-05-30T14:19:09Z" level=info msg="         \"DefaultUserValue\": \"\"," pkg=controller
time="2024-05-30T14:19:09Z" level=info msg="         \"DefaultRoleValue\": \"\"," pkg=controller
time="2024-05-30T14:19:09Z" level=info msg="         \"Details\": \"\"," pkg=controller
time="2024-05-30T14:19:09Z" level=info msg="         \"Template\": false" pkg=controller
time="2024-05-30T14:19:09Z" level=info msg="      }" pkg=controller
time="2024-05-30T14:19:09Z" level=info msg="   ]," pkg=controller

My roles and users were created in the PostgreSQL pod as well.