zalando / postgres-operator

Postgres operator creates and manages PostgreSQL clusters running in Kubernetes
https://postgres-operator.readthedocs.io/
MIT License

Display the resources created in ArgoCD associated to a CRD instance #1766

Open marcellodesales opened 2 years ago

marcellodesales commented 2 years ago


🐛 Current Implementation

[Image: Screen Shot 2022-01-31 at 4 26 22 PM]

❓ How to display the created resources?

$ kubectl describe postgresqls -n x-aws-sae1-prdt-ppd-dev
Name:         x-postgres-server-aws-sae1-prdt-ppd-dev
Namespace:    x-aws-sae1-prdt-ppd-dev
Labels:       app.kubernetes.io/instance=x-postgres-server-aws-sae1-ppd-dev
              cloud=aws
              env=dev
              product=x
              region=sae1
              segment=ppd
              type=prdt
Annotations:  <none>
API Version:  acid.zalan.do/v1
Kind:         postgresql
Metadata:
  Creation Timestamp:  2022-02-01T00:57:09Z
  Generation:          1
  Managed Fields:
    API Version:  acid.zalan.do/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
    Manager:      argocd-application-controller
    Operation:    Update
    Time:         2022-02-01T00:57:09Z
    API Version:  acid.zalan.do/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:PostgresClusterStatus:
    Manager:         postgres-operator
    Operation:       Update
    Time:            2022-02-01T00:57:09Z
  Resource Version:  1342972
  UID:               72927efa-5dec-4696-8f84-87452c4da221
Spec:
  Databases:
    x:                  y
  Enable Master Load Balancer:  true
  Node Affinity:
    Required During Scheduling Ignored During Execution:
      Node Selector Terms:
        Match Expressions:
          Key:       x.z-worker_group-name
          Operator:  In
          Values:
            system
  Number Of Instances:  2
  Postgresql:
    Version:  14
  Prepared Databases:
    x:
  Resources:
    Limits:
      Cpu:     750m
      Memory:  1Gi
    Requests:
      Cpu:     750m
      Memory:  1Gi
  Team Id:     xy
  Users:
    xz:
      superuser
      createdb
  Volume:
    Size:  10Gi
Status:
  Postgres Cluster Status:  Creating
Events:
  Type    Reason       Age   From               Message
  ----    ------       ----  ----               -------
  Normal  Create       36s   postgres-operator  Started creation of new cluster resources
  Normal  Endpoints    36s   postgres-operator  Endpoint "x-aws-sae1-prdt-ppd-dev/x-postgres-server-aws-sae1-prdt-ppd-dev" has been successfully created
  Normal  Services     36s   postgres-operator  The service "x-aws-sae1-prdt-ppd-dev/x-postgres-server-aws-sae1-prdt-ppd-dev" for role master has been successfully created
  Normal  Services     36s   postgres-operator  The service "x-aws-sae1-prdt-ppd-dev/x-postgres-server-aws-sae1-prdt-ppd-dev-repl" for role replica has been successfully created
  Normal  Secrets      35s   postgres-operator  The secrets have been successfully created
  Normal  StatefulSet  35s   postgres-operator  Statefulset "x-aws-sae1-prdt-ppd-dev/x-postgres-server-aws-sae1-prdt-ppd-dev" has been successfully created
Volumes:
  pgdata:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pgdata-x-postgres-server-aws-sae1-prdt-ppd-dev-0
    ReadOnly:   false

🎉 What should we see?

Events:
  Type    Reason       Age   From               Message
  ----    ------       ----  ----               -------
  Normal  Create       36s   postgres-operator  Started creation of new cluster resources
  Normal  Endpoints    36s   postgres-operator  Endpoint "x-aws-sae1-prdt-ppd-dev/x-postgres-server-aws-sae1-prdt-ppd-dev" has been successfully created
  Normal  Services     36s   postgres-operator  The service "x-aws-sae1-prdt-ppd-dev/x-postgres-server-aws-sae1-prdt-ppd-dev" for role master has been successfully created
  Normal  Services     36s   postgres-operator  The service "x-aws-sae1-prdt-ppd-dev/x-postgres-server-aws-sae1-prdt-ppd-dev-repl" for role replica has been successfully created
  Normal  Secrets      35s   postgres-operator  The secrets have been successfully created
  Normal  StatefulSet  35s   postgres-operator  Statefulset "x-aws-sae1-prdt-ppd-dev/x-postgres-server-aws-sae1-prdt-ppd-dev" has been successfully created

[Image]

🔧 Configuration Used

SEE: https://argo-cd.readthedocs.io/en/stable/operator-manual/declarative-setup/#applications

FxKu commented 2 years ago

It's not clear to me yet what ArgoCD needs to display child resources? Is it really something the operator can solve (in the CRD yaml itself, in the status subresource)?

marcellodesales commented 2 years ago

@FxKu Digging further through GitHub issues, it looks like this is discussed in an open ticket: https://github.com/argoproj/argo-cd/issues/5082 ... It seems we need labels added to the resources to track the relationships...

https://argo-cd.readthedocs.io/en/stable/user-guide/resource_tracking/

It would be awesome to be able to see the resources created by the operator visually, since we could then use additional GitOps tools to manage the state of the deployments...

FxKu commented 2 years ago

If it's only about labels you can configure key-value pairs that should be assigned to each resource or that should be inherited from the cluster manifest.

Wikiwix commented 2 years ago

ArgoCD uses metadata.ownerReferences for tracking "sub-resources".

These are unfortunately not created by the operator as discussed in https://github.com/zalando/postgres-operator/issues/498 .

> It's not clear to me yet what ArgoCD needs to display child resources? Is it really something the operator can solve (in the CRD yaml itself, in the status subresource)?

ArgoCD should ideally be aware of all resources running in the cluster if a full GitOps approach is used. That includes "child resources" of the managed resources, because those are part of the application. One case where this is especially useful is seeing the StatefulSet of the PostgreSQL instance with its Pods, logs and PV(C)s. Another is alerting on orphaned resources (resources that are not, even transitively, represented in Git).

ownerReferences exist for exactly this kind of parent-child relationship.
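To illustrate what such a reference would look like, here is a minimal sketch of the metadata a StatefulSet would carry if it pointed back to its postgresql resource. The name, namespace and UID are taken from the `kubectl describe` output earlier in this thread. The `controller` and `blockOwnerDeletion` values are assumptions, and, as noted above, the operator does not set this field itself.

```yaml
# Illustration only: not something the operator creates at this point in the thread.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: x-postgres-server-aws-sae1-prdt-ppd-dev
  namespace: x-aws-sae1-prdt-ppd-dev
  ownerReferences:
    - apiVersion: acid.zalan.do/v1
      kind: postgresql
      name: x-postgres-server-aws-sae1-prdt-ppd-dev
      # Must match the UID of the owning postgresql resource.
      uid: 72927efa-5dec-4696-8f84-87452c4da221
      # Assumed values; they control garbage-collection behaviour.
      controller: true
      blockOwnerDeletion: true
```

An owner and its dependents have to live in the same namespace, which is the case here.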

> If it's only about labels you can configure key-value pairs that should be assigned to each resource or that should be inherited from the cluster manifest.

Adding labels currently works, but it has some downsides, which is why we stopped using it in our setup. It will eventually go away with a new tracking method anyway.

For reference, the settings needed to at least see the resources in ArgoCD are (Helm values):

inherited_annotations:
  - argocd.argoproj.io/compare-options
  - argocd.argoproj.io/sync-options
inherited_labels:
  - argocd.argoproj.io/instance # This label depends on the ArgoCD installation!

The annotations then have to be added to all postgresql resources, because ArgoCD will otherwise delete the created resources. That in turn makes it impossible to remove a postgresql resource via ArgoCD…
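As a rough sketch of what that looks like on a cluster manifest: the annotation keys are the two listed in the Helm values above, but the values `IgnoreExtraneous` and `Prune=false` are assumptions, since the comment does not state them. The cluster name `acid-example` and the spec values are hypothetical placeholders.

```yaml
apiVersion: acid.zalan.do/v1
kind: postgresql
metadata:
  name: acid-example  # hypothetical name
  annotations:
    # Assumed value: keep extra resources from marking the app OutOfSync.
    argocd.argoproj.io/compare-options: IgnoreExtraneous
    # Assumed value: stop ArgoCD from pruning resources it does not find in Git.
    argocd.argoproj.io/sync-options: Prune=false
spec:
  teamId: acid
  numberOfInstances: 2
  volume:
    size: 10Gi
  postgresql:
    version: "14"
```

Because the annotations sit on the postgresql resource itself, a `Prune=false` option would also apply to it, which matches the removal problem described above.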

sagikazarmark commented 2 years ago

@Wikiwix What kind of issues did you run into with label tracking?

Wikiwix commented 2 years ago

@sagikazarmark

sagikazarmark commented 2 years ago

Thanks @Wikiwix ! I guess the real solution would be adding owner references.

Any update on #498 and whether it would be accepted as a change?

I'd be happy to help with the implementation.

bergner commented 1 year ago

Having just tried to get ArgoCD and the Zalando Postgres Operator to work smoothly together, I've concluded that I can't reach a perfect state, but I can get fairly close. The key thing is that I want the argocd.argoproj.io/compare-options: IgnoreExtraneous annotation on all entities EXCEPT the root Postgresql manifest, to ensure ArgoCD does not touch those resources. This is currently impossible to achieve with the Zalando operator without also setting the same annotation on the Postgresql manifest itself, and doing so causes things to break.

You can get to a somewhat decent state by configuring Postgres operator with:

Configure ArgoCD with:

Then set a suitable argocd.argoproj.io/tracking-id annotation in the ArgoCD Application manifest you create, e.g. test-db:acid.zalan.do/postgresql/postgresql:test-ns/test-db, and use the same tracking-id in your Postgresql manifest.
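A minimal sketch of the Postgresql side of this, assuming a cluster named `test-db` in namespace `test-ns` as in the example value. The tracking-id string is copied verbatim from the comment; the spec values are hypothetical placeholders, and the operator and ArgoCD configuration referred to above is not reproduced here.

```yaml
apiVersion: acid.zalan.do/v1
kind: postgresql
metadata:
  name: test-db
  namespace: test-ns
  annotations:
    # Same value as on the ArgoCD Application manifest, per the comment above.
    argocd.argoproj.io/tracking-id: "test-db:acid.zalan.do/postgresql/postgresql:test-ns/test-db"
spec:
  teamId: test  # hypothetical; chosen so the cluster name starts with the team id
  numberOfInstances: 1
  volume:
    size: 1Gi
  postgresql:
    version: "14"
```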

This causes all resources to appear in the ArgoCD UI and status is "Healthy", "Synced", "Sync OK". I'm also able to delete the ArgoCD Application and everything seems to clean up properly (possible exception here for a ControllerRevision resource that seems to have lingered around).

These are the potential caveats I see with such a setup:

In cases like this I think it would be immensely helpful to have something like a StatefulSet's .spec.template.metadata.annotations on the Postgresql manifest which then applies to child resources. Right now the Postgres Operator has four options related to annotations and propagation thereof and even when setting all of them the above 5 object types still cannot get any annotations without also setting those annotations on the Postgresql manifest itself.

c0deaddict commented 2 months ago

I've found another workaround, using Kyverno (https://kyverno.io/). Kyverno can patch resources in the cluster, either when they are created or via a background controller that runs at a certain interval (the default is 1 hour, I believe). I've implemented a background policy that updates the ownerReferences of StatefulSets created by the postgres-operator. Once the policy has run, ArgoCD correctly displays the link to the StatefulSet, and from there to the pods 🎉

Here are the ClusterPolicy and the Kyverno Helm values that I've used: https://gist.github.com/c0deaddict/79054d2f0b145518d96dfb894a8a2c2c

The same trick can be used to link the other resources created by the postgres-operator to the postgresql resource. It should also be possible to create a policy that acts when the StatefulSet is created, but I haven't looked into that yet.