Cannot deploy on on-premise k8s #386

pckroon opened 1 year ago

pckroon commented 1 year ago

Hi all!

I'm running into issues installing the helm chart on my on-prem/local kubernetes cluster. I'll first dump all relevant manifests, and then describe the issue I'm running into.

Helm values

```yaml
persistence:
  enabled: true
  existingClaim: galaxy-data
metrics:
  enabled: false
s3csi:
  deploy: false
cvmfs:
  deploy: false
setupJob:
  createDatabase: true
  ttlSecondsAfterFinished: 3600
ingress:
  enabled: true
serviceAccount:
  create: true
postgresql:
  enabled: true
  #galaxyDatabasePassword: "potato"
  #postgresqlPassword: "potato"
  #postgresqlPostgresPassword: "potato"
  persistence:
    enabled: true
    storageClass: openebs-jiva-csi-sc
    size: 5Gi
refdata:
  enabled: false
influxdb:
  enabled: false
```
galaxy-data PVC definition

```yaml
kind: Namespace
apiVersion: v1
metadata:
  name: galaxy
  labels: galaxy
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: galaxy-data-nfs
  namespace: galaxy
spec:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: openebs-jiva-csi-sc
  resources:
    requests:
      storage: 50Gi
---
kind: Service
apiVersion: v1
metadata:
  name: nfs-server
  namespace: galaxy
spec:
  clusterIP:
  ports:
    - name: nfs
      port: 2049
    - name: mountd
      port: 20048
    - name: rpcbind
      port: 111
  selector:
    role: nfs-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-server
  namespace: galaxy
spec:
  replicas: 1
  selector:
    matchLabels:
      role: nfs-server
  template:
    metadata:
      labels:
        role: nfs-server
    spec:
      securityContext:
        fsGroup: 101
      containers:
        - name: nfs-server
          image:
          ports:
            - name: nfs
              containerPort: 2049
            - name: mountd
              containerPort: 20048
            - name: rpcbind
              containerPort: 111
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /exports
              name: mypvc
      volumes:
        - name: mypvc
          persistentVolumeClaim:
            claimName: galaxy-data-nfs
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: galaxy-data
  namespace: galaxy
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 50Gi
  nfs:
    #server: nfs-server.galaxy.svc.cluster.local
    server:
    path: "/"
  mountOptions:
    - nfsvers=4.2
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: galaxy-data
  namespace: galaxy
spec:
  accessModes:
    - ReadWriteMany
  volumeName: galaxy-data
  resources:
    requests:
      storage: 50Gi
```

I use the NFS backed PVC since my "standard" sc (jiva) doesn't support RWX.

Running `helm install -n galaxy my-galaxy -f galaxy-values.yaml galaxy-helm/galaxy/` results in:

manifest_sorter.go:192: info: skipping unknown hook: "crd-install"
manifest_sorter.go:192: info: skipping unknown hook: "crd-install"
manifest_sorter.go:192: info: skipping unknown hook: "crd-install"
NAME: my-galaxy
LAST DEPLOYED: Fri Oct 14 16:59:25 2022
STATUS: deployed
1. Get the application URL by running these commands:

Looks reasonable to me. However, the job my-galaxy-init-db fails in the galaxy-init-postgres container with a permission-denied error on `chown 101:101 /galaxy/server/database`. That was easy enough to fix: wipe the namespace, recreate the NFS share, create that folder with the correct ownership in the NFS share, and reinstall the helm chart.
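The folder-creation step of that workaround can be sketched roughly as below. This is only an illustration under assumptions: `prepare_galaxy_dir` is a hypothetical helper name (not part of the chart), and the export root has to be supplied by whoever runs it (for example the `/exports` mount of the nfs-server pod). The `101:101` owner and the `server/database` path are taken from the error message and the share layout described in this issue.

```shell
# Hypothetical helper (not part of the chart): pre-create the directory that
# the galaxy-init-postgres container tries to chown, with the expected owner.
#   $1 = root of the NFS export (assumption: e.g. /exports in the nfs-server pod)
#   $2 = numeric uid, $3 = numeric gid (101 and 101 in the error above)
prepare_galaxy_dir() {
  export_root="$1"
  uid="$2"
  gid="$3"
  mkdir -p "$export_root/server/database" || return 1
  # Changing ownership to another user's uid normally requires root.
  chown "$uid:$gid" "$export_root/server/database"
}

# Example invocation (run as root where the export is mounted):
#   prepare_galaxy_dir /exports 101 101
```

Only the leaf directory is chowned here; whether the parent `server` directory also needs a specific owner is not established in this thread.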

The next issue has me stumped, though. `kubectl logs -n galaxy jobs/my-galaxy-init-db-hb4hl -c galaxy-init-postgres` shows:

nc: bad address 'galaxy-my-galaxy-galaxy-postgres'
waiting for galaxy-postgres service

And `kubectl logs -n galaxy jobs/my-galaxy-init-db-hb4hl` shows a Python traceback ending in `sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL: password authentication failed for user "galaxydbuser"`. On the NFS share, the folder server/database remains empty. The postgres service pod is running:

NAME                                            READY   STATUS      RESTARTS   AGE
pod/galaxy-my-galaxy-galaxy-postgres-0          1/1     Running     0          17m
pod/my-galaxy-galaxy-postgres-7c897bf6c-dd7tb   1/1     Running     0          17m
pod/my-galaxy-init-mounts-monlp-mvwsp           0/4     Completed   0          17m
pod/my-galaxy-job-0-7459f74d56-c6fmb            0/1     Init:0/1    0          17m
pod/my-galaxy-nginx-8474bd9bd-qf58r             1/1     Running     0          17m
pod/my-galaxy-web-dfbddc76b-f6qf7               0/1     Init:0/1    0          17m
pod/my-galaxy-workflow-77fd776ccd-jqtfq         0/1     Init:0/1    0          17m
pod/nfs-server-69cd8d9cb9-tbjs6                 1/1     Running     0          20m

NAME                                              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
service/galaxy-my-galaxy-galaxy-postgres          ClusterIP    <none>        5432/TCP                     17m
service/galaxy-my-galaxy-galaxy-postgres-config   ClusterIP   None           <none>        <none>                       16m
service/galaxy-my-galaxy-galaxy-postgres-repl     ClusterIP    <none>        5432/TCP                     17m
service/my-galaxy-galaxy-postgres                 ClusterIP   <none>        8080/TCP                     17m
service/my-galaxy-nginx                           ClusterIP   <none>        8000/TCP                     17m
service/my-galaxy-uwsgi                           ClusterIP   <none>        4001/TCP                     17m
service/nfs-server                                ClusterIP      <none>        2049/TCP,20048/TCP,111/TCP   20m

NAME                                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/my-galaxy-galaxy-postgres   1/1     1            1           17m
deployment.apps/my-galaxy-job-0             0/1     1            0           17m
deployment.apps/my-galaxy-nginx             1/1     1            1           17m
deployment.apps/my-galaxy-web               0/1     1            0           17m
deployment.apps/my-galaxy-workflow          0/1     1            0           17m
deployment.apps/nfs-server                  1/1     1            1           20m

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/my-galaxy-galaxy-postgres-7c897bf6c   1         1         1       17m
replicaset.apps/my-galaxy-job-0-7459f74d56            1         1         0       17m
replicaset.apps/my-galaxy-nginx-8474bd9bd             1         1         1       17m
replicaset.apps/my-galaxy-web-dfbddc76b               1         1         0       17m
replicaset.apps/my-galaxy-workflow-77fd776ccd         1         1         0       17m
replicaset.apps/nfs-server-69cd8d9cb9                 1         1         1       20m

NAME                                                READY   AGE
statefulset.apps/galaxy-my-galaxy-galaxy-postgres   1/1     17m

NAME                                  SCHEDULE    SUSPEND   ACTIVE   LAST SCHEDULE   AGE
cronjob.batch/my-galaxy-maintenance   0 2 * * *   False     0        <none>          17m

NAME                                    COMPLETIONS   DURATION   AGE
job.batch/my-galaxy-init-db-hb4hl       0/1           17m        17m
job.batch/my-galaxy-init-mounts-monlp   1/1           5m25s      17m

NAME   IMAGE   CLUSTER-LABEL   SERVICE-ACCOUNT   MIN-INSTANCES   AGE
               cluster-name    postgres-pod      -1              17m

NAME   TEAM     VERSION   PODS   VOLUME   CPU-REQUEST   MEMORY-REQUEST   AGE   STATUS
       galaxy   13        1      5Gi                                     17m   SyncFailed

kubectl logs -n galaxy pods/my-galaxy-galaxy-postgres-7c897bf6c-dd7tb

```
time="2022-10-14T14:59:34Z" level=info msg="Spilo operator v1.7.0\n"
time="2022-10-14T14:59:34Z" level=error msg="could not create customResourceDefinition \"\": \"\" is invalid: spec.versions[0].schema.openAPIV3Schema: Required value: schemas are required" pkg=controller
time="2022-10-14T14:59:38Z" level=info msg="Parse role bindings" pkg=controller
time="2022-10-14T14:59:38Z" level=info msg="successfully parsed" pkg=controller
time="2022-10-14T14:59:38Z" level=info msg="Listening to all namespaces" pkg=controller
time="2022-10-14T14:59:38Z" level=info msg="customResourceDefinition \"\" is already registered and will only be updated" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg="{" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ReadyWaitInterval\": 3000000000," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ReadyWaitTimeout\": 30000000000," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ResyncPeriod\": 1800000000000," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"RepairPeriod\": 300000000000," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableCRDValidation\": true," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ResourceCheckInterval\": 3000000000," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ResourceCheckTimeout\": 600000000000," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PodLabelWaitTimeout\": 600000000000," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PodDeletionWaitTimeout\": 600000000000," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"SpiloRunAsUser\": null," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"SpiloRunAsGroup\": null," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"SpiloFSGroup\": null," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PodPriorityClassName\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ClusterDomain\": \"cluster.local\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"SpiloPrivileged\": false," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"SpiloAllowPrivilegeEscalation\": true," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"AdditionalPodCapabilities\": null," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ClusterLabels\": {" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"application\": \"spilo\"" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" }," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"InheritedLabels\": null," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"InheritedAnnotations\": null," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"DownscalerAnnotations\": null," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ClusterNameLabel\": \"cluster-name\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"DeleteAnnotationDateKey\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"DeleteAnnotationNameKey\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PodRoleLabel\": \"spilo-role\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PodToleration\": null," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"DefaultCPURequest\": \"100m\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"DefaultMemoryRequest\": \"100Mi\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"DefaultCPULimit\": \"1\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"DefaultMemoryLimit\": \"500Mi\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"MinCPULimit\": \"250m\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"MinMemoryLimit\": \"250Mi\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PodEnvironmentConfigMap\": \"/\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PodEnvironmentSecret\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"NodeReadinessLabel\": null," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"MaxInstances\": -1," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"MinInstances\": -1," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ShmVolume\": true," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"SecretNameTemplate\": \"{username}.{cluster}.credentials.{tprkind}.{tprgroup}\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PamRoleName\": \"zalandos\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PamConfiguration\": \" uid realm=/employees\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"TeamsAPIUrl\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"OAuthTokenSecretName\": \"galaxy/my-galaxy-galaxy-postgres\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"InfrastructureRolesSecretName\": \"/\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"InfrastructureRoles\": null," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"InfrastructureRolesDefs\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"SuperUsername\": \"postgres\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ReplicationUsername\": \"standby\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ScalyrAPIKey\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ScalyrImage\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ScalyrServerURL\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ScalyrCPURequest\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ScalyrMemoryRequest\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ScalyrCPULimit\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ScalyrMemoryLimit\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"LogicalBackupSchedule\": \"30 00 * * *\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"LogicalBackupDockerImage\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"LogicalBackupProvider\": \"s3\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"LogicalBackupS3Bucket\": \"my-bucket-url\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"LogicalBackupS3Region\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"LogicalBackupS3Endpoint\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"LogicalBackupS3AccessKeyID\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"LogicalBackupS3SecretAccessKey\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"LogicalBackupS3SSE\": \"AES256\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"LogicalBackupGoogleApplicationCredentials\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"LogicalBackupJobPrefix\": \"logical-backup-\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"NumberOfInstances\": 2," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"Schema\": \"pooler\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"User\": \"pooler\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"Image\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"Mode\": \"transaction\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"MaxDBConnections\": 60," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ConnectionPoolerDefaultCPURequest\": \"500m\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ConnectionPoolerDefaultMemoryRequest\": \"100Mi\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ConnectionPoolerDefaultCPULimit\": \"1\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ConnectionPoolerDefaultMemoryLimit\": \"100Mi\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"WatchedNamespace\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"KubernetesUseConfigMaps\": false," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EtcdHost\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"DockerImage\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"SidecarImages\": null," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"SidecarContainers\": null," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PodServiceAccountName\": \"postgres-pod\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PodServiceAccountDefinition\": \"{\\\"apiVersion\\\":\\\"v1\\\",\\\"kind\\\":\\\"ServiceAccount\\\",\\\"metadata\\\":{\\\"name\\\":\\\"postgres-pod\\\"}}\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PodServiceAccountRoleBindingDefinition\": \"{\\\"apiVersion\\\":\\\"\\\",\\\"kind\\\":\\\"RoleBinding\\\",\\\"metadata\\\":{\\\"name\\\":\\\"postgres-pod\\\"},\\\"roleRef\\\":{\\\"apiGroup\\\":\\\"\\\",\\\"kind\\\":\\\"ClusterRole\\\",\\\"name\\\":\\\"postgres-pod\\\"},\\\"subjects\\\":[{\\\"kind\\\":\\\"ServiceAccount\\\",\\\"name\\\":\\\"postgres-pod\\\"}]}\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"MasterPodMoveTimeout\": 1200000000000," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"DbHostedZone\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"AWSRegion\": \"eu-central-1\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"WALES3Bucket\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"LogS3Bucket\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"KubeIAMRole\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"WALGSBucket\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"GCPCredentials\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"WALAZStorageAccount\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"AdditionalSecretMount\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"AdditionalSecretMountPath\": \"/meta/credentials\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableEBSGp3Migration\": false," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableEBSGp3MigrationMaxSize\": 1000," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"DebugLogging\": true," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableDBAccess\": true," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableTeamsAPI\": false," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableTeamSuperuser\": false," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"TeamAdminRole\": \"admin\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"RoleDeletionSuffix\": \"_deleted\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableTeamMemberDeprecation\": false," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableAdminRoleForUsers\": true," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnablePostgresTeamCRD\": false," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnablePostgresTeamCRDSuperusers\": false," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableMasterLoadBalancer\": false," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableReplicaLoadBalancer\": false," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"CustomServiceAnnotations\": null," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"CustomPodAnnotations\": null," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnablePodAntiAffinity\": false," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PodAntiAffinityTopologyKey\": \"\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"StorageResizeMode\": \"pvc\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableLoadBalancer\": null," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ExternalTrafficPolicy\": \"Cluster\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"MasterDNSNameFormat\": \"{cluster}.{team}.{hostedzone}\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ReplicaDNSNameFormat\": \"{cluster}-repl.{team}.{hostedzone}\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PDBNameFormat\": \"postgres-{cluster}-pdb\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnablePodDisruptionBudget\": true," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableInitContainers\": true," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableSidecars\": true," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"Workers\": 8," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"APIPort\": 8080," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"RingLogLines\": 100," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ClusterHistoryEntries\": 1000," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"TeamAPIRoleConfiguration\": {" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"log_statement\": \"all\"" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" }," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PodTerminateGracePeriod\": 300000000000," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PodManagementPolicy\": \"ordered_ready\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"ProtectedRoles\": [" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"admin\"" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" ]," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"PostgresSuperuserTeams\": [" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"postgres_superusers\"" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" ]," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"SetMemoryRequestToLimit\": false," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableLazySpiloUpgrade\": false," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableCrossNamespaceSecret\": false," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnablePgVersionEnvVar\": true," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"EnableSpiloWalPathCompat\": false," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"MajorVersionUpgradeMode\": \"off\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"MinimalMajorVersion\": \"9.5\"," pkg=controller
time="2022-10-14T14:59:42Z" level=info msg=" \"TargetMajorVersion\": \"13\"" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg="}" pkg=controller
time="2022-10-14T14:59:42Z" level=debug msg="acquiring initial list of clusters" pkg=controller
time="2022-10-14T14:59:42Z" level=debug msg="added new cluster: \"galaxy/galaxy-my-galaxy-galaxy-postgres\"" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg="SYNC event has been queued" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=controller worker=0
time="2022-10-14T14:59:42Z" level=info msg="there are 1 clusters running" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg="started working in background" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg="listening on :8080" pkg=apiserver
time="2022-10-14T14:59:42Z" level=info msg="ADD event has been queued" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=controller worker=0
time="2022-10-14T14:59:42Z" level=info msg="creating pod service account \"postgres-pod\" in the \"galaxy\" namespace" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg="successfully deployed the pod service account \"postgres-pod\" to the \"galaxy\" namespace" pkg=controller
time="2022-10-14T14:59:42Z" level=debug msg="new node has been added: / ()" pkg=controller
time="2022-10-14T14:59:42Z" level=debug msg="new node has been added: /rohan2013 ()" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg="Creating the role binding \"postgres-pod\" in the \"galaxy\" namespace" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg="successfully deployed the role binding for the pod service account \"postgres-pod\" to the \"galaxy\" namespace" pkg=controller
time="2022-10-14T14:59:42Z" level=info msg="syncing of the cluster started" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=controller worker=0
time="2022-10-14T14:59:42Z" level=debug msg="team API is disabled" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:42Z" level=debug msg="team API is disabled" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:42Z" level=info msg="syncing secrets" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:42Z" level=debug msg="secret galaxy/ already exists, fetching its password" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:42Z" level=debug msg="secret galaxy/ already exists, fetching its password" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:42Z" level=debug msg="created new secret galaxy/, namespace: galaxy, uid: df0ac207-f80e-45e5-8802-b98462095cf3" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:42Z" level=debug msg="syncing master service" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:42Z" level=info msg="could not find the cluster's master endpoint" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:42Z" level=warning msg="master is not running, generated master endpoint does not contain any addresses" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:42Z" level=info msg="created missing master endpoint \"galaxy/galaxy-my-galaxy-galaxy-postgres\"" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:42Z" level=info msg="could not find the cluster's master service" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:43Z" level=info msg="created missing master service \"galaxy/galaxy-my-galaxy-galaxy-postgres\"" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:43Z" level=debug msg="syncing replica service" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:43Z" level=info msg="could not find the cluster's replica endpoint" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:43Z" level=info msg="created missing replica endpoint \"galaxy/galaxy-my-galaxy-galaxy-postgres-repl\"" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:43Z" level=info msg="could not find the cluster's replica service" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:43Z" level=debug msg="No load balancer created for the replica service" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:44Z" level=info msg="created missing replica service \"galaxy/galaxy-my-galaxy-galaxy-postgres-repl\"" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:44Z" level=debug msg="syncing volumes using \"pvc\" storage resize mode" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:44Z" level=info msg="volume claims do not require changes" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:44Z" level=debug msg="syncing statefulsets" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:44Z" level=info msg="cluster's statefulset does not exist" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:44Z" level=debug msg="created new statefulset \"galaxy/galaxy-my-galaxy-galaxy-postgres\", uid: \"eb26faf0-1c3f-44bb-8af7-dbe2406a9368\"" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T14:59:47Z" level=debug msg="Waiting for 1 pods to become ready" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T15:00:11Z" level=info msg="created missing statefulset \"galaxy/galaxy-my-galaxy-galaxy-postgres\"" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T15:00:11Z" level=debug msg="making GET http request:" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T15:00:11Z" level=debug msg="patching Postgres config via Patroni API on pod galaxy/galaxy-my-galaxy-galaxy-postgres-0 with following options: {\"synchronous_mode\":false,\"synchronous_mode_strict\":false}" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T15:00:11Z" level=debug msg="making PATCH http request:" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T15:00:11Z" level=debug msg="restarting Postgres server within pods" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T15:00:11Z" level=debug msg="making GET http request:" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T15:00:11Z" level=debug msg="Postgres server successfuly restarted in master pod galaxy/galaxy-my-galaxy-galaxy-postgres-0" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T15:00:11Z" level=debug msg="syncing pod disruption budgets" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T15:00:11Z" level=info msg="could not find the cluster's pod disruption budget" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T15:00:11Z" level=warning msg="error while syncing cluster state: could not sync pod disruption budget: could not create pod disruption budget: the server could not find the requested resource" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=cluster
time="2022-10-14T15:00:11Z" level=error msg="could not sync cluster: could not sync pod disruption budget: could not create pod disruption budget: the server could not find the requested resource" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=controller worker=0
time="2022-10-14T15:00:11Z" level=info msg="recieved add event for already existing Postgres cluster" cluster-name=galaxy/galaxy-my-galaxy-galaxy-postgres pkg=controller worker=0
```

Any help would be much appreciated :)

nuwang commented 1 year ago

Are you using k8s 1.25 by chance? Looks like there's an issue with it: ping @afgane
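
For context: Kubernetes 1.25 stopped serving the `policy/v1beta1` PodDisruptionBudget API. That would match the "could not create pod disruption budget: the server could not find the requested resource" error in the operator log above, assuming the operator version bundled with the chart still requests that API version (not verified here). A minimal sketch for checking what a given cluster serves; `serves_pdb_v1beta1` is a hypothetical helper name and expects the output of `kubectl api-versions` on stdin:

```shell
# Hypothetical helper: succeeds if the API group/version list on stdin
# (one entry per line, as printed by `kubectl api-versions`) contains
# policy/v1beta1, the PodDisruptionBudget API removed in Kubernetes 1.25.
serves_pdb_v1beta1() {
  grep -qx 'policy/v1beta1'
}

# Example invocation against a live cluster:
#   kubectl api-versions | serves_pdb_v1beta1 \
#     && echo "policy/v1beta1 is served" \
#     || echo "policy/v1beta1 is not served"
```

If the check fails on a 1.25 cluster, any component that still creates PodDisruptionBudgets through `policy/v1beta1` will likely hit the same "could not find the requested resource" error.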

pckroon commented 1 year ago

Thanks for the quick reply. I'm indeed running k8s 1.25.1:

Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.1", GitCommit:"e4d4e1ab7cf1bf15273ef97303551b279f0920a9", GitTreeState:"clean", BuildDate:"2022-09-14T19:49:27Z", GoVersion:"go1.19.1", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.1", GitCommit:"e4d4e1ab7cf1bf15273ef97303551b279f0920a9", GitTreeState:"clean", BuildDate:"2022-09-14T19:42:30Z", GoVersion:"go1.19.1", Compiler:"gc", Platform:"linux/amd64"}

And in case it matters, the helm version:

version.BuildInfo{Version:"v3.10.0", GitCommit:"ce66412a723e4d89555dc67217607c6579ffcb21", GitTreeState:"clean", GoVersion:"go1.18.6"}

It looks like it's exactly the issue described in #385. Do you have any idea about the release cadence of the pgsql operator? Does it make sense to report this upstream (again)?

nuwang commented 1 year ago

There doesn't appear to be a fixed release cadence, but support for 1.25 appears to have been added: