I did not reproduce this. It sounds like something in your pgo-config ConfigMap is incorrect. Did you install based on an upgrade, or did you customize your templates within pgo-config?
I used the one in installers/kubectl and modified some values. Can you check it?
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pgo-deployer-sa
  namespace: pgo
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pgo-deployer-cr
rules:
  - apiGroups:
      - ''
    resources:
      - namespaces
    verbs:
      - get
      - list
      - create
      - patch
      - delete
  - apiGroups:
      - ''
    resources:
      - pods
    verbs:
      - list
  - apiGroups:
      - ''
    resources:
      - secrets
    verbs:
      - list
      - get
      - create
      - delete
  - apiGroups:
      - ''
    resources:
      - configmaps
      - services
      - persistentvolumeclaims
    verbs:
      - get
      - create
      - delete
      - list
  - apiGroups:
      - ''
    resources:
      - serviceaccounts
    verbs:
      - get
      - create
      - delete
      - patch
      - list
  - apiGroups:
      - apps
      - extensions
    resources:
      - deployments
      - replicasets
    verbs:
      - get
      - list
      - watch
      - create
      - delete
  - apiGroups:
      - apiextensions.k8s.io
    resources:
      - customresourcedefinitions
    verbs:
      - get
      - create
      - delete
  - apiGroups:
      - rbac.authorization.k8s.io
    resources:
      - clusterroles
      - clusterrolebindings
      - roles
      - rolebindings
    verbs:
      - get
      - create
      - delete
      - bind
      - escalate
  - apiGroups:
      - rbac.authorization.k8s.io
    resources:
      - roles
    verbs:
      - create
      - delete
  - apiGroups:
      - batch
    resources:
      - jobs
    verbs:
      - delete
      - list
  - apiGroups:
      - crunchydata.com
    resources:
      - pgclusters
      - pgreplicas
      - pgpolicies
      - pgtasks
    verbs:
      - delete
      - list
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: pgo-deployer-cm
  namespace: pgo
data:
  values.yaml: |-
    # =====================
    # Configuration Options
    # More info for these options can be found in the docs
    # https://access.crunchydata.com/documentation/postgres-operator/latest/installation/configuration/
    # =====================
    archive_mode: "true"
    archive_timeout: "60"
    backrest_aws_s3_bucket: "postgres"
    backrest_aws_s3_endpoint: "minio.local:9000"
    backrest_aws_s3_key: "testuser"
    backrest_aws_s3_region: "eu-west-2"
    backrest_aws_s3_secret: "12431234"
    backrest_aws_s3_uri_style: ""
    backrest_aws_s3_verify_tls: "false"
    backrest_port: "2022"
    badger: "false"
    ccp_image_prefix: "registry.developers.crunchydata.com/crunchydata"
    ccp_image_pull_secret: ""
    ccp_image_pull_secret_manifest: ""
    ccp_image_tag: "centos8-13.1-4.6.0"
    create_rbac: "true"
    crunchy_debug: "false"
    db_name: "pgo"
    db_password_age_days: "0"
    db_password_length: "24"
    db_port: "5432"
    db_replicas: "0"
    db_user: "testuser"
    default_instance_memory: "1248Mi"
    default_pgbackrest_memory: "128Mi"
    default_pgbouncer_memory: "128Mi"
    default_exporter_memory: "24Mi"
    delete_operator_namespace: "false"
    delete_watched_namespaces: "false"
    disable_auto_failover: "false"
    disable_fsgroup: "true"
    reconcile_rbac: "true"
    exporterport: "9187"
    metrics: "true"
    namespace: "pgo"
    namespace_mode: "dynamic"
    pgbadgerport: "10000"
    pgo_add_os_ca_store: "false"
    pgo_admin_password: "examplepassword"
    pgo_admin_perms: "*"
    pgo_admin_role_name: "pgoadmin"
    pgo_admin_username: "admin"
    pgo_apiserver_port: "8443"
    pgo_apiserver_url: "https://postgres-operator"
    pgo_client_cert_secret: "pgo.tls"
    pgo_client_container_install: "false"
    pgo_client_install: "true"
    pgo_client_version: "4.6.0"
    pgo_cluster_admin: "false"
    pgo_disable_eventing: "false"
    pgo_disable_tls: "false"
    pgo_image_prefix: "registry.developers.crunchydata.com/crunchydata"
    pgo_image_pull_secret: ""
    pgo_image_pull_secret_manifest: ""
    pgo_image_tag: "centos8-4.6.0"
    pgo_installation_name: "devtest"
    pgo_noauth_routes: ""
    pgo_operator_namespace: "pgo"
    pgo_tls_ca_store: ""
    pgo_tls_no_verify: "false"
    pod_anti_affinity: "preferred"
    pod_anti_affinity_pgbackrest: ""
    pod_anti_affinity_pgbouncer: ""
    scheduler_timeout: "3600"
    service_type: "ClusterIP"
    sync_replication: "false"
    backrest_storage: "s3"
    backup_storage: "s3"
    primary_storage: "rook"
    replica_storage: "rook"
    wal_storage: ""
    # storage1_name: "default"
    # storage1_access_mode: "ReadWriteOnce"
    # storage1_size: "10G"
    # storage1_type: "dynamic"
    # storage2_name: "hostpathstorage"
    # storage2_access_mode: "ReadWriteMany"
    # storage2_size: "10G"
    # storage2_type: "create"
    # storage3_name: "nfsstorage"
    # storage3_access_mode: "ReadWriteMany"
    # storage3_size: "1G"
    # storage3_type: "create"
    # storage3_supplemental_groups: "65534"
    # storage4_name: "nfsstoragered"
    # storage4_access_mode: "ReadWriteMany"
    # storage4_size: "1G"
    # storage4_match_labels: "crunchyzone=red"
    # storage4_type: "create"
    # storage4_supplemental_groups: "65534"
    # storage5_name: "storageos"
    # storage5_access_mode: "ReadWriteOnce"
    # storage5_size: "5Gi"
    # storage5_type: "dynamic"
    # storage5_class: "fast"
    # storage6_name: "primarysite"
    # storage6_access_mode: "ReadWriteOnce"
    # storage6_size: "4G"
    # storage6_type: "dynamic"
    # storage6_class: "primarysite"
    # storage7_name: "alternatesite"
    # storage7_access_mode: "ReadWriteOnce"
    # storage7_size: "4G"
    # storage7_type: "dynamic"
    # storage7_class: "alternatesite"
    # storage8_name: "gce"
    # storage8_access_mode: "ReadWriteOnce"
    # storage8_size: "300M"
    # storage8_type: "dynamic"
    # storage8_class: "standard"
    storage9_name: "rook"
    storage9_access_mode: "ReadWriteOnce"
    storage9_size: "10Gi"
    storage9_type: "dynamic"
    storage9_class: "rook-ceph-block"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: pgo-deployer-crb
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: pgo-deployer-cr
subjects:
  - kind: ServiceAccount
    name: pgo-deployer-sa
    namespace: pgo
---
apiVersion: batch/v1
kind: Job
metadata:
  name: pgo-deploy
  namespace: pgo
spec:
  backoffLimit: 0
  template:
    metadata:
      name: pgo-deploy
    spec:
      serviceAccountName: pgo-deployer-sa
      restartPolicy: Never
      containers:
        - name: pgo-deploy
          image: registry.developers.crunchydata.com/crunchydata/pgo-deployer:centos8-4.6.0
          imagePullPolicy: IfNotPresent
          env:
            - name: DEPLOY_ACTION
              value: install
          volumeMounts:
            - name: deployer-conf
              mountPath: "/conf"
      volumes:
        - name: deployer-conf
          configMap:
            name: pgo-deployer-cm
If I had to take a guess, it would be this block:
# storage1_name: "default"
# storage1_access_mode: "ReadWriteOnce"
# storage1_size: "10G"
# storage1_type: "dynamic"
# storage2_name: "hostpathstorage"
# storage2_access_mode: "ReadWriteMany"
# storage2_size: "10G"
# storage2_type: "create"
# storage3_name: "nfsstorage"
# storage3_access_mode: "ReadWriteMany"
# storage3_size: "1G"
# storage3_type: "create"
# storage3_supplemental_groups: "65534"
# storage4_name: "nfsstoragered"
# storage4_access_mode: "ReadWriteMany"
# storage4_size: "1G"
# storage4_match_labels: "crunchyzone=red"
# storage4_type: "create"
# storage4_supplemental_groups: "65534"
# storage5_name: "storageos"
# storage5_access_mode: "ReadWriteOnce"
# storage5_size: "5Gi"
# storage5_type: "dynamic"
# storage5_class: "fast"
# storage6_name: "primarysite"
# storage6_access_mode: "ReadWriteOnce"
# storage6_size: "4G"
# storage6_type: "dynamic"
# storage6_class: "primarysite"
# storage7_name: "alternatesite"
# storage7_access_mode: "ReadWriteOnce"
# storage7_size: "4G"
# storage7_type: "dynamic"
# storage7_class: "alternatesite"
# storage8_name: "gce"
# storage8_access_mode: "ReadWriteOnce"
# storage8_size: "300M"
# storage8_type: "dynamic"
# storage8_class: "standard"
storage9_name: "rook"
storage9_access_mode: "ReadWriteOnce"
storage9_size: "10Gi"
storage9_type: "dynamic"
storage9_class: "rook-ceph-block"
If you want to use the other storage configs, you should either:

1. Move the rook-ceph-block info to storage1_* and comment out the rest (a sketch follows below).
2. If rook-ceph-block is the default storage class in your system, you can just use the default storage class, with:

backrest_storage: "default"
backup_storage: "default"
primary_storage: "default"
replica_storage: "default"
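For the first option, a sketch of what that could look like, reusing the rook values from the storage9 block above (the storage name is illustrative):

storage1_name: "rook"
storage1_access_mode: "ReadWriteOnce"
storage1_size: "10Gi"
storage1_type: "dynamic"
storage1_class: "rook-ceph-block"
backrest_storage: "rook"
backup_storage: "rook"
primary_storage: "rook"
replica_storage: "rook"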
Tested using rook-ceph-block as the default PV and it failed the same way. Tested using rook-ceph-block as storage1_* with every other storage_* entry commented out, and it failed the same way.
I have not seen the storage_* entries commented out like that, so I don't know if that is it.
There is likely a syntax error in the cluster-bootstrap-job.json entry in your pgo-config ConfigMap. I would recommend checking that.
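A quick way to dump that entry for inspection (assuming the Operator runs in the pgo namespace):

kubectl -n pgo get configmap pgo-config -o 'jsonpath={.data.cluster-bootstrap-job\.json}'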
This did indirectly reveal the need for 091ddd74b1c3f2977d9bf042a056295443df246f, which would help in determining if the cluster-bootstrap-job.json is broken, though I still do not see anything here yet that indicates an actual bug.
Is there a way to increase the log level in the postgres-operator so we can get more information?
I did not find any syntax errors when comparing against the original files in the repo under installers/kubectl, and the only other file I added was zzz-tune-config; I can see that the database reflects the changes from that config, so I assume it is correct.
I activated crunchy_debug: "true" in the operator and it shows more information, but it seems to be missing part of it. This time I used the Helm installer for the operator and just created a simple cluster without S3.
Helm values:
---
# ======================
# Installer Controls
# ======================
fullnameOverride: ""
# rbac: settings for deployer RBAC creation
rbac:
  # rbac.create: if false RBAC resources should be in place
  create: true
  # rbac.useClusterAdmin: creates a ClusterRoleBinding giving cluster-admin to serviceAccount.name
  useClusterAdmin: false
# serviceAccount: settings for Service Account used by the deployer
serviceAccount:
  # serviceAccount.create: Whether to create a Service Account or not
  create: true
  # serviceAccount.name: The name of the Service Account to create or use
  name: ""
# =====================
# Configuration Options
# More info for these options can be found in the docs
# https://access.crunchydata.com/documentation/postgres-operator/latest/installation/configuration/
# =====================
archive_mode: "true"
archive_timeout: "60"
backrest_aws_s3_bucket: "postgres"
backrest_aws_s3_endpoint: "minio.local:9000"
backrest_aws_s3_key: "psqlbk"
backrest_aws_s3_region: "eu-west-2"
backrest_aws_s3_secret: "12431234"
backrest_aws_s3_uri_style: ""
backrest_aws_s3_verify_tls: "false"
backrest_port: "2022"
badger: "false"
ccp_image_prefix: "registry.developers.crunchydata.com/crunchydata"
ccp_image_pull_secret: ""
ccp_image_pull_secret_manifest: ""
ccp_image_tag: "centos8-12.5-4.6.0"
create_rbac: "true"
crunchy_debug: "true"
db_name: "pgo"
db_password_age_days: "0"
db_password_length: "24"
db_port: "5432"
db_replicas: "0"
db_user: "testuser"
default_instance_memory: "1248Mi"
default_pgbackrest_memory: "128Mi"
default_pgbouncer_memory: "128Mi"
default_exporter_memory: "24Mi"
delete_operator_namespace: "false"
delete_watched_namespaces: "false"
disable_auto_failover: "false"
disable_fsgroup: "true"
reconcile_rbac: "true"
exporterport: "9187"
metrics: "true"
namespace: "pgo"
namespace_mode: "dynamic"
pgbadgerport: "10000"
pgo_add_os_ca_store: "false"
pgo_admin_password: "examplepassword"
pgo_admin_perms: "*"
pgo_admin_role_name: "pgoadmin"
pgo_admin_username: "admin"
pgo_apiserver_port: "8443"
pgo_apiserver_url: "https://postgres-operator"
pgo_client_cert_secret: "pgo.tls"
pgo_client_container_install: "false"
pgo_client_install: "true"
pgo_client_version: "4.6.0"
pgo_cluster_admin: "false"
pgo_disable_eventing: "false"
pgo_disable_tls: "false"
pgo_image_prefix: "registry.developers.crunchydata.com/crunchydata"
pgo_image_pull_secret: ""
pgo_image_pull_secret_manifest: ""
pgo_image_tag: "centos8-4.6.0"
pgo_installation_name: "devtest"
pgo_noauth_routes: ""
pgo_operator_namespace: "pgo"
pgo_tls_ca_store: ""
pgo_tls_no_verify: "false"
pod_anti_affinity: "preferred"
pod_anti_affinity_pgbackrest: ""
pod_anti_affinity_pgbouncer: ""
scheduler_timeout: "3600"
service_type: "ClusterIP"
sync_replication: "false"
backrest_storage: "default"
backup_storage: "default"
primary_storage: "default"
replica_storage: "default"
wal_storage: ""
storage1_name: "default"
storage1_access_mode: "ReadWriteOnce"
storage1_size: "10Gi"
storage1_type: "dynamic"
storage1_class: "rook-ceph-block"
Created a cluster:
pgo create cluster zzz --replica-count=0 --node-label nodetype=testing --cpu=10.0 --cpu-limit=10.0 --memory=1248Mi --memory-limit=1248Mi --password-superuser=xpt0! --password-replication=xpt0! --password=xpt0! --metrics -n pgo
Made a new backup and tried to restore:
pgo backup zzz
pgo restore zzz
This is the log part when it creates a new cluster:
time="2021-02-03T10:50:32Z" level=info msg="creating Pgcluster zzz in namespace pgo" func="internal/operator/cluster.getClusterDeploymentFields()" file="internal/operator/cluster/clusterlogic.go:263" version=4.6.0
time="2021-02-03T10:50:32Z" level=debug msg="creating exporter secret for cluster zzz" func="internal/operator/cluster.getClusterDeploymentFields()" file="internal/operator/cluster/clusterlogic.go:289" version=4.6.0
time="2021-02-03T10:50:32Z" level=debug msg="[123 125]" func="internal/operator.GetPodSecurityContext()" file="internal/operator/common.go:188" version=4.6.0
time="2021-02-03T10:50:32Z" level=debug msg="GetPodAnitAffinity with clusterName=[zzz]" func="internal/operator.GetPodAntiAffinity()" file="internal/operator/clusterutilities.go:645" version=4.6.0
"podAntiAffinity": {
"preferredDuringSchedulingIgnoredDuringExecution": [
{
"weight": 1,
"podAffinityTerm": {
"labelSelector": {
"matchExpressions": [
{
"key": "vendor",
"operator": "In",
"values": [
"crunchydata"
]
},
{
"key": "pg-pod-anti-affinity",
"operator": "Exists"
},
{
"key": "pg-cluster",
"operator": "In",
"values": [
"zzz"
]
}
]
},
"topologyKey": "kubernetes.io/hostname"
}
}
]
}
"resources": {
"limits": {
"cpu": "10",
"memory": "1248Mi"
},
"requests": {
"cpu": "10",
"memory": "1248Mi"
}
},
time="2021-02-03T10:50:32Z" level=debug msg="pgo-custom-pg-config was not found, skipping global configMap" func="internal/operator.GetConfVolume()" file="internal/operator/clusterutilities.go:411" version=4.6.0
"resources": {
"requests": {
"memory": "24Mi"
}
},
{
"name": "exporter",
"image": "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:centos8-4.6.0",
"ports": [{
"containerPort": 9187,
"protocol": "TCP"
}],
"resources": {
"requests": {
"memory": "24Mi"
}
},
"env": [
{
"name": "EXPORTER_PG_HOST",
"value": "127.0.0.1"
},
{
"name": "EXPORTER_PG_PORT",
"value": "5432"
},
{
"name": "EXPORTER_PG_DATABASE",
"value": "postgres"
},
{
"name": "EXPORTER_PG_PARAMS",
"value": "sslmode=disable"
},
{
"name": "JOB_NAME",
"value": "zzz"
},
{
"name": "POSTGRES_EXPORTER_PORT",
"value": "9187"
},
{
"name": "EXPORTER_PG_USER",
"valueFrom": {
"secretKeyRef": {
"name": "zzz-exporter-secret",
"key": "username"
}
}
},
{
"name": "EXPORTER_PG_PASSWORD",
"valueFrom": {
"secretKeyRef": {
"name": "zzz-exporter-secret",
"key": "password"
}
}
}
]
}
{
"kind": "Deployment",
"apiVersion": "apps/v1",
"metadata": {
"name": "zzz",
"labels": {
"vendor": "crunchydata",
"pgo-pg-database": "true",
"deployment-name": "zzz","pgouser": "admin","crunchy-pgha-scope": "zzz","crunchy-postgres-exporter": "true","pgo-version": "4.6.0","workflowid": "105518a6-57ca-40cd-abca-b6fc250ae247","name": "zzz","pg-cluster": "zzz"
}
},
"spec": {
"replicas": 0,
"selector": {
"matchLabels": {
"vendor": "crunchydata",
"pg-cluster": "zzz",
"pgo-pg-database": "true",
"deployment-name": "zzz"
}
},
"template": {
"metadata": {
"labels": {
"name": "zzz",
"vendor": "crunchydata",
"pgo-pg-database": "true",
"pg-pod-anti-affinity": "preferred",
"deployment-name": "zzz","pgouser": "admin","crunchy-pgha-scope": "zzz","crunchy-postgres-exporter": "true","pgo-version": "4.6.0","workflowid": "105518a6-57ca-40cd-abca-b6fc250ae247","name": "zzz","pg-cluster": "zzz"
}
},
"spec": {
"securityContext": {},
"serviceAccountName": "pgo-pg",
"containers": [
{
"name": "database",
"image": "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-12.5-4.6.0",
"readinessProbe": {
"exec": {
"command": [
"/opt/crunchy/bin/postgres-ha/health/pgha-readiness.sh"
]
},
"initialDelaySeconds": 15
},
"livenessProbe": {
"exec": {
"command": [
"/opt/crunchy/bin/postgres-ha/health/pgha-liveness.sh"
]
},
"initialDelaySeconds": 30,
"periodSeconds": 15,
"timeoutSeconds": 10
},
"resources": {
"limits": {
"cpu": "10",
"memory": "1248Mi"
},
"requests": {
"cpu": "10",
"memory": "1248Mi"
}
},
"env": [{
"name": "MODE",
"value": "postgres"
},
{
"name": "PGHA_PG_PORT",
"value": "5432"
}, {
"name": "PGHA_USER",
"value": "postgres"
},
{
"name": "PGHA_INIT",
"valueFrom": {
"configMapKeyRef": {
"name": "zzz-pgha-config",
"key": "init"
}
}
},
{
"name": "PATRONI_POSTGRESQL_DATA_DIR",
"value": "/pgdata/zzz"
},
{
"name": "PGBACKREST_STANZA",
"value": "db"
},
{
"name": "PGBACKREST_REPO1_HOST",
"value": "zzz-backrest-shared-repo"
},
{
"name": "BACKREST_SKIP_CREATE_STANZA",
"value": "true"
},
{
"name": "PGHA_PGBACKREST",
"value": "true"
},
{
"name": "PGBACKREST_REPO1_PATH",
"value": "/backrestrepo/zzz-backrest-shared-repo"
},
{
"name": "PGBACKREST_DB_PATH",
"value": "/pgdata/zzz"
},
{
"name": "ENABLE_SSHD",
"value": "true"
},
{
"name": "PGBACKREST_LOG_PATH",
"value": "/tmp"
},
{
"name": "PGBACKREST_PG1_SOCKET_PATH",
"value": "/tmp"
},
{
"name": "PGBACKREST_PG1_PORT",
"value": "5432"
},
{
"name": "PGBACKREST_REPO1_TYPE",
"value": "posix"
},
{
"name": "PGHA_PGBACKREST_LOCAL_S3_STORAGE",
"value": "false"
},
{
"name": "PGMONITOR_PASSWORD",
"valueFrom": {
"secretKeyRef": {
"name": "zzz-exporter-secret",
"key": "password"
}
}
},
{
"name": "PGHA_DATABASE",
"value": "pgo"
}, {
"name": "PGHA_REPLICA_REINIT_ON_START_FAIL",
"value": "true"
}, {
"name": "PGHA_SYNC_REPLICATION",
"value": "false"
}, {
"name": "PGHA_TLS_ENABLED",
"value": "false"
}, {
"name": "PGHA_TLS_ONLY",
"value": "false"
}, {
"name": "PGHA_STANDBY",
"value": "false"
}, {
"name": "PATRONI_KUBERNETES_NAMESPACE",
"valueFrom": {
"fieldRef": {
"fieldPath": "metadata.namespace"
}
}
}, {
"name": "PATRONI_KUBERNETES_SCOPE_LABEL",
"value": "crunchy-pgha-scope"
}, {
"name": "PATRONI_SCOPE",
"valueFrom": {
"fieldRef": {
"fieldPath": "metadata.labels['crunchy-pgha-scope']"
}
}
}, {
"name": "PATRONI_KUBERNETES_LABELS",
"value": "{vendor: \"crunchydata\"}"
}, {
"name": "PATRONI_LOG_LEVEL",
"value": "INFO"
}, {
"name": "PGHOST",
"value": "/tmp"
}],
"volumeMounts": [{
"mountPath": "/pgdata",
"name": "pgdata",
"readOnly": false
}, {
"mountPath": "/pgconf/pguser",
"name": "user-volume"
}, {
"mountPath": "/pgconf/pgreplicator",
"name": "primary-volume"
}, {
"mountPath": "/pgconf/pgsuper",
"name": "root-volume"
},
{
"mountPath": "/sshd",
"name": "sshd",
"readOnly": true
}, {
"mountPath": "/pgconf",
"name": "pgconf-volume"
},
{
"mountPath": "/dev/shm",
"name": "dshm"
},
{
"mountPath": "/etc/pgbackrest/conf.d",
"name": "pgbackrest-config"
},
{
"mountPath": "/etc/podinfo",
"name": "podinfo"
}
],
"ports": [{
"containerPort": 5432,
"protocol": "TCP"
}, {
"containerPort": 8009,
"protocol": "TCP"
}],
"imagePullPolicy": "IfNotPresent"
}
,{
"name": "exporter",
"image": "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:centos8-4.6.0",
"ports": [{
"containerPort": 9187,
"protocol": "TCP"
}],
"resources": {
"requests": {
"memory": "24Mi"
}
},
"env": [
{
"name": "EXPORTER_PG_HOST",
"value": "127.0.0.1"
},
{
"name": "EXPORTER_PG_PORT",
"value": "5432"
},
{
"name": "EXPORTER_PG_DATABASE",
"value": "postgres"
},
{
"name": "EXPORTER_PG_PARAMS",
"value": "sslmode=disable"
},
{
"name": "JOB_NAME",
"value": "zzz"
},
{
"name": "POSTGRES_EXPORTER_PORT",
"value": "9187"
},
{
"name": "EXPORTER_PG_USER",
"valueFrom": {
"secretKeyRef": {
"name": "zzz-exporter-secret",
"key": "username"
}
}
},
{
"name": "EXPORTER_PG_PASSWORD",
"valueFrom": {
"secretKeyRef": {
"name": "zzz-exporter-secret",
"key": "password"
}
}
}
]
}
],
"volumes": [{
"name": "pgdata",
"persistentVolumeClaim":{"claimName":"zzz"}
}, {
"name": "user-volume",
"secret": {
"secretName": "zzz-testuser-secret"
}
}, {
"name": "primary-volume",
"secret": {
"secretName": "zzz-primaryuser-secret"
}
}, {
"name": "sshd",
"secret": {
"secretName": "zzz-backrest-repo-config"
}
}, {
"name": "root-volume",
"secret": {
"secretName": "zzz-postgres-secret"
}
},
{
"name": "report",
"emptyDir": {
"medium": "Memory",
"sizeLimit": "64Mi"
}
},
{
"name": "dshm",
"emptyDir": {
"medium": "Memory"
}
},
{
"name": "pgbackrest-config",
"projected": { "sources": [] }
},
{
"name": "pgconf-volume",
"projected": {
"sources": [
{
"configMap": {
"name": "zzz-pgha-config",
"optional": true
}
}
]
}
},
{
"name": "podinfo",
"downwardAPI": {
"defaultMode": 420,
"items": [
{
"path": "cpu_limit",
"resourceFieldRef": {
"containerName": "database",
"divisor": "1m",
"resource": "limits.cpu"
}
},
{
"path": "cpu_request",
"resourceFieldRef": {
"containerName": "database",
"divisor": "1m",
"resource": "requests.cpu"
}
},
{
"path": "mem_limit",
"resourceFieldRef": {
"containerName": "database",
"resource": "limits.memory"
}
},
{
"path": "mem_request",
"resourceFieldRef": {
"containerName": "database",
"resource": "requests.memory"
}
},
{
"fieldRef": {
"apiVersion": "v1",
"fieldPath": "metadata.labels"
},
"path": "labels"
},
{
"fieldRef": {
"apiVersion": "v1",
"fieldPath": "metadata.annotations"
},
"path": "annotations"
}
]
}
}
],
"affinity": {
"nodeAffinity": {
"preferredDuringSchedulingIgnoredDuringExecution": [
{
"weight": 10,
"preference": {
"matchExpressions": [
{
"key": "nodetype",
"operator": "In",
"values": [
"production"
]
}
]
}
}
]
}
,
"podAntiAffinity": {
"preferredDuringSchedulingIgnoredDuringExecution": [
{
"weight": 1,
"podAffinityTerm": {
"labelSelector": {
"matchExpressions": [
{
"key": "vendor",
"operator": "In",
"values": [
"crunchydata"
]
},
{
"key": "pg-pod-anti-affinity",
"operator": "Exists"
},
{
"key": "pg-cluster",
"operator": "In",
"values": [
"zzz"
]
}
]
},
"topologyKey": "kubernetes.io/hostname"
}
}
]
}
},
"restartPolicy": "Always",
"dnsPolicy": "ClusterFirst"
}
},
"strategy": {
"type": "RollingUpdate",
"rollingUpdate": {
"maxUnavailable": 1,
"maxSurge": 1
}
}
}
}
time="2021-02-03T10:50:32Z" level=debug msg="scaling deployment zzz-backrest-shared-repo to 1 for cluster zzz" func="internal/operator/cluster.ScaleClusterDeployments()" file="internal/operator/cluster/clusterlogic.go:764" version=4.6.0
And this is when it fails to create a new cluster for the restore:
time="2021-02-03T10:57:14Z" level=info msg="creating Pgcluster zzz in namespace pgo" func="internal/operator/cluster.getClusterDeploymentFields()" file="internal/operator/cluster/clusterlogic.go:263" version=4.6.0
time="2021-02-03T10:57:14Z" level=debug msg="creating exporter secret for cluster zzz" func="internal/operator/cluster.getClusterDeploymentFields()" file="internal/operator/cluster/clusterlogic.go:289" version=4.6.0
time="2021-02-03T10:57:14Z" level=info msg="exporter secret zzz-exporter-secret already present, will reuse" func="internal/operator/cluster.CreateExporterSecret()" file="internal/operator/cluster/exporter.go:145" version=4.6.0
time="2021-02-03T10:57:14Z" level=debug msg="[123 125]" func="internal/operator.GetPodSecurityContext()" file="internal/operator/common.go:188" version=4.6.0
time="2021-02-03T10:57:14Z" level=debug msg="GetPodAnitAffinity with clusterName=[zzz]" func="internal/operator.GetPodAntiAffinity()" file="internal/operator/clusterutilities.go:645" version=4.6.0
"podAntiAffinity": {
"preferredDuringSchedulingIgnoredDuringExecution": [
{
"weight": 1,
"podAffinityTerm": {
"labelSelector": {
"matchExpressions": [
{
"key": "vendor",
"operator": "In",
"values": [
"crunchydata"
]
},
{
"key": "pg-pod-anti-affinity",
"operator": "Exists"
},
{
"key": "pg-cluster",
"operator": "In",
"values": [
"zzz"
]
}
]
},
"topologyKey": "kubernetes.io/hostname"
}
}
]
}
"resources": {
"limits": {
"cpu": "10",
"memory": "1248Mi"
},
"requests": {
"cpu": "10",
"memory": "1248Mi"
}
},
time="2021-02-03T10:57:14Z" level=debug msg="pgo-custom-pg-config was not found, skipping global configMap" func="internal/operator.GetConfVolume()" file="internal/operator/clusterutilities.go:411" version=4.6.0
"resources": {
"requests": {
"memory": "24Mi"
}
},
{
"name": "exporter",
"image": "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:centos8-4.6.0",
"ports": [{
"containerPort": 9187,
"protocol": "TCP"
}],
"resources": {
"requests": {
"memory": "24Mi"
}
},
"env": [
{
"name": "EXPORTER_PG_HOST",
"value": "127.0.0.1"
},
{
"name": "EXPORTER_PG_PORT",
"value": "5432"
},
{
"name": "EXPORTER_PG_DATABASE",
"value": "postgres"
},
{
"name": "EXPORTER_PG_PARAMS",
"value": "sslmode=disable"
},
{
"name": "JOB_NAME",
"value": "zzz"
},
{
"name": "POSTGRES_EXPORTER_PORT",
"value": "9187"
},
{
"name": "EXPORTER_PG_USER",
"valueFrom": {
"secretKeyRef": {
"name": "zzz-exporter-secret",
"key": "username"
}
}
},
{
"name": "EXPORTER_PG_PASSWORD",
"valueFrom": {
"secretKeyRef": {
"name": "zzz-exporter-secret",
"key": "password"
}
}
}
]
}
{
"kind": "Deployment",
"apiVersion": "apps/v1",
"metadata": {
"name": "zzz",
"labels": {
"vendor": "crunchydata",
"pgo-pg-database": "true",
"name": "zzz","pg-cluster": "zzz","deployment-name": "zzz","pgouser": "admin","crunchy-pgha-scope": "zzz","crunchy-postgres-exporter": "true","pgo-version": "4.6.0","workflowid": "105518a6-57ca-40cd-abca-b6fc250ae247"
}
},
"spec": {
"replicas": 0,
"selector": {
"matchLabels": {
"vendor": "crunchydata",
"pg-cluster": "zzz",
"pgo-pg-database": "true",
"deployment-name": "zzz"
}
},
"template": {
"metadata": {
"labels": {
"name": "zzz",
"vendor": "crunchydata",
"pgo-pg-database": "true",
"pg-pod-anti-affinity": "preferred",
"pgouser": "admin","crunchy-pgha-scope": "zzz","crunchy-postgres-exporter": "true","pgo-version": "4.6.0","workflowid": "105518a6-57ca-40cd-abca-b6fc250ae247","name": "zzz","pg-cluster": "zzz","deployment-name": "zzz"
}
},
"spec": {
"securityContext": {},
"serviceAccountName": "pgo-pg",
"containers": [
{
"name": "database",
"image": "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-12.5-4.6.0",
"readinessProbe": {
"exec": {
"command": [
"/opt/crunchy/bin/postgres-ha/health/pgha-readiness.sh"
]
},
"initialDelaySeconds": 15
},
"livenessProbe": {
"exec": {
"command": [
"/opt/crunchy/bin/postgres-ha/health/pgha-liveness.sh"
]
},
"initialDelaySeconds": 30,
"periodSeconds": 15,
"timeoutSeconds": 10
},
"resources": {
"limits": {
"cpu": "10",
"memory": "1248Mi"
},
"requests": {
"cpu": "10",
"memory": "1248Mi"
}
},
"env": [{
"name": "MODE",
"value": "postgres"
},
{
"name": "PGHA_PG_PORT",
"value": "5432"
}, {
"name": "PGHA_USER",
"value": "postgres"
},
{
"name": "PGHA_INIT",
"valueFrom": {
"configMapKeyRef": {
"name": "zzz-pgha-config",
"key": "init"
}
}
},
{
"name": "PATRONI_POSTGRESQL_DATA_DIR",
"value": "/pgdata/zzz"
},
{
"name": "PGBACKREST_STANZA",
"value": "db"
},
{
"name": "PGBACKREST_REPO1_HOST",
"value": "zzz-backrest-shared-repo"
},
{
"name": "BACKREST_SKIP_CREATE_STANZA",
"value": "true"
},
{
"name": "PGHA_PGBACKREST",
"value": "true"
},
{
"name": "PGBACKREST_REPO1_PATH",
"value": "/backrestrepo/zzz-backrest-shared-repo"
},
{
"name": "PGBACKREST_DB_PATH",
"value": "/pgdata/zzz"
},
{
"name": "ENABLE_SSHD",
"value": "true"
},
{
"name": "PGBACKREST_LOG_PATH",
"value": "/tmp"
},
{
"name": "PGBACKREST_PG1_SOCKET_PATH",
"value": "/tmp"
},
{
"name": "PGBACKREST_PG1_PORT",
"value": "5432"
},
{
"name": "PGBACKREST_REPO1_TYPE",
"value": "posix"
},
{
"name": "PGHA_PGBACKREST_LOCAL_S3_STORAGE",
"value": "false"
},
{
"name": "PGMONITOR_PASSWORD",
"valueFrom": {
"secretKeyRef": {
"name": "zzz-exporter-secret",
"key": "password"
}
}
},
{
"name": "PGHA_DATABASE",
"value": "pgo"
}, {
"name": "PGHA_REPLICA_REINIT_ON_START_FAIL",
"value": "true"
}, {
"name": "PGHA_SYNC_REPLICATION",
"value": "false"
}, {
"name": "PGHA_TLS_ENABLED",
"value": "false"
}, {
"name": "PGHA_TLS_ONLY",
"value": "false"
}, {
"name": "PGHA_STANDBY",
"value": "false"
}, {
"name": "PATRONI_KUBERNETES_NAMESPACE",
"valueFrom": {
"fieldRef": {
"fieldPath": "metadata.namespace"
}
}
}, {
"name": "PATRONI_KUBERNETES_SCOPE_LABEL",
"value": "crunchy-pgha-scope"
2021/02/03 10:57:14 INF 8 (localhost:4150) connecting to nsqd
}, {
"name": "PATRONI_SCOPE",
"valueFrom": {
"fieldRef": {
"fieldPath": "metadata.labels['crunchy-pgha-scope']"
}
}
}, {
"name": "PATRONI_KUBERNETES_LABELS",
"value": "{vendor: \"crunchydata\"}"
}, {
"name": "PATRONI_LOG_LEVEL",
"value": "INFO"
}, {
"name": "PGHOST",
"value": "/tmp"
}],
"volumeMounts": [{
"mountPath": "/pgdata",
"name": "pgdata",
"readOnly": false
}, {
"mountPath": "/pgconf/pguser",
"name": "user-volume"
}, {
"mountPath": "/pgconf/pgreplicator",
"name": "primary-volume"
}, {
"mountPath": "/pgconf/pgsuper",
"name": "root-volume"
},
{
"mountPath": "/sshd",
"name": "sshd",
"readOnly": true
}, {
"mountPath": "/pgconf",
"name": "pgconf-volume"
},
{
"mountPath": "/dev/shm",
"name": "dshm"
},
{
"mountPath": "/etc/pgbackrest/conf.d",
"name": "pgbackrest-config"
},
{
"mountPath": "/etc/podinfo",
"name": "podinfo"
}
],
"ports": [{
"containerPort": 5432,
"protocol": "TCP"
}, {
"containerPort": 8009,
"protocol": "TCP"
}],
"imagePullPolicy": "IfNotPresent"
}
,{
"name": "exporter",
"image": "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:centos8-4.6.0",
"ports": [{
"containerPort": 9187,
"protocol": "TCP"
}],
"resources": {
"requests": {
"memory": "24Mi"
}
},
"env": [
{
"name": "EXPORTER_PG_HOST",
"value": "127.0.0.1"
},
{
"name": "EXPORTER_PG_PORT",
"value": "5432"
},
{
"name": "EXPORTER_PG_DATABASE",
"value": "postgres"
},
{
"name": "EXPORTER_PG_PARAMS",
"value": "sslmode=disable"
},
{
"name": "JOB_NAME",
"value": "zzz"
},
{
"name": "POSTGRES_EXPORTER_PORT",
"value": "9187"
},
{
"name": "EXPORTER_PG_USER",
"valueFrom": {
"secretKeyRef": {
"name": "zzz-exporter-secret",
"key": "username"
}
}
},
{
"name": "EXPORTER_PG_PASSWORD",
"valueFrom": {
"secretKeyRef": {
"name": "zzz-exporter-secret",
"key": "password"
}
}
}
]
}
],
"volumes": [{
"name": "pgdata",
"persistentVolumeClaim":{"claimName":"zzz"}
}, {
"name": "user-volume",
"secret": {
"secretName": "zzz-testuser-secret"
}
}, {
"name": "primary-volume",
"secret": {
"secretName": "zzz-primaryuser-secret"
}
}, {
"name": "sshd",
"secret": {
"secretName": "zzz-backrest-repo-config"
}
}, {
"name": "root-volume",
"secret": {
"secretName": "zzz-postgres-secret"
}
},
{
"name": "report",
"emptyDir": {
"medium": "Memory",
"sizeLimit": "64Mi"
}
},
{
"name": "dshm",
"emptyDir": {
"medium": "Memory"
}
},
{
"name": "pgbackrest-config",
"projected": { "sources": [] }
},
{
"name": "pgconf-volume",
"projected": {
"sources": [
{
"configMap": {
"name": "zzz-pgha-config",
"optional": true
}
}
]
}
},
{
"name": "podinfo",
"downwardAPI": {
"defaultMode": 420,
"items": [
{
"path": "cpu_limit",
"resourceFieldRef": {
"containerName": "database",
"divisor": "1m",
"resource": "limits.cpu"
}
},
{
"path": "cpu_request",
"resourceFieldRef": {
"containerName": "database",
"divisor": "1m",
"resource": "requests.cpu"
}
},
{
"path": "mem_limit",
"resourceFieldRef": {
"containerName": "database",
"resource": "limits.memory"
}
},
{
"path": "mem_request",
"resourceFieldRef": {
"containerName": "database",
"resource": "requests.memory"
}
},
{
"fieldRef": {
"apiVersion": "v1",
"fieldPath": "metadata.labels"
},
"path": "labels"
},
{
"fieldRef": {
"apiVersion": "v1",
"fieldPath": "metadata.annotations"
},
"path": "annotations"
}
]
}
}
],
"affinity": {
"nodeAffinity": {
"preferredDuringSchedulingIgnoredDuringExecution": [
{
"weight": 10,
"preference": {
"matchExpressions": [
{
"key": "nodetype",
"operator": "In",
"values": [
"production"
]
}
]
}
}
]
}
,
"podAntiAffinity": {
"preferredDuringSchedulingIgnoredDuringExecution": [
{
"weight": 1,
"podAffinityTerm": {
"labelSelector": {
"matchExpressions": [
{
"key": "vendor",
"operator": "In",
"values": [
"crunchydata"
]
},
{
"key": "pg-pod-anti-affinity",
"operator": "Exists"
},
{
"key": "pg-cluster",
"operator": "In",
"values": [
"zzz"
]
}
]
},
"topologyKey": "kubernetes.io/hostname"
}
}
]
}
},
"restartPolicy": "Always",
"dnsPolicy": "ClusterFirst"
}
},
"strategy": {
"type": "RollingUpdate",
"rollingUpdate": {
"maxUnavailable": 1,
"maxSurge": 1
}
}
}
}
time="2021-02-03T10:57:14Z" level=debug msg="publishing events.EventCreateClusterFailureFormat message Event Event CreateClusterFailure - ns [pgo] - user [admin] topics [[clustertopic]] timestamp [2021-02-03 10:57:14.340061022 +0000 UTC m=+549.943008142] - (create cluster failure) clustername zzz workflow 105518a6-57ca-40cd-abca-b6fc250ae247 error invalid character '{' looking for beginning of object key string" func="pkg/events.Publish()" file="pkg/events/eventing.go:48" version=4.6.0
time="2021-02-03T10:57:14Z" level=debug msg="header Event CreateClusterFailure - ns [pgo] - user [admin] topics [[clustertopic]] timestamp [2021-02-03 10:57:14.340061022 +0000 UTC m=+549.943008142] " func="pkg/events.Publish()" file="pkg/events/eventing.go:49" version=4.6.0
time="2021-02-03T10:57:14Z" level=debug msg="{\n \"eventheader\": {\n \"EventType\": \"CreateClusterFailure\",\n \"namespace\": \"pgo\",\n \"username\": \"admin\",\n \"timestamp\": \"2021-02-03T10:57:14.340061022Z\",\n \"topic\": [\n \"clustertopic\"\n ]\n },\n \"clustername\": \"zzz\",\n \"errormessage\": \"invalid character '{' looking for beginning of object key string\",\n \"workflowid\": \"105518a6-57ca-40cd-abca-b6fc250ae247\"\n}" func="pkg/events.Publish()" file="pkg/events/eventing.go:59" version=4.6.0
time="2021-02-03T10:57:14Z" level=error msg="pgtask Controller: invalid character '{' looking for beginning of object key string" func="internal/controller/pgtask.(*Controller).handleBackrestRestore()" file="internal/controller/pgtask/backresthandler.go:54" version=4.6.0
I made another test, and restore works if I do not create the cluster with --replica-count=0; I had thought that was needed to avoid a replica. There's definitely a bug in the operator triggered by that argument being zero, but the issue is easily worked around by not using it.
The Operator does not create a replica by default unless you have set Replicas in the pgo.yaml file or your configuration to a nonzero value.
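For reference, that maps to the Cluster section of pgo.yaml; a minimal sketch, assuming the 4.6 layout:

Cluster:
  Replicas: "0"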
For argument's sake, I took 4.6.0 on a Kube 1.19 cluster using the quickstart and ran through these scenarios:
pgo create cluster hippo --replica-count=1
# wait for provisioning
pgo restore hippo --backup-opts="--set=20210203-143236F" --no-prompt
This correctly restored and came back up with one replica.
pgo create cluster rhino --replica-count=0
# wait for provisioning
pgo restore rhino --backup-opts="--set=20210203-143408F" --no-prompt
This correctly restored and came back with zero replicas.
pgo create cluster zebra
# wait for provisioning
# a cluster with zero replicas came up
pgo restore zebra --backup-opts="--set=20210203-144716F" --no-prompt
This correctly restored and came back with zero replicas.
It does seem like there are some customizations in your environment. Or are you running in an environment that has multiple Operators installed?
Just using one operator in one namespace with the values described previously. It only fails when I set --replica-count=0, and the default is db_replicas: "0". It does not seem related to pgBackRest being local or S3. I'm using Kubernetes 1.20.
I can also confirm this issue. Having upgraded from pgo 4.5.0 to 4.6.1, I have the same problem when attempting a pgo restore:
Nothing custom in the config; the ConfigMap is generated by the crunchydata Ansible roles.
operator time="2021-03-03T14:19:04Z" level=error msg="pgtask Controller: invalid character '{' looking for beginning of object key string" func="internal/controller/pgtask.(*Controller).handleBackrestRestore()" file="internal/controller/pgtask/backresthandler.go:54" version=4.6.1
Can you run with the CRUNCHY_DEBUG environment variable set to "true" in the postgres-operator deployment and show the output of the generated JSON (with any sensitive information redacted)?
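If it is not already enabled, one way to flip it on (assuming the default pgo namespace and deployment name):

kubectl -n pgo set env deployment/postgres-operator CRUNCHY_DEBUG=true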
Namespace: pgo-new

A note here: it failed to create the backrest repo directory in the form cluster-name-backrest-shared-repo. I had to create it manually, or it would not create a stanza or anything else. This is another separate bug. Previous versions created this automatically.

pgo -n pgo-new restore pgo-new-x-postgres --no-prompt
time="2021-03-04T20:52:57Z" level=debug msg="task putting key in queue pgo-new/pgo-new-x-postgres-pgbackrestrestore" func="internal/controller/pgtask.(*Controller).onAdd()" file="internal/controller/pgtask/pgtaskcontroller.go:184" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="working on pgo-new/pgo-new-x-postgres-pgbackrestrestore" func="internal/controller/pgtask.(*Controller).processNextItem()" file="internal/controller/pgtask/pgtaskcontroller.go:78" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="queue got key ns=[pgo-new] resource=[pgo-new-x-postgres-pgbackrestrestore]" func="internal/controller/pgtask.(*Controller).processNextItem()" file="internal/controller/pgtask/pgtaskcontroller.go:83" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="workflow task added [pgo-new-x-postgres-pgbackrestrestore] ID [988537af-d772-440f-982c-6c14c8b5b81a]" func="internal/controller/pgtask.(*Controller).processNextItem()" file="internal/controller/pgtask/pgtaskcontroller.go:161" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="task putting key in queue pgo-new/backrest-restore-pgo-new-x-postgres" func="internal/controller/pgtask.(*Controller).onAdd()" file="internal/controller/pgtask/pgtaskcontroller.go:184" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="working on pgo-new/backrest-restore-pgo-new-x-postgres" func="internal/controller/pgtask.(*Controller).processNextItem()" file="internal/controller/pgtask/pgtaskcontroller.go:78" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="queue got key ns=[pgo-new] resource=[backrest-restore-pgo-new-x-postgres]" func="internal/controller/pgtask.(*Controller).processNextItem()" file="internal/controller/pgtask/pgtaskcontroller.go:83" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="backrest restore task added" func="internal/controller/pgtask.(*Controller).processNextItem()" file="internal/controller/pgtask/pgtaskcontroller.go:151" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="restore workflow: started for cluster pgo-new-x-postgres" func="internal/operator/backrest.PrepareClusterForRestore()" file="internal/operator/backrest/restore.go:123" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="patching cluster pgo-new-x-postgres: {\"metadata\":{\"annotations\":{\"pgo-backrest-restore\":\"\"},\"labels\":{\"deployment-name\":\"pgo-new-x-postgres\"}},\"spec\":{\"status\":\"\"},\"status\":{\"state\":\"pgcluster Restoring\",\"message\":\"Cluster is being restored\"}}" func="internal/operator/backrest.PrepareClusterForRestore()" file="internal/operator/backrest/restore.go:140" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="restore workflow: patched pgcluster pgo-new-x-postgres for restore" func="internal/operator/backrest.PrepareClusterForRestore()" file="internal/operator/backrest/restore.go:148" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="patching replica pgo-new-x-postgres-abae: {\"metadata\":{\"annotations\":{\"pgo-pgha-bootstrap-replica\":null}},\"spec\":{\"status\":\"\"},\"status\":{\"state\":\"pgcluster Restoring\",\"message\":\"Cluster is being restored\"}}" func="internal/operator/backrest.PrepareClusterForRestore()" file="internal/operator/backrest/restore.go:171" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="restore workflow: patched replicas in cluster pgo-new-x-postgres for restore" func="internal/operator/backrest.PrepareClusterForRestore()" file="internal/operator/backrest/restore.go:178" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="restore workflow: deleted primary and replica deployments for cluster pgo-new-x-postgres" func="internal/operator/backrest.PrepareClusterForRestore()" file="internal/operator/backrest/restore.go:197" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="restore workflow: finished waiting for primary and replica deployments for cluster pgo-new-x-postgres to be removed" func="internal/operator/backrest.PrepareClusterForRestore()" file="internal/operator/backrest/restore.go:213" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="restore workflow: deleted all existing jobs for cluster pgo-new-x-postgres" func="internal/operator/backrest.PrepareClusterForRestore()" file="internal/operator/backrest/restore.go:225" version=4.6.1
time="2021-03-04T20:52:57Z" level=debug msg="restore workflow: found PVCs [] for cluster pgo-new-x-postgres" func="internal/operator/backrest.PrepareClusterForRestore()" file="internal/operator/backrest/restore.go:233" version=4.6.1
time="2021-03-04T20:52:58Z" level=debug msg="restore workflow: finished waiting for PVCs for cluster pgo-new-x-postgres to be removed" func="internal/operator/backrest.PrepareClusterForRestore()" file="internal/operator/backrest/restore.go:259" version=4.6.1
time="2021-03-04T20:52:58Z" level=debug msg="restore workflow: deleted 'config' and 'leader' ConfigMaps for cluster pgo-new-x-postgres" func="internal/operator/backrest.PrepareClusterForRestore()" file="internal/operator/backrest/restore.go:273" version=4.6.1
time="2021-03-04T20:52:58Z" level=debug msg="patching configmap pgo-new-x-postgres-pgha-config: {\"data\":{\"init\":\"true\"}}" func="internal/operator/backrest.PrepareClusterForRestore()" file="internal/operator/backrest/restore.go:279" version=4.6.1
time="2021-03-04T20:52:58Z" level=debug msg="restore workflow: set 'init' flag to 'true' for cluster pgo-new-x-postgres" func="internal/operator/backrest.PrepareClusterForRestore()" file="internal/operator/backrest/restore.go:286" version=4.6.1
time="2021-03-04T20:52:58Z" level=debug msg="pgtask Controller: finished preparing cluster pgo-new-x-postgres for restore" func="internal/controller/pgtask.(*Controller).handleBackrestRestore()" file="internal/controller/pgtask/backresthandler.go:48" version=4.6.1
time="2021-03-04T20:52:58Z" level=debug msg="pgtask Controller: finished updating pgo-new-x-postgres spec for restore" func="internal/controller/pgtask.(*Controller).handleBackrestRestore()" file="internal/controller/pgtask/backresthandler.go:51" version=4.6.1
time="2021-03-04T20:52:58Z" level=info msg="found existing pgha ConfigMap for cluster pgo-new-x-postgres, setting init flag to 'true'" func="internal/operator/cluster.AddClusterBootstrap()" file="internal/operator/cluster/cluster.go:270" version=4.6.1
time="2021-03-04T20:52:58Z" level=debug msg="updating init value to true in the pgha configMap for cluster pgo-new-x-postgres" func="internal/operator.UpdatePGHAConfigInitFlag()" file="internal/operator/clusterutilities.go:920" version=4.6.1
time="2021-03-04T20:52:58Z" level=debug msg="in createPVC" func="internal/operator/pvc.Create()" file="internal/operator/pvc/pvc.go:138" version=4.6.1
time="2021-03-04T20:52:58Z" level=debug msg="using dynamic PVC template" func="internal/operator/pvc.Create()" file="internal/operator/pvc/pvc.go:152" version=4.6.1
{
"kind": "PersistentVolumeClaim",
"apiVersion": "v1",
"metadata": {
"name": "pgo-new-x-postgres",
"labels": {
"vendor": "crunchydata",
"pgremove": "true",
"pg-cluster": "pgo-new-x-postgres"
}
},
"spec": {
"accessModes": [
"ReadWriteOnce"
],
"storageClassName": "pgo-x-local-provisioner",
"resources": {
"requests": {
"storage": "500Mi"
}
}
}
}
time="2021-03-04T20:52:58Z" level=info msg="creating Pgcluster pgo-new-x-postgres in namespace pgo-new" func="internal/operator/cluster.getClusterDeploymentFields()" file="internal/operator/cluster/clusterlogic.go:264" version=4.6.1
time="2021-03-04T20:52:58Z" level=debug msg="creating exporter secret for cluster pgo-new-x-postgres" func="internal/operator/cluster.getClusterDeploymentFields()" file="internal/operator/cluster/clusterlogic.go:290" version=4.6.1
time="2021-03-04T20:52:58Z" level=info msg="exporter secret pgo-new-x-postgres-exporter-secret already present, will reuse" func="internal/operator/cluster.CreateExporterSecret()" file="internal/operator/cluster/exporter.go:145" version=4.6.1
time="2021-03-04T20:52:58Z" level=debug msg="[123 34 102 115 71 114 111 117 112 34 58 50 54 125]" func="internal/operator.GetPodSecurityContext()" file="internal/operator/common.go:188" version=4.6.1
time="2021-03-04T20:52:58Z" level=debug msg="GetPodAnitAffinity with clusterName=[pgo-new-x-postgres]" func="internal/operator.GetPodAntiAffinity()" file="internal/operator/clusterutilities.go:645" version=4.6.1
"podAntiAffinity": {
"preferredDuringSchedulingIgnoredDuringExecution": [
{
"weight": 1,
"podAffinityTerm": {
"labelSelector": {
"matchExpressions": [
{
"key": "vendor",
"operator": "In",
"values": [
"crunchydata"
]
},
{
"key": "pg-pod-anti-affinity",
"operator": "Exists"
},
{
"key": "pg-cluster",
"operator": "In",
"values": [
"pgo-new-x-postgres"
]
}
]
},
"topologyKey": "kubernetes.io/hostname"
}
}
]
}
"resources": {
"requests": {
"memory": "128Mi"
}
},
time="2021-03-04T20:52:58Z" level=debug msg="pgo-custom-pg-config was not found, skipping global configMap" func="internal/operator.GetConfVolume()" file="internal/operator/clusterutilities.go:411" version=4.6.1
"resources": {
"requests": {
"memory": "24Mi"
}
},
{
"name": "exporter",
"image": "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:centos8-4.6.1",
"ports": [{
"containerPort": 9187,
"protocol": "TCP"
}],
"resources": {
"requests": {
"memory": "24Mi"
}
},
"env": [
{
"name": "EXPORTER_PG_HOST",
"value": "127.0.0.1"
},
{
"name": "EXPORTER_PG_PORT",
"value": "5432"
},
{
"name": "EXPORTER_PG_DATABASE",
"value": "postgres"
},
{
"name": "EXPORTER_PG_PARAMS",
"value": "sslmode=disable"
},
{
"name": "JOB_NAME",
"value": "pgo-new-x-postgres"
},
{
"name": "POSTGRES_EXPORTER_PORT",
"value": "9187"
},
{
"name": "EXPORTER_PG_USER",
"valueFrom": {
"secretKeyRef": {
"name": "pgo-new-x-postgres-exporter-secret",
"key": "username"
}
}
},
{
"name": "EXPORTER_PG_PASSWORD",
"valueFrom": {
"secretKeyRef": {
"name": "pgo-new-x-postgres-exporter-secret",
"key": "password"
}
}
}
]
}
{
"apiVersion": "batch/v1",
"kind": "Job",
"metadata": {
"name": "pgo-new-x-postgres-bootstrap",
"labels": {
"vendor": "crunchydata",
"pgo-backrest-job": "true",
"pgha-bootstrap": "pgo-new-x-postgres",
"name": "pgo-new-x-postgres","pg-cluster": "pgo-new-x-postgres","deployment-name": "pgo-new-x-postgres","pgouser": "admin","crunchy-pgha-scope": "pgo-new-x-postgres","pgo-version": "4.6.1","release": "x","workflowid": "84eca6fd-f513-4f31-80bd-b94a3d4098e1","crunchy-postgres-exporter": "true"
}
},
"spec": {
"template": {
"metadata": {
"labels": {
"name": "pgo-new-x-postgres-bootstrap",
"vendor": "crunchydata",
"pgha-bootstrap": "pgo-new-x-postgres",
"deployment-name": "pgo-new-x-postgres","pgouser": "admin","crunchy-pgha-scope": "pgo-new-x-postgres","pgo-version": "4.6.1","release": "x","name": "pgo-new-x-postgres","pg-cluster": "pgo-new-x-postgres","workflowid": "84eca6fd-f513-4f31-80bd-b94a3d4098e1","crunchy-postgres-exporter": "true"
}
},
"spec": {
"securityContext": {"fsGroup":26},
"serviceAccountName": "pgo-pg",
"containers": [{
"name": "database",
"image": "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-ha:centos8-12.6-4.6.1",
"resources": {
"requests": {
"memory": "128Mi"
}
},
"env": [{
"name": "PGHA_PG_PORT",
"value": "5432"
}, {
"name": "PGHA_USER",
"value": "postgres"
},
{
"name": "PGHA_INIT",
"value": "true"
},
{
"name": "PGHA_BOOTSTRAP_METHOD",
"value": "pgbackrest_init"
},
{
"name": "PATRONI_POSTGRESQL_DATA_DIR",
"value": "/pgdata/pgo-new-x-postgres"
},
{
"name": "PGBACKREST_STANZA",
"value": "db"
},
{
"name": "PGBACKREST_REPO1_HOST",
"value": "pgo-new-x-postgres-backrest-shared-repo"
},
{
"name": "BACKREST_SKIP_CREATE_STANZA",
"value": "true"
},
{
"name": "PGHA_PGBACKREST",
"value": "true"
},
{
"name": "PGBACKREST_REPO1_PATH",
"value": "/backrestrepo/pgo-new-x-postgres-backrest-shared-repo"
},
{
"name": "PGBACKREST_DB_PATH",
"value": "/pgdata/pgo-new-x-postgres"
},
{
"name": "ENABLE_SSHD",
"value": "true"
},
{
"name": "PGBACKREST_LOG_PATH",
"value": "/tmp"
},
{
"name": "PGBACKREST_PG1_SOCKET_PATH",
"value": "/tmp"
},
{
"name": "PGBACKREST_PG1_PORT",
"value": "5432"
},
{
"name": "PGBACKREST_REPO1_TYPE",
"value": "posix"
},
{
"name": "PGHA_PGBACKREST_LOCAL_S3_STORAGE",
"value": "false"
},
{
"name": "PGHA_DATABASE",
"value": "postgres"
}, {
"name": "PGHA_REPLICA_REINIT_ON_START_FAIL",
"value": "true"
}, {
"name": "PGHA_SYNC_REPLICATION",
"value": "false"
}, {
"name": "PGHA_TLS_ENABLED",
"value": "false"
}, {
"name": "PGHA_TLS_ONLY",
"value": "false"
}, {
"name": "PGHA_STANDBY",
"value": "false"
}, {
"name": "PATRONI_KUBERNETES_NAMESPACE",
"valueFrom": {
"fieldRef": {
"fieldPath": "metadata.namespace"
}
}
}, {
"name": "PATRONI_KUBERNETES_SCOPE_LABEL",
"value": "crunchy-pgha-scope"
}, {
"name": "PATRONI_SCOPE",
"valueFrom": {
"fieldRef": {
"fieldPath": "metadata.labels['crunchy-pgha-scope']"
}
}
}, {
"name": "PATRONI_KUBERNETES_LABELS",
"value": "{vendor: \"crunchydata\"}"
}, {
"name": "PATRONI_LOG_LEVEL",
"value": "INFO"
}, {
"name": "PGHOST",
"value": "/tmp"
}, {
"name": "RESTORE_OPTS",
"value": " --repo-type=posix"
}],
"volumeMounts": [{
"mountPath": "/pgdata",
"name": "pgdata"
}, {
"mountPath": "/pgconf/pguser",
"name": "user-volume"
}, {
"mountPath": "/pgconf/pgreplicator",
"name": "primary-volume"
}, {
"mountPath": "/pgconf/pgsuper",
"name": "root-volume"
},
{
"mountPath": "/sshd",
"name": "sshd",
"readOnly": true
}, {
"mountPath": "/pgconf",
"name": "pgconf-volume"
}, {
"mountPath": "/dev/shm",
"name": "dshm"
}, {
"mountPath": "/etc/pgbackrest/conf.d",
"name": "pgbackrest-config"
}
],
"imagePullPolicy": "IfNotPresent"
}],
"volumes": [{
"name": "pgdata",
"persistentVolumeClaim":{"claimName":"pgo-new-x-postgres"}
}, {
"name": "user-volume",
"secret": {
"secretName": "pgo-new-x-postgres-postgres-secret"
}
}, {
"name": "primary-volume",
"secret": {
"secretName": "pgo-new-x-postgres-primaryuser-secret"
}
}, {
"name": "root-volume",
"secret": {
"secretName": "pgo-new-x-postgres-postgres-secret"
}
}, {
"name": "sshd",
"secret": {
"secretName": "pgo-new-x-postgres-backrest-repo-config"
}
},
{
"name": "dshm",
"emptyDir": {
"medium": "Memory"
}
},
{
"name": "pgbackrest-config",
"projected": { "sources": [] }
},
{
"name": "pgconf-volume",
"projected": {
"sources": [
{
"configMap": {
"name": "pgo-new-x-postgres-pgha-config",
"optional": true
}
}
]
}
}
],
"affinity": {
{
"preferredDuringSchedulingIgnoredDuringExecution": [
{
"weight": 10,
"preference": {
"matchExpressions": [
{
"key": "kubernetes.io/hostname",
"operator": "In",
"values": [
"revo-nuc1"
]
}
]
}
}
]
}
,
"podAntiAffinity": {
"preferredDuringSchedulingIgnoredDuringExecution": [
{
"weight": 1,
"podAffinityTerm": {
"labelSelector": {
"matchExpressions": [
{
"key": "vendor",
"operator": "In",
"values": [
"crunchydata"
]
},
{
"key": "pg-pod-anti-affinity",
"operator": "Exists"
},
{
"key": "pg-cluster",
"operator": "In",
"values": [
"pgo-new-x-postgres"
]
}
]
},
"topologyKey": "kubernetes.io/hostname"
}
}
]
}
},
"restartPolicy": "Never"
}
}
}
}
time="2021-03-04T20:52:58Z" level=debug msg="publishing events.EventCreateClusterFailureFormat message Event Event CreateClusterFailure - ns [pgo-new] - user [admin] topics [[clustertopic]] timestamp [2021-03-04 20:52:58.376086248 +0000 UTC m=+359.286420226] - (create cluster failure) clustername pgo-new-x-postgres workflow 84eca6fd-f513-4f31-80bd-b94a3d4098e1 error invalid character '{' looking for beginning of object key string" func="pkg/events.Publish()" file="pkg/events/eventing.go:45" version=4.6.1
time="2021-03-04T20:52:58Z" level=debug msg="header Event CreateClusterFailure - ns [pgo-new] - user [admin] topics [[clustertopic]] timestamp [2021-03-04 20:52:58.376086248 +0000 UTC m=+359.286420226] " func="pkg/events.Publish()" file="pkg/events/eventing.go:46" version=4.6.1
time="2021-03-04T20:52:58Z" level=debug msg="{\n \"eventheader\": {\n \"EventType\": \"CreateClusterFailure\",\n \"namespace\": \"pgo-new\",\n \"username\": \"admin\",\n \"timestamp\": \"2021-03-04T20:52:58.376086248Z\",\n \"topic\": [\n \"clustertopic\"\n ]\n },\n \"clustername\": \"pgo-new-x-postgres\",\n \"errormessage\": \"invalid character '{' looking for beginning of object key string\",\n \"workflowid\": \"84eca6fd-f513-4f31-80bd-b94a3d4098e1\"\n}" func="pkg/events.Publish()" file="pkg/events/eventing.go:56" version=4.6.1
2021/03/04 20:52:58 INF 7 (localhost:4150) connecting to nsqd
time="2021-03-04T20:52:58Z" level=error msg="pgtask Controller: invalid character '{' looking for beginning of object key string" func="internal/controller/pgtask.(*Controller).handleBackrestRestore()" file="internal/controller/pgtask/backresthandler.go:54" version=4.6.1
Thanks. I was able to successfully reproduce this.
The issue is related to the use of node affinity. The bootstrap job was not properly updated to account for a change. This affects only the 4.6 series. I will patch the templates to ensure this is properly accounted for.
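For illustration only (this is not the Operator's code), a minimal Go sketch of why this surfaces as that exact message: the unpatched template renders the node selector as a bare object inside "affinity", and encoding/json expects a key string there:

package main

import (
    "encoding/json"
    "fmt"
)

func main() {
    // The broken template produces "affinity": { { ... } } -- an object
    // opening where a key string is required.
    bad := []byte(`{"affinity": { {"preferredDuringSchedulingIgnoredDuringExecution": []} }}`)
    var spec map[string]interface{}
    // Prints: invalid character '{' looking for beginning of object key string
    fmt.Println(json.Unmarshal(bad, &spec))
}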
If you need an immediate fix, you can do the following:

1. Edit the pgo-config ConfigMap in the same namespace as the Operator.
2. Find the cluster-bootstrap-job.json entry.
3. Make this change:
- {{.NodeSelector}}
+ {{if .NodeSelector}}
+ "nodeAffinity": {{.NodeSelector}}
+ {{ end }}
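To apply it, something like the following should work (assuming the default names; the Operator loads its templates at startup, so a restart is likely needed afterwards):

kubectl -n pgo edit configmap pgo-config
kubectl -n pgo rollout restart deployment postgres-operator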
With regards to:
A note here: it failed to create the backrest repo directory in the form cluster-name-backrest-shared-repo. I had to create it manually, or it would not create a stanza or anything else. This is another separate bug. Previous versions created this automatically.
I was unable to reproduce anything like that. I would need the command you used to create the cluster along with logs that demonstrate a failure.
The fix is merged and will appear in 4.6.2. The fix is effectively what I showed in https://github.com/CrunchyData/postgres-operator/issues/2251#issuecomment-790948496 so that will allow for you to get by in the interim. Thanks for reporting and troubleshooting!
Describe the bug
Create a cluster with no replicas, with S3 storage (minio) for backups. Try to restore to a previous state.
zzz is stopped and removed.
Logs from postgres-operator
After this nothing happens. The only way to recover was to create another cluster in standby and promote it. This allowed me to recover to the state before the restore failure, but I do not know how to return to a previous backup.

To Reproduce
Steps to reproduce the behavior:
1. Start with a clean db
2. Make a db backup
3. Make some db changes, make another backup, and try to restore to before the changes

Expected behavior
The DB should restart with the previous db state.

Please tell us about your environment:
zzz-tune-config: