timescale / helm-charts

Configuration and Documentation to run TimescaleDB in your Kubernetes cluster
Apache License 2.0

[S3 backup not working] #415

Closed. Codestar0609 closed this issue 1 year ago.

Codestar0609 commented 2 years ago


**Describe the bug**
I set up TimescaleDB with S3 backup enabled. However, the backup is not working.

**To Reproduce**
Set up TimescaleDB with the timescaledb-single Helm chart on EKS.
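For reference, a minimal sketch of such a setup, assuming the public Timescale charts repository; the release name and namespace are inferred from the kubectl output further down:

# Hypothetical reproduction: install the timescaledb-single chart with the values below
helm repo add timescale https://charts.timescale.com
helm repo update
helm install nibbl-timescaledb-prod timescale/timescaledb-single \
  --namespace nibbl-timescaledb-prod --create-namespace \
  --values values.yaml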

Deployment

# This file and its contents are licensed under the Apache License 2.0.
# Please see the included NOTICE for copyright information and LICENSE for a copy of the license.

replicaCount: 3

# To prevent very long names, we override the name, otherwise it would default to
# timescaledb-single (the name of the chart)
nameOverride: timescaledb

# The default Patroni name of the cluster ("scope") is derived from the name of the release,
# but you can override this behaviour here
# https://patroni.readthedocs.io/en/latest/SETTINGS.html#global-universal
clusterName:

# The major PostgreSQL version to use, defaults to the default version of the Docker image
# However, in pg_upgrade scenarios, you may need to specify an explicit version
version:

image:
  # Image was built from
  # https://github.com/timescale/timescaledb-docker-ha
  repository: timescale/timescaledb-ha
  tag: pg13.4-ts2.4.2-p0
  pullPolicy: Always

# By default these secrets are randomly generated.
# To prevent misconfiguration, modifications from helm upgrade won't be applied to these secrets.
# As a result, changing secrets cannot be done via helm and needs manual intervention.
secrets:
  # This map should contain environment variables that influence Patroni,
  # for example PATRONI_SUPERUSER_PASSWORD or PATRONI_REPLICATION_PASSWORD
  # https://patroni.readthedocs.io/en/latest/ENVIRONMENT.html#postgresql
  credentials:
    PATRONI_SUPERUSER_PASSWORD: "*****"
    PATRONI_REPLICATION_PASSWORD: "****"
    PATRONI_admin_PASSWORD: "*****"

  # Selector used to provision your own Secret containing patroni configuration details
  # This is mutually exclusive with `credentials` option and takes precedence over it.
  # WARNING: Use this option with caution  
  credentialsSecretName: ""

  # This map should contain the tls key and certificate. When empty,
  # helm will generate a self-signed certificate.
  certificate:
    tls.crt: ""
    tls.key: ""

  # Selector used to provision your own Secret containing certificate details.
  # This is mutually exclusive with `certificate` option and takes precedence over it.
  # WARNING: Use this option with caution
  certificateSecretName: ""

  # This secret should contain environment variables that influence pgBackRest.
  pgbackrest:
    PGBACKREST_REPO1_S3_REGION: ""
    PGBACKREST_REPO1_S3_KEY: ""
    PGBACKREST_REPO1_S3_KEY_SECRET: ""
    PGBACKREST_REPO1_S3_BUCKET: ""
    PGBACKREST_REPO1_S3_ENDPOINT: "s3.amazonaws.com"

  # Selector used to provision your own Secret containing pgbackrest configuration details
  # This is mutually exclusive with `pgbackrest` option and takes precedence over it.
  # WARNING: Use this option with caution
  pgbackrestSecretName: ""

backup:
  enabled: false
  pgBackRest:
    # https://pgbackrest.org/configuration.html
    # Although not impossible, care should be taken not to include secrets
    # in these parameters. Use Kubernetes Secrets to specify S3 Keys, Secrets etc.
    compress-type: lz4
    process-max: 4
    start-fast: "y"
    repo1-retention-diff: 2
    repo1-retention-full: 2
    repo1-type: s3
    repo1-cipher-type: "none"
    repo1-s3-region: us-east-2
    repo1-s3-endpoint: s3.amazonaws.com

  # Overriding the archive-push/archive-get sections is most useful in
  # very high throughput situations. Look at values/high_throughput_example.yaml for more details
  pgBackRest:archive-push: {}
  pgBackRest:archive-get: {}
  jobs:
      # name: needs to adhere to the kubernetes restrictions
      # type: can be full, incr or diff, see https://pgbackrest.org/user-guide.html
      # schedule: https://en.wikipedia.org/wiki/Cron#CRON_expression
    - name: full-weekly
      type: full
      schedule: "12 02 * * 0"
    - name: incremental-daily
      type: incr
      schedule: "12 02 * * 1-6"
  # Extra custom environment variables for the backup container.
  envFrom:
  # - secretRef:
  #     name: extra-pgbackrest-secrets

  # Alternatively, you can expose individual environment variables:
  # https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#envvar-v1-core
  # Although not impossible, care should be taken not to include secrets
  # in these parameters. Use Kubernetes Secrets to specify S3 Keys, Secrets etc.
  env:
  # - name: PGBACKREST_REPO1_S3_BUCKET
  #   value: my_example_s3_bucket_for_backups
  # - name: PGBACKREST_REPO1_S3_KEY_SECRET
  #   valueFrom:
  #     secretKeyRef:
  #       name: pgbackrest-dev-secrets
  #       key: repo1-s3-key-secret

# When creating a *new* deployment, the default is to initialize (using initdb) the database.
# If however, you want to initialize the database using an existing backup, you can do so by
# configuring this section.
#
# WARNING: You *should not* run 2 identically named deployments in separate Kubernetes
#          clusters using the same S3 bucket for backups.
bootstrapFromBackup:
  enabled: false
  # Setting the s3 path is mandatory to avoid overwriting an already existing backup,
  # and to be sure the restore is explicitly the one requested.
  repo1-path:
  # Here you can (optionally) provide a Secret to configure the restore process further.
  # For example, if you need to specify a different restore bucket, you should set
  # PGBACKREST_REPO1_S3_BUCKET: <base64 encoded value of the bucket> in these secrets
  secretName: pgbackrest-bootstrap

# Extra custom environment variables.
# These should be an EnvVar, as this allows you to inject secrets into the environment
# https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#envvar-v1-core
env:
#  - name: NOT_A_SECRET
#    value: "test"
#  - name: MYAPPLICATION_STANDBY_PASSWORDS
#    valueFrom:
#      secretKeyRef:
#        name: myapplication-passwords
#        key: standby

# Externally created Kubernetes secrets will be injected into the pods by referencing them here. You
# can also add more configuration options and secrets this way (see https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#configure-all-key-value-pairs-in-a-configmap-as-container-environment-variables)
envFrom:
#  - configMapRef:
#      name: my-deployment-settings
#      optional: true

# This configuration will be passed on to Patroni directly, there are a few things that are
# injected/changed, these are:
#   - archive_command will be set to /bin/true if backup is disabled
#   - any context sensitive parameter (scope, namespace, name) will be overridden by the Kubernetes context
# https://patroni.readthedocs.io/en/latest/SETTINGS.html#settings
patroni:
  log:
    level: WARNING
  # https://patroni.readthedocs.io/en/latest/replica_bootstrap.html#bootstrap
  bootstrap:
    method: restore_or_initdb
    restore_or_initdb:
      command: >
        /etc/timescaledb/scripts/restore_or_initdb.sh
        --encoding=UTF8
        --locale=C.UTF-8
      keep_existing_recovery_conf: true
    post_init: /etc/timescaledb/scripts/post_init.sh
    dcs:
      loop_wait: 10
      maximum_lag_on_failover: 33554432
      postgresql:
        parameters:
          archive_command: "/etc/timescaledb/scripts/pgbackrest_archive.sh %p"
          archive_mode: 'on'
          archive_timeout: 1800s
          #
          # Autovacuuming is very important to PostgreSQL. For TimescaleDB, in
          # most use cases the vacuuming part is of less importance (there are no deleted tuples to prune),
          # however, the autoanalyze bit (updating the statistics of the chunks) is important to help
          # in planning queries. Therefore we do some tuning of autovacuum to address these
          # TimescaleDB specific concerns.
          # We'd rather have autovacuum do things early, as this increases the chances that autovacuum
          # will find the buffers it needs in shared_buffers, instead of having to fetch them from disk.
          #
          autovacuum_analyze_scale_factor: 0.02
          # This allows us to auto-analyze at most 120 (pretty much empty) chunks every 5 seconds
          # This will ensure that we can have up-to-date statistics on inserts very, very quickly
          autovacuum_naptime: 5s
          autovacuum_max_workers: 10
          # We don't want vacuum work to be building up, therefore we increase
          # the cost limit so that the work to be done for vacuum will be done quickly.
          autovacuum_vacuum_cost_limit: 500
          autovacuum_vacuum_scale_factor: 0.05
          log_autovacuum_min_duration: 1min
          hot_standby: 'on'
          log_checkpoints: 'on'
          log_connections: 'on'
          log_disconnections: 'on'
          log_line_prefix: "%t [%p]: [%c-%l] %u@%d,app=%a [%e] "
          log_lock_waits: 'on'
          log_min_duration_statement: '1s'
          log_statement: ddl
          max_connections: 100
          max_prepared_transactions: 150
          shared_preload_libraries: timescaledb,pg_stat_statements
          ssl: 'on'
          ssl_cert_file: '/etc/certificate/tls.crt'
          ssl_key_file: '/etc/certificate/tls.key'
          tcp_keepalives_idle: 900
          tcp_keepalives_interval: 100
          temp_file_limit: 1GB
          timescaledb.passfile: '../.pgpass'
          unix_socket_directories: "/var/run/postgresql"
          unix_socket_permissions: '0750'
          wal_level: hot_standby
          wal_log_hints: 'on'
        use_pg_rewind: true
        use_slots: true
      retry_timeout: 10
      ttl: 30
  kubernetes:
    role_label: role
    scope_label: cluster-name
    use_endpoints: true
  postgresql:
    create_replica_methods:
    - pgbackrest
    - basebackup
    pgbackrest:
      command: /etc/timescaledb/scripts/pgbackrest_restore.sh
      keep_data: true
      no_params: true
      no_master: true
    basebackup:
    - waldir: "/var/lib/postgresql/wal/pg_wal"
    recovery_conf:
      restore_command: /etc/timescaledb/scripts/pgbackrest_archive_get.sh %f "%p"
    callbacks:
      on_role_change: /etc/timescaledb/scripts/patroni_callback.sh
      on_start: /etc/timescaledb/scripts/patroni_callback.sh
      on_reload: /etc/timescaledb/scripts/patroni_callback.sh
      on_restart: /etc/timescaledb/scripts/patroni_callback.sh
      on_stop: /etc/timescaledb/scripts/patroni_callback.sh
    authentication:
      replication:
        username: standby
      superuser:
        username: postgres
    listen: 0.0.0.0:5432
    pg_hba:
    - local     all             postgres                              peer
    - local     all             all                                   md5
    - hostnossl all,replication all                all                md5
    - hostssl   all             all                127.0.0.1/32       md5
    - hostssl   all             all                ::1/128            md5
    - hostssl   replication     standby            all                md5
    - hostssl   all             all                all                md5
    use_unix_socket: true
  restapi:
    listen: 0.0.0.0:8008

callbacks:
  # If set, this configMap will be used for the Patroni callbacks.
  configMap:  # example-patroni-callbacks

postInit:
  # A list of sources, that contain post init scripts.
  # https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/#projectedvolumesource-v1-core
  # These scripts are all projected to the same directory and will be executed
  # in sorted order only once: After a cluster initialization
  # Some examples:
  - configMap:
      name: custom-init-scripts
      optional: true
  - secret:
      name: custom-secret-scripts
      optional: true

# Values for defining the primary & replica Kubernetes Services.
service:
  primary:
    # One of (ClusterIP | LoadBalancer | NodePort). Headless services are not supported.
    type: ClusterIP
    # The port used by the service.
    port: 5432
    # Optional NodePort, only used for type `NodePort`.
    nodePort: null
    # Additional labels to be added to the Service.
    labels: {}
    # Additional annotations to be added to the Service.
    annotations: {}
    # Define extra fields to be interpolated into the Service spec.
    #
    # This allows for adding support for new features and functionality which may not yet
    # be directly supported in this chart.
    spec: {}
    # loadBalancerSourceRanges:
    # - "0.0.0.0/0"

  replica:
    # One of (ClusterIP | LoadBalancer | NodePort). Headless services are not supported.
    type: ClusterIP
    # The port used by the service.
    port: 5432
    # Optional NodePort, only used for type `NodePort`.
    nodePort: null
    # Additional labels to be added to the Service.
    labels: {}
    # Additional annotations to be added to the Service.
    annotations: {}
    # Define extra fields to be interpolated into the Service spec.
    #
    # This allows for adding support for new features and functionality which may not yet
    # be directly supported in this chart.
    spec: {}
    # loadBalancerSourceRanges:
    # - "0.0.0.0/0"

# DEPRECATED(0.10.0): use the `service.primary` values instead.
loadBalancer:
  # If not enabled, we still expose the primary using a so-called Headless Service
  # https://kubernetes.io/docs/concepts/services-networking/service/#headless-services
  enabled: true
  port: 5432
  # Read more about the AWS annotations here:
  # https://kubernetes.io/docs/concepts/cluster-administration/cloud-providers/#aws
  # https://docs.aws.amazon.com/eks/latest/userguide/load-balancing.html
  annotations:
    # Setting idle-timeout to the maximum allowed value, as in general
    # database connections are long lived
    service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "4000"

    # service.beta.kubernetes.io/aws-load-balancer-type: nlb            # Use an NLB instead of ELB
    # service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0  # Internal Load Balancer
  # Define extra things that should go in the service spec
  # https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.17/#servicespec-v1-core
  spec:
  # loadBalancerSourceRanges:
  # - "0.0.0.0/0"

replicaLoadBalancer:
  # If not enabled, we still expose the replicas using a so-called Headless Service
  # https://kubernetes.io/docs/concepts/services-networking/service/#headless-services
  enabled: false
  port: 5432
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "4000"
  # Define extra things that should go in the service spec
  # https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.17/#servicespec-v1-core
  spec:
  # loadBalancerSourceRanges:
  # - "0.0.0.0/0"

readinessProbe:
  enabled: true
  initialDelaySeconds: 5
  periodSeconds: 30
  timeoutSeconds: 5
  failureThreshold: 6
  successThreshold: 1

persistentVolumes:
  # For sanity reasons, the actual PGDATA and WAL directories will be subdirectories of the Volume mounts;
  # this allows Patroni/a human/an automated operator to move directories during bootstrap, which could not
  # be done if we did not use subdirectories
  # https://www.postgresql.org/docs/current/creating-cluster.html#CREATING-CLUSTER-MOUNT-POINTS
  data:
    enabled: true
    size: 2Gi
    ## database data Persistent Volume Storage Class
    ## If defined, storageClassName: <storageClass>
    ## If set to "-", storageClassName: "", which disables dynamic provisioning
    ## If undefined (the default) or set to null, no storageClassName spec is
    ##   set, choosing the default provisioner.  (gp2 on AWS, standard on
    ##   GKE, AWS & OpenStack)
    ##
    # storageClass: "-"
    subPath: ""
    mountPath: "/var/lib/postgresql"
    annotations: {}
    accessModes:
      - ReadWriteOnce
  # WAL will be a subdirectory of the data volume, which means enabling a separate
  # volume for the WAL files should just work for new pods.
  wal:
    enabled: true
    size: 1Gi
    subPath: ""
    storageClass:
    # When changing this mountPath ensure you also change the following key to reflect this:
    # patroni.postgresql.basebackup.[].waldir
    mountPath: "/var/lib/postgresql/wal"
    annotations: {}
    accessModes:
      - ReadWriteOnce
  # Any tablespace mentioned here requires a volume that will be associated with it.
  # tablespaces:
    # example1:
    #   size: 5Gi
    #   storageClass: gp2
    # example2:
    #   size: 5Gi
    #   storageClass: gp2

# EXPERIMENTAL, please do *not* enable on production environments
# if enabled, fullWalPrevention will switch the default transaction mode from read write
# to read only if thresholds are breached.
fullWalPrevention:
  enabled: false
  checkFrequency: 30
  # To prevent the default transaction mode from switching constantly, we have separate
  # thresholds for switching to read-only and read-write
  thresholds:
    readOnlyFreePercent: 5
    readOnlyFreeMB: 64
    readWriteFreePercent: 8
    readWriteFreeMB: 128

resources: {}
  # If you do want to specify resources, uncomment the following
  # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
  # limits:
  #   cpu: 100m
  #   memory: 128Mi
  # requests:
  #   cpu: 100m
  #   memory: 128Mi

sharedMemory:
  # By default Kubernetes only provides 64MB to /dev/shm
  # /dev/shm is only used by PostgreSQL for work_mem for parallel workers,
  # so most will not run into this issue.
  # https://github.com/kubernetes/kubernetes/issues/28272
  #
  # If you do however run into:
  #
  #   SQLSTATE 53100
  #   ERROR:  could not resize shared memory segment "/PostgreSQL.12345" to 4194304 bytes:
  #   No space left on device
  #
  # you may wish to use a mount to Memory, by setting useMount to true
  useMount: false

# timescaledb-tune will be run with the Pod resources requests or - if not set - its limits.
# This should give a reasonably tuned PostgreSQL instance.
# Any PostgreSQL parameter that is explicitly set in the Patroni configuration will override
# the auto-tuned variables.
timescaledbTune:
  enabled: true
  # For full flexibility, we allow you to override any timescaledb-tune parameter below.
  # However, these parameters only take effect on newly scheduled pods and their settings are
  # only visible inside those new pods.
  # Therefore you probably want to set explicit overrides in patroni.bootstrap.dcs.postgresql.parameters,
  # as those will take effect as soon as possible.
  # https://github.com/timescale/timescaledb-tune
  args: {}
    # max-conns: 120
    # cpus: 5
    # memory: 4GB

# pgBouncer does connection pooling for PostgreSQL
# https://www.pgbouncer.org/
# enabling pgBouncer will run an extra container in every Pod, serving a pgBouncer
# pass-through instance
pgBouncer:
  enabled: false
  port: 6432
  config:
  # DANGER: The below settings are considered to be safe to set, and we recommend
  # you do set these to appropriate values for you.
  # However, for flexibility, we do allow the override of any pg_bouncer setting
  # many of which are vital to the operation of this helm chart.
  # The values we do not suggest altering are set in the template
  # https://github.com/timescale/timescaledb-kubernetes/blob/master/charts/timescaledb-single/templates/configmap-pgbouncer.yaml#L35-L50
  # Only override these settings if you are confident of what you are doing.
    server_reset_query: DISCARD ALL
    max_client_conn: 500
    default_pool_size: 12
    pool_mode: transaction
  pg_hba:
  - local     all postgres                   peer
  - host      all postgres,standby 0.0.0.0/0 reject
  - host      all postgres,standby ::0/0     reject
  - hostssl   all all              0.0.0.0/0 md5
  - hostssl   all all              ::0/0     md5
  - hostnossl all all              0.0.0.0/0 reject
  - hostnossl all all              ::0/0     reject
  # Secret should contain user/password pairs in the format expected by pgbouncer
  # https://www.pgbouncer.org/config.html#authentication-file-format
  # example:
  # userlist.txt: |
  #   "username" "hashedpassword"
  #   "username2" "hashedpassword2"
  userListSecretName:

networkPolicy:
  enabled: false
  prometheusApp: prometheus
  # Below you can specify a whitelist of Ingress rules, for more information:
  # https://kubernetes.io/docs/concepts/services-networking/network-policies/#the-networkpolicy-resource
  ingress:
  # - from:
  #   - podSelector:
  #       matchLabels:
  #         app: foo
  #   ports:
  #   - protocol: TCP
  #       port: 11111

# https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector
nodeSelector: {}

# Prometheus exporter for PostgreSQL server metrics.
# https://github.com/wrouesnel/postgres_exporter
prometheus:
  enabled: false
  image:
    repository: wrouesnel/postgres_exporter
    tag: v0.7.0
    pullPolicy: Always
  # Extra custom environment variables for prometheus.
  # These should be an EnvVar, as this allows you to inject secrets into the environment
  # https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#envvar-v1-core
  env:
  # - name: NOT_A_SECRET
  #   value: "test"
  # - name: MYAPPLICATION_STANDBY_PASSWORDS
  #   valueFrom:
  #     secretKeyRef:
  #       name: myapplication-passwords
  #       key: standby
  # Additional volumes for prometheus, e.g., to support additional queries.
  # These should be a Volume, as this allows you to inject any kind of Volume
  # https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#volume-v1-core
  volumes:
  # - name: exporter-config
  #   configMap:
  #     name: exporter-prometheus
  #     items:
  #       - key: metrics_queries
  #         path: queries.yaml
  # Additional volume mounts, to be used in conjunction with the above variable.
  # https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#volumemount-v1-core
  volumeMounts:
  # - name: exporter-config
  #   mountPath: /var/exporter

serviceMonitor:
  # Specifies whether ServiceMonitor for Prometheus operator should be created
  enabled: false
  portName: metrics
  path: /metrics
  interval: 10s
  # scrapeTimeout: 30s
  # Specifies namespace, where ServiceMonitor should be installed
  # namespace: monitoring
  # labels:
  #   release: prometheus
  # honorLabels: true
  # metricRelabelings: []
  # targetLabels:
  #   - foo

# For new deployments, we would advise Parallel here, however as that change breaks previous
# deployments, it is set to OrderedReady here
podManagementPolicy: OrderedReady

# Annotations that are applied to each pod in the stateful set
# https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/
podAnnotations: {}

# https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
tolerations: []

# https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
affinityTemplate: |
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        topologyKey: "kubernetes.io/hostname"
        labelSelector:
          matchLabels:
            app: {{ template "timescaledb.fullname" . }}
            release: {{ .Release.Name | quote }}
            cluster-name: {{ template "clusterName" . }}
    - weight: 50
      podAffinityTerm:
        topologyKey: failure-domain.beta.kubernetes.io/zone
        labelSelector:
          matchLabels:
            app: {{ template "timescaledb.fullname" . }}
            release: {{ .Release.Name | quote }}
            cluster-name: {{ template "clusterName" . }}
affinity: {}

## Use an alternate scheduler, e.g. "stork".
## ref: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
##
# schedulerName:

rbac:
  # Specifies whether RBAC resources should be created
  create: true

serviceAccount:
  # Specifies whether a ServiceAccount should be created
  create: true
  # The name of the ServiceAccount to use.
  # If not set and create is true, a name is generated using the fullname template
  name:

debug:
  # This setting is mainly for during development, debugging or troubleshooting.
  # This command will be executed *before* the main container starts. In the
  # example below, we can mimic a slow restore by sleeping for 5 minutes before starting
  execStartPre:  # sleep 300


- What is your Kubernetes environment? (for example: GKE, EKS, minikube, microk8s)
EKS

**Deployment**
Please share some details of what is in your Kubernetes environment, for example:

kubectl get all,secret,configmap,endpoints,pvc -L role -n nibbl-timescaledb-prod
NAME                                                          READY  STATUS     RESTARTS  AGE   ROLE
pod/nibbl-timescaledb-prod-0                                  2/2    Running    0         15d   replica
pod/nibbl-timescaledb-prod-1                                  2/2    Running    0         15d   master
pod/nibbl-timescaledb-prod-2                                  2/2    Running    0         15d   replica
pod/nibbl-timescaledb-prod-incremental-daily-27656772-9b5sp   0/1    Completed  0         12d
pod/nibbl-timescaledb-prod-incremental-daily-27662532-9ss8v   0/1    Completed  0         8d
pod/nibbl-timescaledb-prod-incremental-daily-27666852-bvpgv   0/1    Completed  0         5d3h

NAME                                     TYPE          CLUSTER-IP      EXTERNAL-IP                                                                    PORT(S)         AGE  ROLE
service/nibbl-timescaledb-prod           LoadBalancer  172.20.139.143  a13680a15d58945fab9468fded709ff6-755650302.ap-southeast-1.elb.amazonaws.com   5432:32699/TCP  15d  master
service/nibbl-timescaledb-prod-backup    ClusterIP     172.20.250.235  <none>                                                                         8081/TCP        15d
service/nibbl-timescaledb-prod-config    ClusterIP     None            <none>                                                                         8008/TCP        15d
service/nibbl-timescaledb-prod-replica   ClusterIP     172.20.108.55   <none>                                                                         5432/TCP        15d  replica

NAME                                     READY  AGE  ROLE
statefulset.apps/nibbl-timescaledb-prod  3/3    15d

NAME                                                       SCHEDULE       SUSPEND  ACTIVE  LAST SCHEDULE  AGE  ROLE
cronjob.batch/nibbl-timescaledb-prod-full-weekly           12 02 * * 0    False    0       3h13m          15d
cronjob.batch/nibbl-timescaledb-prod-incremental-daily     12 02 * * 1-6  False    0       27h            15d

NAME                                                          COMPLETIONS  DURATION  AGE    ROLE
job.batch/nibbl-timescaledb-prod-full-weekly-27674052         0/1          3h13m     3h13m
job.batch/nibbl-timescaledb-prod-incremental-daily-27656772   1/1          7s        12d
job.batch/nibbl-timescaledb-prod-incremental-daily-27662532   1/1          7s        8d
job.batch/nibbl-timescaledb-prod-incremental-daily-27666852   1/1          7s        5d3h
job.batch/nibbl-timescaledb-prod-incremental-daily-27672612   0/1          27h       27h

NAME                                                   TYPE                                 DATA  AGE  ROLE
secret/default-token-kwnjr                             kubernetes.io/service-account-token  3     36d
secret/nibbl-timescaledb-prod-certificate              kubernetes.io/tls                    2     15d
secret/nibbl-timescaledb-prod-credentials              Opaque                               3     15d
secret/nibbl-timescaledb-prod-pgbackrest               Opaque                               5     15d
secret/nibbl-timescaledb-prod-token-f9wvg              kubernetes.io/service-account-token  3     15d
secret/sh.helm.release.v1.nibbl-timescaledb-prod.v1    helm.sh/release.v1                   1     15d
secret/sh.helm.release.v1.nibbl-timescaledb-prod.v2    helm.sh/release.v1                   1     15d

NAME                                         DATA  AGE  ROLE
configmap/kube-root-ca.crt                   1     36d
configmap/nibbl-timescaledb-prod-patroni     1     15d
configmap/nibbl-timescaledb-prod-pgbackrest  1     15d
configmap/nibbl-timescaledb-prod-scripts     8     15d

NAME                                      ENDPOINTS                                      AGE  ROLE
endpoints/nibbl-timescaledb-prod          10.1.0.82:5432                                 15d
endpoints/nibbl-timescaledb-prod-backup   10.1.0.82:8081                                 15d
endpoints/nibbl-timescaledb-prod-config   10.1.0.182:8008,10.1.0.19:8008,10.1.0.82:8008  15d
endpoints/nibbl-timescaledb-prod-replica  10.1.0.182:5432,10.1.0.19:5432                 15d  replica

NAME                                                           STATUS  VOLUME                                     CAPACITY  ACCESS MODES  STORAGECLASS  AGE
persistentvolumeclaim/storage-volume-nibbl-timescaledb-prod-0  Bound   pvc-8d7e46eb-33fa-4af2-a8c6-308a9555c230   500Gi     RWO           gp2           36d
persistentvolumeclaim/storage-volume-nibbl-timescaledb-prod-1  Bound   pvc-a07d2be6-727d-492f-95e2-c488a02abcea   500Gi     RWO           gp2           36d
persistentvolumeclaim/storage-volume-nibbl-timescaledb-prod-2  Bound   pvc-8dcfbbd1-1193-49e6-bce5-d23d330d83b6   500Gi     RWO           gp2           36d
persistentvolumeclaim/wal-volume-nibbl-timescaledb-prod-0      Bound   pvc-1698a8f7-d4f7-414e-a047-69e3c30d4b50   500Gi     RWO           gp2           36d
persistentvolumeclaim/wal-volume-nibbl-timescaledb-prod-1      Bound   pvc-b8d3bb6b-1dec-4ab5-bbda-53376fc55ce1   500Gi     RWO           gp2           36d
persistentvolumeclaim/wal-volume-nibbl-timescaledb-prod-2      Bound   pvc-1baafc64-2136-4325-b7db-359262998288   500Gi     RWO           gp2           36d
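Since the chart renders the secrets.pgbackrest map into secret/nibbl-timescaledb-prod-pgbackrest (listed above), it is worth decoding what actually landed there: stray characters in the S3 key or secret will make every repository operation fail. A quick check, using the names from the output above:

# Decode one key from the generated pgBackRest secret and inspect it for
# unexpected characters (for example literal quote characters)
kubectl get secret nibbl-timescaledb-prod-pgbackrest -n nibbl-timescaledb-prod \
  -o jsonpath='{.data.PGBACKREST_REPO1_S3_KEY}' | base64 -d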



**Logs**
timescaledb logs:

2022-08-14 05:15:32 UTC [510]: [62f87c7a.1fe-5456] postgres@nibbldb,app=[unknown] [00000] LOG:  statement: refresh materialized view mv_assets
ERROR: [103]: unable to find a valid repository:
       repo1: [FileMissingError] unable to load info file '/nibbl-timescaledb-prod/nibbl-timescaledb-prod/archive/poddb/archive.info' or '/nibbl-timescaledb-prod/nibbl-timescaledb-prod/archive/poddb/archive.info.copy':
       FileMissingError: unable to open missing file '/nibbl-timescaledb-prod/nibbl-timescaledb-prod/archive/poddb/archive.info' for read
       FileMissingError: unable to open missing file '/nibbl-timescaledb-prod/nibbl-timescaledb-prod/archive/poddb/archive.info.copy' for read
       HINT: archive.info cannot be opened but is required to push/get WAL segments.
       HINT: is archive_command configured correctly in postgresql.conf?
       HINT: has a stanza-create been performed?
       HINT: use --no-archive-check to disable archive checks during backup if you have an alternate archiving scheme.
2022-08-14 05:15:33 UTC [93]: [62e3f925.5d-170208] @,app= [00000] LOG:  archive command failed with exit code 103
2022-08-14 05:15:33 UTC [93]: [62e3f925.5d-170209] @,app= [00000] DETAIL:  The failed archive command was: /etc/timescaledb/scripts/pgbackrest_archive.sh pg_wal/00000002.history
2022-08-14 05:15:33 UTC [510]: [62f87c7a.1fe-5457] postgres@nibbldb,app=[unknown] [00000] LOG:  statement: refresh materialized view mv_assets
2022-08-14 05:15:34 UTC [3515]: [62f884f6.dbb-1] [unknown]@[unknown],app=[unknown] [00000] LOG:  connection received: host=[local]
2022-08-14 05:15:34 UTC [3515]: [62f884f6.dbb-2] postgres@postgres,app=[unknown] [00000] LOG:  connection authenticated: identity="postgres" method=peer (/var/lib/postgresql/data/pg_hba.conf:3)
2022-08-14 05:15:34 UTC [3515]: [62f884f6.dbb-3] postgres@postgres,app=[unknown] [00000] LOG:  connection authorized: user=postgres database=postgres application_name=pg_isready
2022-08-14 05:15:34 UTC [3515]: [62f884f6.dbb-4] postgres@postgres,app=pg_isready [00000] LOG:  disconnection: session time: 0:00:00.005 user=postgres database=postgres host=[local]
ERROR: [103]: unable to find a valid repository:
       repo1: [FileMissingError] unable to load info file '/nibbl-timescaledb-prod/nibbl-timescaledb-prod/archive/poddb/archive.info' or '/nibbl-timescaledb-prod/nibbl-timescaledb-prod/archive/poddb/archive.info.copy':
       FileMissingError: unable to open missing file '/nibbl-timescaledb-prod/nibbl-timescaledb-prod/archive/poddb/archive.info' for read
       FileMissingError: unable to open missing file '/nibbl-timescaledb-prod/nibbl-timescaledb-prod/archive/poddb/archive.info.copy' for read
       HINT: archive.info cannot be opened but is required to push/get WAL segments.
       HINT: is archive_command configured correctly in postgresql.conf?
       HINT: has a stanza-create been performed?
       HINT: use --no-archive-check to disable archive checks during backup if you have an alternate archiving scheme.
2022-08-14 05:15:34 UTC [93]: [62e3f925.5d-170210] @,app= [00000] LOG:  archive command failed with exit code 103

pgbackrest logs:

2022-08-14 02:12:26,420 - ERROR - backup - Backup 20220814021226 failed with returncode 55
2022-08-14 02:12:26,422 - INFO - history - Refreshing backup history using pgbackrest
2022-08-14 02:12:26,457 - INFO - http - 10.1.0.71 - - "POST /backups/ HTTP/1.1" 500 -
2022-08-14 02:12:27,422 - DEBUG - backup - Waiting until backup triggered
2022-08-14 02:12:56,372 - INFO - backup - Starting backup
2022-08-14 02:12:56,426 - ERROR - backup - ERROR: [055]: unable to load info file '/nibbl-timescaledb-prod/nibbl-timescaledb-prod/backup/poddb/backup.info' or '/nibbl-timescaledb-prod/nibbl-timescaledb-prod/backup/poddb/backup.info.copy':
2022-08-14 02:12:56,426 - INFO - backup -        FileMissingError: unable to open missing file '/nibbl-timescaledb-prod/nibbl-timescaledb-prod/backup/poddb/backup.info' for read
2022-08-14 02:12:56,426 - INFO - backup -        FileMissingError: unable to open missing file '/nibbl-timescaledb-prod/nibbl-timescaledb-prod/backup/poddb/backup.info.copy' for read
2022-08-14 02:12:56,426 - INFO - backup -        HINT: backup.info cannot be opened and is required to perform a backup.
2022-08-14 02:12:56,426 - INFO - backup -        HINT: has a stanza-create been performed?
2022-08-14 02:12:56,429 - DEBUG - backup - Backup details
{
    "age": 0.0,
    "duration": 0.0,
    "finished": "2022-08-14T02:12:56+00:00",
    "label": "20220814021256",
    "pgbackrest": {},
    "pid": 465,
    "returncode": 55,
    "started": "2022-08-14T02:12:56+00:00",
    "status": "RUNNING"
}
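Both logs give the same hint: pgBackRest cannot find a stanza in the S3 repository, so neither WAL archiving nor backups can proceed. A quick way to verify this from inside a pod is sketched below; the container name timescaledb and the stanza name poddb follow the chart's conventions and the paths in the logs above, so adjust them to your release if they differ:

# Does pgBackRest see the repository and the stanza at all?
kubectl exec -n nibbl-timescaledb-prod nibbl-timescaledb-prod-1 -c timescaledb -- \
  pgbackrest --stanza=poddb info

# If the S3 settings are correct but the stanza was never created,
# it can be created manually on the primary:
kubectl exec -n nibbl-timescaledb-prod nibbl-timescaledb-prod-1 -c timescaledb -- \
  pgbackrest --stanza=poddb stanza-create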

Codestar0609 commented 1 year ago

fixed

darthsiderius commented 1 year ago

> fixed

Can you provide any details on how you fixed it? (Having the same issue myself.)

kamilcglr commented 1 year ago

> fixed

> Can you provide any details on how you fixed it? (Having the same issue myself.)

Hello @darthsiderius, did you find a solution? Same issue on my side. @Codestar0609? Thanks in advance.

thannaske commented 1 year ago

@Codestar0609 It would be helpful if you could share your solution.

kamilcglr commented 1 year ago

> fixed

> Can you provide any details on how you fixed it? (Having the same issue myself.)

> Hello @darthsiderius, did you find a solution? Same issue on my side. @Codestar0609? Thanks in advance.

Hello, in our case it was the credentials in values.yaml that should not have been in quotes. Beginner's mistake 😅
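For anyone landing here with the same error, a sketch of what that means for the values file above; every value here is a placeholder, the point is only that they are written unquoted:

secrets:
  pgbackrest:
    # Unquoted values, per the comment above (all values below are placeholders)
    PGBACKREST_REPO1_S3_REGION: ap-southeast-1
    PGBACKREST_REPO1_S3_KEY: AKIAXXXXXXXXXXXXXXXX
    PGBACKREST_REPO1_S3_KEY_SECRET: placeholderSecretAccessKey
    PGBACKREST_REPO1_S3_BUCKET: my-backup-bucket
    PGBACKREST_REPO1_S3_ENDPOINT: s3.amazonaws.com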