k8ssandra / k8ssandra-operator

The Kubernetes operator for K8ssandra
https://k8ssandra.io/
Apache License 2.0

MedusaBackupJob is not finishing at all in v1.18.0 of k8ssandra-operator #1380

Status: Open · opened 3 months ago

saranya-nallasamy commented 3 months ago

Hi Team,

We are seeing that the MedusaBackupJob never finishes. It starts on schedule but never completes, so the next day's job does not start either. No events or logs are generated for the MedusaBackupJob, so we are unable to trace the exact cause.
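For reference, a minimal set of checks that should surface the job's status and the operator's view of it. This is a sketch, not a definitive procedure: the k8ssandra namespace, the k8ssandra-operator deployment name, and the medusa container name are assumptions that may differ per installation, and <job-name> / <cassandra-pod> are placeholders.

  # List MedusaBackupJob resources; a startTime with no finishTime means the job is stuck.
  kubectl get medusabackupjobs -n k8ssandra

  # Inspect the status conditions of a specific job.
  kubectl describe medusabackupjob <job-name> -n k8ssandra

  # The operator reconciles MedusaBackupJobs, so its logs usually show where reconciliation stalls.
  kubectl logs deployment/k8ssandra-operator -n k8ssandra | grep -i medusa

  # Each Cassandra pod runs a Medusa sidecar; its logs show per-node backup progress.
  kubectl logs <cassandra-pod> -c medusa -n k8ssandra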

Below is our current Medusa config. Please note that we have replaced some confidential details with *** . Kindly help us figure out how to find the exact problem in this case.

medusa:
  containerImage:
    registry: **
    # if we don't set this, the "latest" version is used, so check the k8ssandra-operator
    tag: 0.22.0
  storageProperties:
    # Can be either of local, google_storage, azure_blobs, s3, s3_compatible, s3_rgw or ibm_storage
    storageProvider: s3_compatible
    # Name of the secret containing the credentials file to access the backup storage
    # backend (the expected secret shape is sketched after this config)
    storageSecretRef:
      name: medusa-bucket-key
    # Name of the storage bucket
    bucketName: test-bucket
    # Prefix for this cluster in the storage bucket directory structure, used for multitenancy
    prefix: cassandra-$PARAM_UC
    # Host to connect to the storage backend (omitted for GCS, S3, Azure and local)
    host: *******************
    # Port to connect to the storage backend (omitted for GCS, S3, Azure and local)
    # port: 9000
    # Region of the storage bucket
    # region: us-east-1
    # Whether or not to use SSL to connect to the storage backend
    secure: true
    # Maximum backup age that the purge process should observe.
    # 0 equals unlimited
    # maxBackupAge: 1
    # Maximum number of backups to keep (used by the purge process).
    # 0 equals unlimited
    maxBackupCount: 3
    # AWS profile to use for authentication.
    # apiProfile:
    # transferMaxBandwidth: 50MB/s
    # Number of concurrent uploads.
    # Helps maximize upload speed but puts more pressure on the network.
    # Defaults to 1.
    # concurrentTransfers: 1
    # File size in bytes over which cloud-specific CLI tools are used for transfer.
    # Defaults to 100 MB.
    # multiPartUploadThreshold: 104857600
    # Age after which orphan sstables can be deleted from the storage backend.
    # Protects from race conditions between purge and ongoing backups.
    # Defaults to 10 days.
    # backupGracePeriodInDays: 10
    # Pod storage settings to use for local storage (testing only)
    # podStorage:
    #   storageClassName: standard
    #   accessModes:
    #     - ReadWriteOnce
    #   size: 100Mi
  securityContext:
    allowPrivilegeEscalation: false
    readOnlyRootFilesystem: false
    privileged: false
    capabilities:
      drop:
        - ALL
        - CAP_NET_RAW
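As a related sanity check on the storageSecretRef above: for an s3_compatible backend, the referenced secret is expected to contain a credentials key holding an AWS-style credentials file. The sketch below is illustrative only; the key values are placeholders and the namespace is an assumption.

  # Write a placeholder credentials file (hypothetical values, not real credentials).
  cat > /tmp/credentials <<'EOF'
  [default]
  aws_access_key_id = <access-key>
  aws_secret_access_key = <secret-key>
  EOF

  # Create the secret referenced by storageSecretRef.
  kubectl create secret generic medusa-bucket-key \
    --from-file=credentials=/tmp/credentials \
    -n k8ssandra

If the secret is malformed, or the host/secure settings don't match the S3-compatible endpoint, the Medusa containers can fail before reporting any backup progress, which may look like a job that never finishes.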

Thanks.

Issue is synchronized with a Jira Story by Unito ┆ Issue Number: K8OP-7