Closed · tlb1galaxy closed this issue 4 months ago
Possible resolution: after spending a lot of time on this, I finally got the `backup-cluster` function to work. Here are the conditions I had to implement to get it working.
**SSH-agent and forwarding:**
**SSH key-auth:**
**SUDOERS - secure_path** (see existing issue #253):
Modify the `secure_path` line in /etc/sudoers via visudo, adding the following two paths (`/usr/local/bin` and `/opt/cassandra/bin`):
```
# Adding HOME to env_keep may enable a user to run unrestricted
# commands via sudo.
#
# Defaults env_keep += "HOME"
Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin:/opt/cassandra/bin
```
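As a quick sanity check after editing, one can verify that the resulting line really contains the two extra directories. A throwaway sketch in plain Python (the helper name is mine, not a Medusa utility):

```python
def secure_path_has(line, required=("/usr/local/bin", "/opt/cassandra/bin")):
    """Check a 'Defaults secure_path = ...' sudoers line for required directories."""
    _, _, value = line.partition("=")
    dirs = value.strip().split(":")
    return all(d in dirs for d in required)

line = "Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin:/opt/cassandra/bin"
print(secure_path_has(line))  # True
```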
**/etc/medusa/medusa.ini** - handle SSH keys:
The default example medusa.ini has two keys with the same name, `key_file`:
- `[storage]` - `key_file`
- `[ssh]` - `key_file`

You need to ensure only one is active:
```ini
[storage]
storage_provider = local
; storage_provider should be either of "local", "google_storage" or "s3"
; region = <Region hosting the storage>
; Name of the bucket used for storing backups
bucket_name = cassandra_backups
; JSON key file for service account with access to GCS bucket or AWS credentials file (home-dir/.aws/credentials)
; key_file = /etc/medusa/credentials
; Path of the local storage bucket (used only with 'local' storage provider)
base_path = /exports/compassfile01/lvm_backup01/backups/compasscass
; Any prefix used for multitenancy in the same bucket
prefix = tlb1.compass_cassandra01_rack01
;fqdn = <enforce the name of the local node. Computed automatically if not provided.>
; Number of days before backups are purged. 0 means backups don't get purged by age (default)
max_backup_age = 15
; Number of backups to retain. Older backups will get purged beyond that number. 0 means backups don't get purged by count (default)
max_backup_count = 0
; Both thresholds can be defined for backup purge.
; Used to throttle S3 backups/restores:
transfer_max_bandwidth = 50MB/s
; Max number of downloads/uploads. Not used by the GCS backend.
concurrent_transfers = 1
; Size over which S3 uploads will be using the awscli with multi part uploads. Defaults to 100MB.
multi_part_upload_threshold = 104857600
; GC grace period for backed up files. Prevents race conditions between purge and running backups
backup_grace_period_in_days = 10
[ssh]
username = root
key_file = /root/.ssh/id_rsa
;port = <SSH port to use for restoring clusters. Defaults to port 22.>
;cert_file = <Path of public key signed certificate file to use for authentication. The corresponding private key must also be provided via key_file parameter>
```
I was not able to reproduce this.
The [ssh] section needs the username/password. It might be nice to default to $USER and ~/.ssh/id_rsa, but that's perhaps for another issue.
The clash of cassandra/key_file and ssh/key_file does not seem to be a thing either.
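On the supposed clash: standard INI parsing (e.g. Python's `configparser`) scopes keys to their section, so a `key_file` under `[storage]` and another under `[ssh]` never collide. A minimal sketch illustrating this, plus the suggested `$USER`/`~/.ssh/id_rsa` defaulting (the fallback logic is hypothetical, not Medusa's actual code):

```python
import configparser
import getpass
import os

# Illustrative INI mirroring the two key_file entries from medusa.ini;
# the same key name in different sections is perfectly legal.
sample = """
[storage]
key_file = /etc/medusa/credentials

[ssh]
key_file = /root/.ssh/id_rsa
"""

cfg = configparser.ConfigParser()
cfg.read_string(sample)

print(cfg["storage"]["key_file"])  # /etc/medusa/credentials
print(cfg["ssh"]["key_file"])      # /root/.ssh/id_rsa

# Hypothetical defaulting: fall back to the current user and their
# default key when the [ssh] values are missing.
username = cfg.get("ssh", "username", fallback=getpass.getuser())
key_file = cfg.get("ssh", "key_file", fallback=os.path.expanduser("~/.ssh/id_rsa"))
```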
Hello, I am trying to back up a new Cassandra cluster (4 x CentOS 7 nodes) using local storage (NFS mounts shared by all nodes), and all forms of authentication seem to fail.
I have SSH auth configured between all the nodes, and have enabled and populated ssh-agent (even though I cannot find any documentation referencing this as a requirement).
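Since a populated ssh-agent appears to be an undocumented requirement, here is a small check one could run on each node to confirm an agent is reachable and has keys loaded. This is a sketch of mine in plain Python, not anything Medusa ships:

```python
import os
import shutil
import subprocess

def agent_has_keys() -> bool:
    """True if an ssh-agent is reachable via SSH_AUTH_SOCK and lists at least one identity."""
    if "SSH_AUTH_SOCK" not in os.environ or shutil.which("ssh-add") is None:
        return False
    # 'ssh-add -l' exits 0 only when the agent has at least one key loaded.
    return subprocess.run(["ssh-add", "-l"], capture_output=True).returncode == 0

print(agent_has_keys())
```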
ENVIRONMENT: Cassandra version:
Cassandra status:
Python:
Medusa:
PIP packages:
OS:
Mounts:
SSH auth:
SSH-agent:
ERRORS:
Medusa command:
Target node - /var/log/secure:
Source node - /var/log/secure:
Cassandra.yaml:
Medusa.ini:
┆Issue is synchronized with this Jira Task by Unito ┆friendlyId: K8SSAND-1480 ┆priority: Medium