Open sigalits opened 3 years ago
I have added these options to the .ssh/config on each node:

```
Host *
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null
```

Sadly it still fails with the same errors.
while this ssh command works fine:

```
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.XXX.XX.226  9.89 MiB   256     47.7%             359ca6a6-b945-4435-a304-8b2b9b2f3815  1a
UN  10.XXX.XX.130  9.5 MiB    256     52.3%             e04458d5-42e7-45a9-bb96-a6a7be3791ad  1a
UN  10.XXX.XX.54   11.14 MiB  256     53.1%             a765de06-2132-4bcf-b6f6-ae8331b13f55  1b
UN  10.XXX.XX.202  9.71 MiB   256     46.9%             df6d8ada-7620-4d5a-8590-056734945e15  1b
UN  10.XXX.XX.123  9.67 MiB   256     46.0%             f3830c26-f5c1-4015-aafa-fa878b9348c4  1c
UN  10.XXX.XX.11   10.5 MiB   256     54.0%             45e6447c-87db-4d4b-97ae-e1fcfb2c2b0e  1c
```
Can anyone please update whether this issue is fixed? I am getting the issue while running the cluster backup; the code works fine during a single-node backup.
@LO764640 @sigalits @adejanovski I used a Cassandra tarball setup and found that Medusa has issues picking up paths properly when using the backup-cluster option. I followed the procedure below to make Medusa work. Also make sure /etc/hosts and DNS are set up properly when resolve_ip_addresses is used in medusa.ini. Hope it helps.
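For the resolve_ip_addresses part, a minimal sketch of what that could look like — the section this key lives under is an assumption on my side, so verify it against the example medusa.ini shipped with your install:

```ini
; medusa.ini (sketch -- section placement assumed)
[cassandra]
resolve_ip_addresses = True
```

And each node's FQDN should resolve the same way everywhere, e.g. an /etc/hosts line like `10.0.0.12 cas02.iviws.local cas02` (IP here is a made-up placeholder) on every host, or consistent DNS records.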
1) Set medusa/medusa-wrapper/nodetool/cqlsh paths.
```
$ sudo visudo
Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin:/opt/cassandra/bin
```
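A quick way to see why secure_path matters here: sudo resets PATH to secure_path, so a tarball-installed binary outside those directories simply disappears for the remote `sudo medusa` call. A sketch that simulates the reset:

```shell
# Simulate sudo's PATH reset: with only the default secure_path entries,
# a tarball-installed nodetool (/opt/cassandra/bin) is not found.
default_path="/sbin:/bin:/usr/sbin:/usr/bin"
env PATH="$default_path" sh -c 'command -v nodetool' \
  || echo "nodetool not on PATH"
# After appending /opt/cassandra/bin (as in the visudo line above),
# the lookup succeeds on hosts where that directory actually exists.
```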
2) Set cassandra conf, cassandra libraries.
```
$ vi /etc/environment
CASSANDRA_CONF=/opt/cassandra/conf/
CLASSPATH=/opt/cassandra/lib/
```
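A small sanity check for those values — the paths are from this thread's tarball layout and may differ on your hosts:

```shell
# Confirm the directories referenced in /etc/environment exist before
# relying on them (paths assumed from the tarball setup described above).
for dir in /opt/cassandra/conf/ /opt/cassandra/lib/; do
  if [ -d "$dir" ]; then
    echo "ok: $dir"
  else
    echo "missing: $dir"
  fi
done
```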
With the help of @sandeepmallik's solution we solved the `/bin/bash: nodetool: command not found` issue.
Now we are getting an error during upload, as shown below.
```
[2022-05-16 13:34:34,494] INFO: Monitoring provider is noop
[2022-05-16 13:34:35,450] INFO: No backups found in index. Consider running "medusa build-index" if you have some backups
[2022-05-16 13:34:35,450] INFO: Starting backup stage-medusa-backup-16May2022
[2022-05-16 13:34:35,458] WARNING: is ccm : 0
[2022-05-16 13:34:35,671] INFO: Creating snapshots on all nodes
[2022-05-16 13:34:38,699] INFO: A snapshot medusa-stage-medusa-backup-16May2022 was created on all nodes.
[2022-05-16 13:34:38,699] INFO: Uploading snapshots from nodes to external storage
[2022-05-16 13:34:38,700] INFO: Executing "mkdir -p /tmp/medusa-job-e2b82637-c379-47ab-88fb-2ccca7373b86; cd /tmp/medusa-job-e2b82637-c379-47ab-88fb-2ccca7373b86 && medusa-wrapper sudo medusa -vvv backup-node --backup-name stage-medusa-backup-16May2022 --mode differential" on following nodes ['cassandra.iviws.local', 'cas03.iviws.local', 'cas04.iviws.local', 'cas02.iviws.local'] with a parallelism/pool size of 1
[2022-05-16 13:34:41,143] ERROR: Job executing "mkdir -p /tmp/medusa-job-e2b82637-c379-47ab-88fb-2ccca7373b86; cd /tmp/medusa-job-e2b82637-c379-47ab-88fb-2ccca7373b86 && medusa-wrapper sudo medusa -vvv backup-node --backup-name stage-medusa-backup-16May2022 --mode differential" ran and finished with errors on following nodes: ['cas02.iviws.local', 'cas03.iviws.local', 'cas04.iviws.local', 'cassandra.iviws.local']
[2022-05-16 13:34:41,145] ERROR: Some nodes failed to upload the backup.
[2022-05-16 13:34:41,145] ERROR: This error happened during the cluster backup: Some nodes failed to upload the backup.
```
The stderr is showing as below:

```
$ cat /tmp/medusa-job-e2b82637-c379-47ab-88fb-2ccca7373b86/stderr
Traceback (most recent call last):
  File "/home/ec2-user/.local/bin/medusa", line 5, in
```
Hi @ajit-devops-2008 , I know it's been too long, but I'm grooming tickets now and stumbled upon this. Did you manage to solve the issues? Is there anything we can help with?
Trying to run the new backup-cluster option, I get failures on the ssh key:

```
[cassandra-dba-dev]/home/cassandra >medusa backup-cluster --backup-name test1
INFO: Monitoring provider is noop
INFO: Starting backup test1
WARNING: is ccm : 0
INFO: Creating snapshots on all nodes
INFO: Executing "nodetool snapshot -t medusa-test1" on following nodes ['ip-10-XXX-XX-130.ec2.internal', 'ip-10-XXX-39-XXX.ec2.internal', 'ip-10-XXX-XX-202.ec2.internal', 'ip-10-XXX-XX-54.ec2.internal', 'ip-10--XXX-XX-11.ec2.internal', 'ip-10-XXX-XX-123.ec2.internal'] with a parallelism/pool size of 500
[2021-01-12 10:12:45,824] ERROR: Job executing "nodetool snapshot -t medusa-test1" ran and finished with errors on following nodes: ['ip-10--XXX-XX-130.ec2.internal', 'ip-10--XXX-XX-226.ec2.internal', 'ip-10--XXX-XX-202.ec2.internal', 'ip-10--XXX-XX-54.ec2.internal', 'ip-10-XXX-XX-11.ec2.internal', 'ip-10-162-43-123.ec2.internal']
[2021-01-12 10:12:45,825] INFO: [ip-10-XXX-XX-130.ec2.internal] /bin/bash: nodetool: command not found
[2021-01-12 10:12:45,825] INFO: ip-10-XXX-XX-130.ec2.internal-stdout: /bin/bash: nodetool: command not found
[2021-01-12 10:12:45,825] INFO: [ip-10-XXX-XX-226.ec2.internal] /bin/bash: nodetool: command not found
[2021-01-12 10:12:45,825] INFO: ip-10-XXX-XX-226.ec2.internal-stdout: /bin/bash: nodetool: command not found
[2021-01-12 10:12:45,825] INFO: [ip-10-XXX-XX-202.ec2.internal] /bin/bash: nodetool: command not found
[2021-01-12 10:12:45,825] INFO: ip-10-XXX-XX-202.ec2.internal-stdout: /bin/bash: nodetool: command not found
[2021-01-12 10:12:45,826] INFO: [ip-10-XXX-XX-54.ec2.internal] /bin/bash: nodetool: command not found
[2021-01-12 10:12:45,826] INFO: ip-10-XXX-XX-54.ec2.internal-stdout: /bin/bash: nodetool: command not found
[2021-01-12 10:12:45,826] INFO: [ip-10-XXX-XX-11.ec2.internal] /bin/bash: nodetool: command not found
[2021-01-12 10:12:45,826] INFO: ip-10-XXX-XX-11.ec2.internal-stdout: /bin/bash: nodetool: command not found
[2021-01-12 10:12:45,826] INFO: [ip-10-XXX-XX-123.ec2.internal] /bin/bash: nodetool: command not found
[2021-01-12 10:12:45,826] INFO: ip-10-XXX-XX-123.ec2.internal-stdout: /bin/bash: nodetool: command not found
```
Tried also specifying the username and key in the command, but got the same errors.
Connecting using a simple ssh to each of the servers resolves that:

```
ssh ip-10-XXX-XX-11.ec2.internal
The authenticity of host 'ip-10--XXX-XX-11.ec2.internal (10.XXX.XX.11)' can't be established.
ECDSA key fingerprint is SHA256:Z+MmNEkzuWkcUihkWKt/aY4iNje7sywPTzEjcum3g/A.
ECDSA key fingerprint is MD5:37:43:f1:f6:28:7a:7d:c3:85:62:60:70:eb:d3:b8:cb.
Are you sure you want to continue connecting (yes/no)? yes
```
meaning that the key is OK. So maybe the ssh command used by Medusa is missing these options, which do allow me to connect: `ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no`
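One way to check what OpenSSH would actually apply for a node, without connecting at all, is `ssh -G`, which just prints the resolved configuration (a sketch, using the hostname from the log above; `-G` needs OpenSSH 6.8 or newer):

```shell
# Print the effective host-key options ssh would use for a node, without
# connecting; useful to confirm ~/.ssh/config entries are really picked up.
ssh -G -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no \
    ip-10-XXX-XX-11.ec2.internal \
  | grep -Ei 'stricthostkeychecking|userknownhostsfile'
# prints both resolved values, i.e. lines containing:
#   stricthostkeychecking no
#   userknownhostsfile /dev/null
```

If the `Host *` stanza from the .ssh/config were being honored, the same two values would show up even without the `-o` overrides.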
Thanks Sigalit