Open th2zz opened 1 year ago
Hi @th2zz !
This seems really odd. There is code to not create a snapshot if it already exists. So the bug is not trivial.
It's also weird that the node-local log complains about a different backup than the cluster-wide one.
But I should have enough info, so I'll try to reproduce this next week.
Hi again @th2zz !
I've now spent some time trying to reproduce this issue. Sadly (or luckily), I did not manage to.
The only time I've managed to get a "backup already exists" error was when I ctrl+c'ed an execution and immediately started a new one. This was when I did not supply a --backup-name option to the backup-cluster command, which made Medusa generate one. The format for this is backup-YYYYMMDDHHMM, so if this happens twice in one minute, then there's going to be a clash.
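The clash described above can be sketched in a few lines. This is only an illustration that assumes the minute-resolution format quoted in this comment; `generate_backup_name` is a hypothetical helper, not Medusa's actual function.

```python
from datetime import datetime, timedelta


def generate_backup_name(now: datetime) -> str:
    """Hypothetical stand-in for an auto-generated name with minute resolution."""
    # Assumption: the format is backup-YYYYMMDDHHMM, as described in the comment.
    return now.strftime("backup-%Y%m%d%H%M")


# Two runs started 20 seconds apart within the same minute get the same name.
first_run = datetime(2023, 10, 27, 1, 29, 5)
second_run = first_run + timedelta(seconds=20)
names_clash = generate_backup_name(first_run) == generate_backup_name(second_run)

# A run started a full minute later gets a different name.
later_run = first_run + timedelta(seconds=60)
names_differ = generate_backup_name(first_run) != generate_backup_name(later_run)
```

Passing an explicit --backup-name avoids depending on the clock at all.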
This is still different from the reported issue of already existing snapshots. The snapshot clearing code we have seems to be fairly robust and clears snapshots even if the backup fails. Having an existing snapshot but not an existing backup is possible, but it requires a really peculiar sequence of events, so it's very unlikely to happen.
I can't tell from the original report what that sequence is. To be frank, I think the log traces come from two separate occasions.
To wrap up, please make sure that:
- /tmp/cassandra_backups has nothing in it
- nodetool listsnapshots finds nothing
- ls /tmp/medusa-job-* finds nothing

If you're still seeing the issue, then please re-share the logs from the medusa-job-* folder and the trace that the backup-cluster command prints to the console.
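The checklist above could be scripted roughly as follows. This is a minimal sketch: the two paths are the ones mentioned in this comment, `find_leftovers` and `list_snapshots_output` are hypothetical helper names, and it assumes `nodetool` is on the PATH of the node where it runs.

```python
import glob
import os
import subprocess
from typing import List


def find_leftovers(backup_dir: str = "/tmp/cassandra_backups",
                   job_glob: str = "/tmp/medusa-job-*") -> List[str]:
    """Return a list of problems found on disk; an empty list means clean."""
    problems = []
    # Check 1: the local storage directory should be empty (or absent).
    if os.path.isdir(backup_dir) and os.listdir(backup_dir):
        problems.append(f"{backup_dir} is not empty")
    # Check 2: no temporary job folders should be left behind.
    leftover_jobs = glob.glob(job_glob)
    if leftover_jobs:
        problems.append(f"leftover job folders: {sorted(leftover_jobs)}")
    return problems


def list_snapshots_output() -> str:
    """Return the raw text printed by `nodetool listsnapshots`.

    The output layout differs between Cassandra versions, so this sketch
    does not parse it; inspect the text for medusa-* snapshot names.
    """
    result = subprocess.run(["nodetool", "listsnapshots"],
                            capture_output=True, text=True, check=False)
    return result.stdout
```

Running `find_leftovers()` and reading `list_snapshots_output()` on every node before retrying the backup covers all three items in the list.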
Hi, I have a very similar issue:
[2023-12-13 20:25:12,902] INFO: Resolving ip address
[2023-12-13 20:25:12,902] INFO: ip address to resolve 10.86.2.37
[2023-12-13 20:25:12,904] INFO: Monitoring provider is noop
[2023-12-13 20:25:12,915] INFO: Using credentials CensoredCredentials(access_key_id=m..a, secret_access_key=*****, region=us-east-1)
[2023-12-13 20:25:12,915] INFO: Using S3 URL https://s3server0101.server.lan:9000
[2023-12-13 20:25:13,073] INFO: Starting backup 202312132025
[2023-12-13 20:25:13,144] WARNING: ssl_storage_port is deprecated as of Apache Cassandra 4.x
[2023-12-13 20:25:13,605] INFO: Resolving ip address 10.86.2.37
[2023-12-13 20:25:13,606] INFO: ip address to resolve 10.86.2.37
[2023-12-13 20:25:13,607] INFO: Resolving ip address 10.86.2.25
[2023-12-13 20:25:13,608] INFO: ip address to resolve 10.86.2.25
[2023-12-13 20:25:13,611] INFO: Resolving ip address 10.86.2.37
[2023-12-13 20:25:13,611] INFO: ip address to resolve 10.86.2.37
[2023-12-13 20:25:13,612] INFO: Resolving ip address 10.86.2.53
[2023-12-13 20:25:13,612] INFO: ip address to resolve 10.86.2.53
[2023-12-13 20:25:13,619] INFO: Creating snapshots on all nodes
[2023-12-13 20:25:13,619] INFO: Executing "nodetool -Dcom.sun.jndi.rmiURLParsing=legacy --ssl -u USER -pw PASSWORD -h server0101.server.lan -p 7199 snapshot -t medusa-202312132025" on following nodes ['server0102.server.lan', 'server0101.server.lan', 'server0103.server.lan'] with a parallelism/pool size of 500
[2023-12-13 20:25:17,465] ERROR: Job executing "nodetool -Dcom.sun.jndi.rmiURLParsing=legacy --ssl -u USER -pw PASSWORD -h server0101.server.lan -p 7199 snapshot -t medusa-202312132025" ran and finished with errors on following nodes: ['server0101.server.lan', 'server0103.server.lan']
[2023-12-13 20:25:17,466] INFO: [server0101.server.lan] Requested creating snapshot(s) for [all keyspaces] with snapshot name [medusa-202312132025] and options {skipFlush=false}
[2023-12-13 20:25:17,467] INFO: server0101.server.lan-stdout: Requested creating snapshot(s) for [all keyspaces] with snapshot name [medusa-202312132025] and options {skipFlush=false}
[2023-12-13 20:25:17,467] INFO: [server0101.server.lan] [err] error: Snapshot medusa-202312132025 already exists.
[2023-12-13 20:25:17,468] INFO: server0101.server.lan-stderr: error: Snapshot medusa-202312132025 already exists.
[2023-12-13 20:25:17,468] INFO: [server0101.server.lan] [err] -- StackTrace --
[2023-12-13 20:25:17,469] INFO: server0101.server.lan-stderr: -- StackTrace --
[2023-12-13 20:25:17,469] INFO: [server0101.server.lan] [err] java.io.IOException: Snapshot medusa-202312132025 already exists.
[2023-12-13 20:25:17,470] INFO: server0101.server.lan-stderr: java.io.IOException: Snapshot medusa-202312132025 already exists.
[2023-12-13 20:25:17,470] INFO: [server0101.server.lan] [err] at org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:3994)
[2023-12-13 20:25:17,471] INFO: server0101.server.lan-stderr: at org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:3994)
[2023-12-13 20:25:17,472] INFO: [server0101.server.lan] [err] at org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:3908)
[2023-12-13 20:25:17,472] INFO: server0101.server.lan-stderr: at org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:3908)
[2023-12-13 20:25:17,473] INFO: [server0101.server.lan] [err] at jdk.internal.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
[2023-12-13 20:25:17,473] INFO: server0101.server.lan-stderr: at jdk.internal.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
[2023-12-13 20:25:17,474] INFO: [server0101.server.lan] [err] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2023-12-13 20:25:17,475] INFO: server0101.server.lan-stderr: at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2023-12-13 20:25:17,475] INFO: [server0101.server.lan] [err] at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2023-12-13 20:25:17,476] INFO: server0101.server.lan-stderr: at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2023-12-13 20:25:17,476] INFO: [server0101.server.lan] [err] at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
[2023-12-13 20:25:17,477] INFO: server0101.server.lan-stderr: at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
[2023-12-13 20:25:17,477] INFO: [server0101.server.lan] [err] at jdk.internal.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
[2023-12-13 20:25:17,477] INFO: server0101.server.lan-stderr: at jdk.internal.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
[2023-12-13 20:25:17,477] INFO: [server0101.server.lan] [err] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2023-12-13 20:25:17,478] INFO: server0101.server.lan-stderr: at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2023-12-13 20:25:17,478] INFO: [server0101.server.lan] [err] at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2023-12-13 20:25:17,478] INFO: server0101.server.lan-stderr: at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2023-12-13 20:25:17,478] INFO: [server0101.server.lan] [err] at java.base/sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:260)
[2023-12-13 20:25:17,479] INFO: server0101.server.lan-stderr: at java.base/sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:260)
[2023-12-13 20:25:17,479] INFO: [server0101.server.lan] [err] at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
[2023-12-13 20:25:17,479] INFO: server0101.server.lan-stderr: at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
[2023-12-13 20:25:17,479] INFO: [server0101.server.lan] [err] at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
[2023-12-13 20:25:17,480] INFO: server0101.server.lan-stderr: at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
[2023-12-13 20:25:17,480] INFO: [server0101.server.lan] [err] at java.management/com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
[2023-12-13 20:25:17,480] INFO: server0101.server.lan-stderr: at java.management/com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
[2023-12-13 20:25:17,480] INFO: [server0101.server.lan] [err] at java.management/com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
[2023-12-13 20:25:17,481] INFO: server0101.server.lan-stderr: at java.management/com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
[2023-12-13 20:25:17,481] INFO: [server0101.server.lan] [err] at java.management/com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
[2023-12-13 20:25:17,481] INFO: server0101.server.lan-stderr: at java.management/com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
[2023-12-13 20:25:17,481] INFO: [server0101.server.lan] [err] at java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:809)
[2023-12-13 20:25:17,482] INFO: server0101.server.lan-stderr: at java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:809)
[2023-12-13 20:25:17,482] INFO: [server0101.server.lan] [err] at java.management/com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
[2023-12-13 20:25:17,482] INFO: server0101.server.lan-stderr: at java.management/com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
[2023-12-13 20:25:17,482] INFO: [server0101.server.lan] [err] at java.management/com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
[2023-12-13 20:25:17,483] INFO: server0101.server.lan-stderr: at java.management/com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
[2023-12-13 20:25:17,483] INFO: [server0101.server.lan] [err] at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1466)
[2023-12-13 20:25:17,483] INFO: server0101.server.lan-stderr: at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1466)
[2023-12-13 20:25:17,483] INFO: [server0101.server.lan] [err] at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1307)
[2023-12-13 20:25:17,484] INFO: server0101.server.lan-stderr: at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1307)
[2023-12-13 20:25:17,484] INFO: [server0101.server.lan] [err] at java.base/java.security.AccessController.doPrivileged(Native Method)
[2023-12-13 20:25:17,484] INFO: server0101.server.lan-stderr: at java.base/java.security.AccessController.doPrivileged(Native Method)
[2023-12-13 20:25:17,484] INFO: [server0101.server.lan] [err] at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1406)
[2023-12-13 20:25:17,485] INFO: server0101.server.lan-stderr: at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1406)
[2023-12-13 20:25:17,485] INFO: [server0101.server.lan] [err] at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:827)
[2023-12-13 20:25:17,485] INFO: server0101.server.lan-stderr: at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:827)
[2023-12-13 20:25:17,485] INFO: [server0101.server.lan] [err] at java.base/jdk.internal.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
[2023-12-13 20:25:17,485] INFO: server0101.server.lan-stderr: at java.base/jdk.internal.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
[2023-12-13 20:25:17,486] INFO: [server0101.server.lan] [err] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2023-12-13 20:25:17,486] INFO: server0101.server.lan-stderr: at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2023-12-13 20:25:17,486] INFO: [server0101.server.lan] [err] at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2023-12-13 20:25:17,486] INFO: server0101.server.lan-stderr: at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2023-12-13 20:25:17,487] INFO: [server0101.server.lan] [err] at java.rmi/sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:359)
[2023-12-13 20:25:17,487] INFO: server0101.server.lan-stderr: at java.rmi/sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:359)
[2023-12-13 20:25:17,487] INFO: [server0101.server.lan] [err] at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:200)
[2023-12-13 20:25:17,487] INFO: server0101.server.lan-stderr: at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:200)
[2023-12-13 20:25:17,487] INFO: [server0101.server.lan] [err] at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:197)
[2023-12-13 20:25:17,488] INFO: server0101.server.lan-stderr: at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:197)
[2023-12-13 20:25:17,488] INFO: [server0101.server.lan] [err] at java.base/java.security.AccessController.doPrivileged(Native Method)
[2023-12-13 20:25:17,488] INFO: server0101.server.lan-stderr: at java.base/java.security.AccessController.doPrivileged(Native Method)
[2023-12-13 20:25:17,488] INFO: [server0101.server.lan] [err] at java.rmi/sun.rmi.transport.Transport.serviceCall(Transport.java:196)
[2023-12-13 20:25:17,488] INFO: server0101.server.lan-stderr: at java.rmi/sun.rmi.transport.Transport.serviceCall(Transport.java:196)
[2023-12-13 20:25:17,488] INFO: [server0101.server.lan] [err] at java.rmi/sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:562)
[2023-12-13 20:25:17,488] INFO: server0101.server.lan-stderr: at java.rmi/sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:562)
[2023-12-13 20:25:17,489] INFO: [server0101.server.lan] [err] at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:796)
[2023-12-13 20:25:17,489] INFO: server0101.server.lan-stderr: at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:796)
[2023-12-13 20:25:17,489] INFO: [server0101.server.lan] [err] at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:677)
[2023-12-13 20:25:17,489] INFO: server0101.server.lan-stderr: at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:677)
[2023-12-13 20:25:17,489] INFO: [server0101.server.lan] [err] at java.base/java.security.AccessController.doPrivileged(Native Method)
[2023-12-13 20:25:17,489] INFO: server0101.server.lan-stderr: at java.base/java.security.AccessController.doPrivileged(Native Method)
[2023-12-13 20:25:17,490] INFO: [server0101.server.lan] [err] at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:676)
[2023-12-13 20:25:17,490] INFO: server0101.server.lan-stderr: at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:676)
[2023-12-13 20:25:17,490] INFO: [server0101.server.lan] [err] at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
[2023-12-13 20:25:17,490] INFO: server0101.server.lan-stderr: at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
[2023-12-13 20:25:17,490] INFO: [server0101.server.lan] [err] at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[2023-12-13 20:25:17,490] INFO: server0101.server.lan-stderr: at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[2023-12-13 20:25:17,490] INFO: [server0101.server.lan] [err] at java.base/java.lang.Thread.run(Thread.java:829)
[2023-12-13 20:25:17,491] INFO: server0101.server.lan-stderr: at java.base/java.lang.Thread.run(Thread.java:829)
[2023-12-13 20:25:17,491] INFO: [server0101.server.lan] [err]
[2023-12-13 20:25:17,491] INFO: server0101.server.lan-stderr:
[2023-12-13 20:25:17,491] INFO: [server0103.server.lan] Requested creating snapshot(s) for [all keyspaces] with snapshot name [medusa-202312132025] and options {skipFlush=false}
[2023-12-13 20:25:17,491] INFO: server0103.server.lan-stdout: Requested creating snapshot(s) for [all keyspaces] with snapshot name [medusa-202312132025] and options {skipFlush=false}
[2023-12-13 20:25:17,491] INFO: [server0103.server.lan] [err] error: Snapshot medusa-202312132025 already exists.
[2023-12-13 20:25:17,492] INFO: server0103.server.lan-stderr: error: Snapshot medusa-202312132025 already exists.
[2023-12-13 20:25:17,492] INFO: [server0103.server.lan] [err] -- StackTrace --
[2023-12-13 20:25:17,492] INFO: server0103.server.lan-stderr: -- StackTrace --
[2023-12-13 20:25:17,492] INFO: [server0103.server.lan] [err] java.io.IOException: Snapshot medusa-202312132025 already exists.
[2023-12-13 20:25:17,492] INFO: server0103.server.lan-stderr: java.io.IOException: Snapshot medusa-202312132025 already exists.
[2023-12-13 20:25:17,492] INFO: [server0103.server.lan] [err] at org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:3994)
[2023-12-13 20:25:17,492] INFO: server0103.server.lan-stderr: at org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:3994)
[2023-12-13 20:25:17,493] INFO: [server0103.server.lan] [err] at org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:3908)
[2023-12-13 20:25:17,493] INFO: server0103.server.lan-stderr: at org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:3908)
[2023-12-13 20:25:17,493] INFO: [server0103.server.lan] [err] at jdk.internal.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
[2023-12-13 20:25:17,493] INFO: server0103.server.lan-stderr: at jdk.internal.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
[2023-12-13 20:25:17,493] INFO: [server0103.server.lan] [err] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2023-12-13 20:25:17,493] INFO: server0103.server.lan-stderr: at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2023-12-13 20:25:17,494] INFO: [server0103.server.lan] [err] at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2023-12-13 20:25:17,494] INFO: server0103.server.lan-stderr: at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2023-12-13 20:25:17,494] INFO: [server0103.server.lan] [err] at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
[2023-12-13 20:25:17,494] INFO: server0103.server.lan-stderr: at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
[2023-12-13 20:25:17,494] INFO: [server0103.server.lan] [err] at jdk.internal.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
[2023-12-13 20:25:17,494] INFO: server0103.server.lan-stderr: at jdk.internal.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
[2023-12-13 20:25:17,494] INFO: [server0103.server.lan] [err] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2023-12-13 20:25:17,495] INFO: server0103.server.lan-stderr: at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2023-12-13 20:25:17,495] INFO: [server0103.server.lan] [err] at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2023-12-13 20:25:17,495] INFO: server0103.server.lan-stderr: at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2023-12-13 20:25:17,495] INFO: [server0103.server.lan] [err] at java.base/sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:260)
[2023-12-13 20:25:17,495] INFO: server0103.server.lan-stderr: at java.base/sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:260)
[2023-12-13 20:25:17,495] INFO: [server0103.server.lan] [err] at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
[2023-12-13 20:25:17,496] INFO: server0103.server.lan-stderr: at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
[2023-12-13 20:25:17,496] INFO: [server0103.server.lan] [err] at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
[2023-12-13 20:25:17,496] INFO: server0103.server.lan-stderr: at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
[2023-12-13 20:25:17,496] INFO: [server0103.server.lan] [err] at java.management/com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
[2023-12-13 20:25:17,496] INFO: server0103.server.lan-stderr: at java.management/com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
[2023-12-13 20:25:17,496] INFO: [server0103.server.lan] [err] at java.management/com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
[2023-12-13 20:25:17,497] INFO: server0103.server.lan-stderr: at java.management/com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
[2023-12-13 20:25:17,497] INFO: [server0103.server.lan] [err] at java.management/com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
[2023-12-13 20:25:17,497] INFO: server0103.server.lan-stderr: at java.management/com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
[2023-12-13 20:25:17,497] INFO: [server0103.server.lan] [err] at java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:809)
[2023-12-13 20:25:17,497] INFO: server0103.server.lan-stderr: at java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:809)
[2023-12-13 20:25:17,497] INFO: [server0103.server.lan] [err] at java.management/com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
[2023-12-13 20:25:17,497] INFO: server0103.server.lan-stderr: at java.management/com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
[2023-12-13 20:25:17,498] INFO: [server0103.server.lan] [err] at java.management/com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
[2023-12-13 20:25:17,498] INFO: server0103.server.lan-stderr: at java.management/com.sun.jmx.remote.security.MBeanServerAccessController.invoke(MBeanServerAccessController.java:468)
[2023-12-13 20:25:17,498] INFO: [server0103.server.lan] [err] at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1466)
[2023-12-13 20:25:17,498] INFO: server0103.server.lan-stderr: at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1466)
[2023-12-13 20:25:17,498] INFO: [server0103.server.lan] [err] at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1307)
[2023-12-13 20:25:17,498] INFO: server0103.server.lan-stderr: at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1307)
[2023-12-13 20:25:17,498] INFO: [server0103.server.lan] [err] at java.base/java.security.AccessController.doPrivileged(Native Method)
[2023-12-13 20:25:17,499] INFO: server0103.server.lan-stderr: at java.base/java.security.AccessController.doPrivileged(Native Method)
[2023-12-13 20:25:17,499] INFO: [server0103.server.lan] [err] at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1406)
[2023-12-13 20:25:17,499] INFO: server0103.server.lan-stderr: at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1406)
[2023-12-13 20:25:17,499] INFO: [server0103.server.lan] [err] at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:827)
[2023-12-13 20:25:17,499] INFO: server0103.server.lan-stderr: at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:827)
[2023-12-13 20:25:17,499] INFO: [server0103.server.lan] [err] at java.base/jdk.internal.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
[2023-12-13 20:25:17,500] INFO: server0103.server.lan-stderr: at java.base/jdk.internal.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
[2023-12-13 20:25:17,500] INFO: [server0103.server.lan] [err] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2023-12-13 20:25:17,500] INFO: server0103.server.lan-stderr: at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2023-12-13 20:25:17,500] INFO: [server0103.server.lan] [err] at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2023-12-13 20:25:17,500] INFO: server0103.server.lan-stderr: at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2023-12-13 20:25:17,500] INFO: [server0103.server.lan] [err] at java.rmi/sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:359)
[2023-12-13 20:25:17,501] INFO: server0103.server.lan-stderr: at java.rmi/sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:359)
[2023-12-13 20:25:17,501] INFO: [server0103.server.lan] [err] at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:200)
[2023-12-13 20:25:17,501] INFO: server0103.server.lan-stderr: at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:200)
[2023-12-13 20:25:17,501] INFO: [server0103.server.lan] [err] at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:197)
[2023-12-13 20:25:17,501] INFO: server0103.server.lan-stderr: at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:197)
[2023-12-13 20:25:17,501] INFO: [server0103.server.lan] [err] at java.base/java.security.AccessController.doPrivileged(Native Method)
[2023-12-13 20:25:17,502] INFO: server0103.server.lan-stderr: at java.base/java.security.AccessController.doPrivileged(Native Method)
[2023-12-13 20:25:17,502] INFO: [server0103.server.lan] [err] at java.rmi/sun.rmi.transport.Transport.serviceCall(Transport.java:196)
[2023-12-13 20:25:17,502] INFO: server0103.server.lan-stderr: at java.rmi/sun.rmi.transport.Transport.serviceCall(Transport.java:196)
[2023-12-13 20:25:17,502] INFO: [server0103.server.lan] [err] at java.rmi/sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:562)
[2023-12-13 20:25:17,502] INFO: server0103.server.lan-stderr: at java.rmi/sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:562)
[2023-12-13 20:25:17,502] INFO: [server0103.server.lan] [err] at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:796)
[2023-12-13 20:25:17,502] INFO: server0103.server.lan-stderr: at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:796)
[2023-12-13 20:25:17,503] INFO: [server0103.server.lan] [err] at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:677)
[2023-12-13 20:25:17,503] INFO: server0103.server.lan-stderr: at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:677)
[2023-12-13 20:25:17,503] INFO: [server0103.server.lan] [err] at java.base/java.security.AccessController.doPrivileged(Native Method)
[2023-12-13 20:25:17,503] INFO: server0103.server.lan-stderr: at java.base/java.security.AccessController.doPrivileged(Native Method)
[2023-12-13 20:25:17,503] INFO: [server0103.server.lan] [err] at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:676)
[2023-12-13 20:25:17,503] INFO: server0103.server.lan-stderr: at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:676)
[2023-12-13 20:25:17,504] INFO: [server0103.server.lan] [err] at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
[2023-12-13 20:25:17,504] INFO: server0103.server.lan-stderr: at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
[2023-12-13 20:25:17,504] INFO: [server0103.server.lan] [err] at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[2023-12-13 20:25:17,504] INFO: server0103.server.lan-stderr: at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[2023-12-13 20:25:17,504] INFO: [server0103.server.lan] [err] at java.base/java.lang.Thread.run(Thread.java:829)
[2023-12-13 20:25:17,504] INFO: server0103.server.lan-stderr: at java.base/java.lang.Thread.run(Thread.java:829)
[2023-12-13 20:25:17,504] INFO: [server0103.server.lan] [err]
[2023-12-13 20:25:17,505] INFO: server0103.server.lan-stderr:
[2023-12-13 20:25:17,505] ERROR: Some nodes failed to create the snapshot.
[2023-12-13 20:25:17,505] ERROR: This error happened during the cluster backup: Some nodes failed to create the snapshot.
Traceback (most recent call last):
File "/usr/share/cassandra-medusa/lib/python3.9/site-packages/medusa/backup_cluster.py", line 71, in orchestrate
backup.execute(cql_session_provider)
File "/usr/share/cassandra-medusa/lib/python3.9/site-packages/medusa/backup_cluster.py", line 153, in execute
self._create_snapshots()
File "/usr/share/cassandra-medusa/lib/python3.9/site-packages/medusa/backup_cluster.py", line 170, in _create_snapshots
raise Exception(err_msg)
Exception: Some nodes failed to create the snapshot.
[2023-12-13 20:25:17,506] ERROR: Something went wrong! Attempting to clean snapshots and exit.
[2023-12-13 20:25:17,506] INFO: Executing "nodetool -Dcom.sun.jndi.rmiURLParsing=legacy --ssl -u USER -pw PASSWORD -h server0101.server.lan -p 7199 clearsnapshot -t medusa-202312132025" on following nodes ['server0102.server.lan', 'server0101.server.lan', 'server0103.server.lan'] with a parallelism/pool size of 1
[2023-12-13 20:25:25,603] INFO: Job executing "nodetool -Dcom.sun.jndi.rmiURLParsing=legacy --ssl -u USER -pw PASSWORD -h server0101.server.lan -p 7199 clearsnapshot -t medusa-202312132025" ran and finished Successfully on all nodes.
[2023-12-13 20:25:25,619] INFO: All nodes successfully cleared their snapshot.
Additionally, a single-node backup works without issues (I think). At least there are no errors reported. And copy/pasting the nodetool snapshot command from the logs works without any issues :(.
Hi @ap1und1
From the log entry below, it seems you have provided a hostname for nodetool_host in the medusa.ini file. Because of this, nodetool -Dcom.sun.jndi.rmiURLParsing=legacy --ssl -u USER -pw PASSWORD -h server0101.server.lan -p 7199 snapshot -t medusa-202312132025 is run on all the nodes against server0101.server.lan. So it is like running nodetool snapshot on server0101.server.lan three times. That is the reason it fails with "Snapshot medusa-202312132025 already exists."
[2023-12-13 20:25:13,619] INFO: Executing "nodetool -Dcom.sun.jndi.rmiURLParsing=legacy --ssl -u USER -pw PASSWORD -h server0101.server.lan -p 7199 snapshot -t medusa-202312132025" on following nodes ['server0102.server.lan', 'server0101.server.lan', 'server0103.server.lan'] with a parallelism/pool size of 500
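To make the explanation above concrete, here is a small simulation of the effect. It is a sketch only: `run_snapshot_on_all` is a hypothetical function that models the snapshot bookkeeping with in-memory sets and runs the nodes sequentially, whereas the real job runs them in parallel, so which node succeeds first may differ.

```python
from typing import Dict, List, Optional, Set


def run_snapshot_on_all(nodes: List[str], tag: str,
                        nodetool_host: Optional[str] = None) -> Dict[str, bool]:
    """Simulate taking a snapshot from every node; True means success."""
    snapshots: Dict[str, Set[str]] = {}
    results: Dict[str, bool] = {}
    for node in nodes:
        # With a pinned host, every node talks to the same Cassandra instance.
        target = nodetool_host or node
        existing = snapshots.setdefault(target, set())
        if tag in existing:
            results[node] = False  # models "Snapshot <tag> already exists."
        else:
            existing.add(tag)
            results[node] = True
    return results


nodes = ["server0102.server.lan", "server0101.server.lan", "server0103.server.lan"]
tag = "medusa-202312132025"

# Pinned to one host: the first call succeeds, the other two fail.
pinned = run_snapshot_on_all(nodes, tag, nodetool_host="server0101.server.lan")

# Each node targeting itself: all three succeed.
per_node = run_snapshot_on_all(nodes, tag)
```

This matches the log above, where the job reported errors on two of the three nodes.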
Hello, I am experimenting with cassandra-medusa with the local storage provider and plan to use it in our Cassandra cluster for database backup and recovery.
My environment is:
When I run the medusa backup-cluster command, it first reports:
and then it consistently fails with:
Here is the output when I check the stderr file in the temporary directory Medusa created. It says "Snapshot medusa-202310270129 already exists.", but in fact this is the first time I have triggered the backup, and all I did was run medusa backup-cluster.
Here is the complete config I am using:
Any suggestions are appreciated, thanks!
┆Issue is synchronized with this Jira Story by Unito ┆Issue Number: MED-21