thelastpickle / cassandra-medusa

Apache Cassandra Backup and Restore Tool

java.io.IOException: Snapshot medusa-<....> already exists. #390

Open kiddom-kq opened 3 years ago

kiddom-kq commented 3 years ago


I've been working on standing up Medusa for the last few days. Most of the issues that I hit already have workarounds documented, but this one didn't, so I'm creating this issue to document the problem for future benefit.


On an admin node, I would run:

$ medusa backup-cluster ...

This would come back with a failure message that appeared to be the same on each node in the cluster.

When looking at the /tmp/medusa-job-<...>/stderr file on each node, I would see something like this:

<...>
[2021-08-11 21:56:17,250] DEBUG: https://s3-us-east-2.amazonaws.com:443 "HEAD /some-bucket-name-here/ClusterNameHere/ip-10-69-42-1.us-east-2.compute.internal/None/meta/differential HTTP/1.1" 404 0
[2021-08-11 21:56:17,251] DEBUG: [Storage] Getting object ClusterNameHere/ip-10-69-42-1.us-east-2.compute.internal/None/meta/incremental
[2021-08-11 21:56:17,255] DEBUG: https://s3-us-east-2.amazonaws.com:443 "HEAD /some-bucket-name-here HTTP/1.1" 200 0
[2021-08-11 21:56:17,267] DEBUG: https://s3-us-east-2.amazonaws.com:443 "HEAD /some-bucket-name-here/ClusterNameHere/ip-10-69-42-1.us-east-2.compute.internal/None/meta/incremental HTTP/1.1" 404 0
[2021-08-11 21:56:17,267] INFO: Node ip-10-69-42-1.us-east-2.compute.internal does not have latest backup
[2021-08-11 21:56:17,267] INFO: Starting backup
[2021-08-11 21:56:17,268] INFO: Creating snapshot
[2021-08-11 21:56:17,268] DEBUG: Executing: nodetool -u someUserNameHere -pw somePasswordStringHere snapshot -t medusa-202108112127
error: Snapshot medusa-202108112127 already exists.
-- StackTrace --
java.io.IOException: Snapshot medusa-202108112127 already exists.

A bizarre error, because the nodetool snapshot command had been executed cluster-wide just seconds before the error showed up. Something was wrong with the logic that determines which stage of the backup process a node is at: the node behaved as though no snapshot had been created yet, which was obviously not true.

Looking closer, None is showing up in the S3 object path, which indicates something broke earlier and the failure wasn't caught. Dynamically typed languages are wonderful... right up until they're not :/.
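
As an illustration of how that can happen in Python (a hypothetical sketch, not Medusa's actual code), a value that is unexpectedly None gets silently stringified when it is interpolated into an object key:

backup_name = None  # e.g. a lookup that quietly failed and returned None
fqdn = "ip-10-69-42-1.us-east-2.compute.internal"

# str.format() happily turns None into the literal string "None" inside the key:
key = "ClusterNameHere/{}/{}/meta/differential".format(fqdn, backup_name)
print(key)  # ClusterNameHere/ip-10-69-42-1.us-east-2.compute.internal/None/meta/differential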

When looking through the S3 bucket, it was clear that some data was being uploaded. There was a directory structure that Medusa had created and some of the metadata files like tokenmap.json had accurate data in them.

However, the total size of all objects in the backup path was about 1 MB, roughly a thousand times smaller than the size of the data on disk in the cluster.

The problem is not S3 access; Medusa has no problem creating objects in the bucket for internal use, so something must be going wrong either at the stage where the backup files are collected or at the stage where they are uploaded. The logs seemed to back this hunch up:

[2021-08-12 18:52:14,738] CRITICAL: FIND ME: the result of snapshot.find_dirs() is :[]
[2021-08-12 18:52:14,739] INFO: Updating backup index
[2021-08-12 18:52:15,518] INFO: Backup done
[2021-08-12 18:52:15,519] INFO: - Started: 2021-08-12 18:52:08
                        - Started extracting data: 2021-08-12 18:52:12
                        - Finished: 2021-08-12 18:52:15
[2021-08-12 18:52:15,519] INFO: - Real duration: 0:00:03.411112 (excludes time waiting for other nodes)
[2021-08-12 18:52:15,519] INFO: - 0 files, 0.00 B
[2021-08-12 18:52:15,519] INFO: - 0 files copied from host

So I started looking through the Medusa codebase to better understand where nodetool snapshot files are uploaded to S3.

The CRITICAL: FIND ME: log is something I added while debugging / tracing the code flow. The snapshot.find_dirs() call returning an empty list explained why the snapshot data files never made it into S3.

I spent an embarrassing amount of time stepping through and isolating the code that searches through the configured Cassandra data locations looking for snapshot directories, but eventually figured out why my test code returned the correct set of files and directories while Medusa didn't: permissions.
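
For context, that search boils down to something like the following (a simplified, hypothetical sketch of the idea, not Medusa's exact implementation): Cassandra keeps snapshots at <data_dir>/<keyspace>/<table>/snapshots/<tag>, and the candidate directories are collected with a glob over that layout.

import glob
import os

# Hypothetical sketch of the snapshot-directory search, not Medusa's exact code.
# Cassandra lays snapshots out as <data_dir>/<keyspace>/<table>/snapshots/<tag>.
def find_snapshot_dirs(data_dir, tag):
    pattern = os.path.join(data_dir, "*", "*", "snapshots", tag)
    return [d for d in glob.glob(pattern) if os.path.isdir(d)]

# Run standalone as a user that can read the data directories, this returns the
# expected directories; run as the medusa user, it came back empty.
print(find_snapshot_dirs("/cassandra/data", "medusa-202108112127"))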

In short, the medusa backup job was running on the Cassandra node as the medusa user. The cassandra user owns all directories:

root@some-cassandra-node-01:/usr/share/cassandra-medusa/lib/python3.8/site-packages/medusa# ls -lah /cassandra/
total 40K
drwxr-xr-x  7 root      root      4.0K Aug 11 20:15 .
drwxr-xr-x 19 root      root      4.0K Aug 11 20:14 ..
drwxr-x---  2 cassandra cassandra 4.0K Aug 12 18:24 commitlog
drwxr-x---  9 cassandra cassandra 4.0K Aug 11 20:15 data
drwxr-x---  2 cassandra cassandra 4.0K Aug 11 20:15 hints
drwx------  2 root      root       16K Aug 11 20:15 lost+found
drwxr-x---  2 cassandra cassandra 4.0K Aug 13 16:22 saved_caches

If the medusa user tries to list the contents of /cassandra/data/*/*/snapshots, an ls fails with a Permission denied error... except that when Medusa does the equivalent listing with glob(), the error is silently discarded and an empty list comes back.
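
That behaviour is easy to reproduce on its own (as a regular, non-root user): Python's glob() swallows the PermissionError and just returns an empty list.

import glob
import os
import stat
import tempfile

# Reproduce glob() silently ignoring unreadable directories (run as a non-root user).
root = tempfile.mkdtemp()
data = os.path.join(root, "data")
os.makedirs(os.path.join(data, "some_keyspace", "some_table", "snapshots", "medusa-test"))

os.chmod(data, 0)  # make the data directory unreadable, like the cassandra-owned dirs above
print(glob.glob(os.path.join(data, "*", "*", "snapshots", "*")))  # prints [] -- no exception

os.chmod(data, stat.S_IRWXU)  # restore permissions so the temp directory can be removed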

A quick sudo chown -R cassandra:medusa later, and backup operations started working:

[2021-08-13 23:18:27,186] INFO: A new backup test02 was created on all nodes.
[2021-08-13 23:18:27,187] INFO: Backup duration: 10484
[2021-08-13 23:18:27,187] INFO: Backup of the cluster done.

Hopefully that helps somebody! I am slowly making my way through Medusa's test setup to get a better idea of how it all works, so I can contribute some patches focused on sanity checks and logging, with the intent of failing as soon as possible rather than letting silent failures propagate!
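
As a first idea of the kind of check I have in mind (a hypothetical sketch, not an actual patch), refusing to continue when the snapshot search comes back empty would have turned this silent failure into a loud one:

import logging

# Hypothetical sketch of a sanity check, not actual Medusa code.
def ensure_snapshot_dirs(snapshot_dirs, tag):
    if not snapshot_dirs:
        # A snapshot was just created, so an empty result almost certainly means the
        # search itself failed (e.g. a permission error silently hidden by glob()).
        raise RuntimeError(
            "No snapshot directories found for {}; check that the backup user "
            "can read the Cassandra data directories".format(tag)
        )
    logging.info("Found %d snapshot directories for %s", len(snapshot_dirs), tag)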


jsanda commented 2 years ago

Please add your planning poker estimate with ZenHub @adejanovski

rzvoncek commented 4 months ago

This is very much still an issue, except we no longer get the CRITICAL log line. The backup passes through, uploading 0 files.

It'd be nice to print a warning and fail properly instead of pretending all is dandy.