JeremyGrosser / tablesnap

Uses inotify to monitor Cassandra SSTables and upload them to S3
BSD 2-Clause "Simplified" License

tableslurp usage? #57

Closed msakrejda closed 5 years ago

msakrejda commented 9 years ago

I'm having a hard time restoring more than one table at a time via tableslurp. Is this possible? My tablesnap invocation is:

tablesnap --recursive --auto-add --backup --exclude=/snapshots/\|-tmp-\|cassandra.log --with-sse --name node-name --prefix my-prefix/ my-bucket /cassandra-data

When I invoke tableslurp with the "origin" argument pointing to a keyspace, I get the error Cannot find anything to restore from my-bucket:my-prefix:/path. If I add the table to the origin, that seems to restore fine.

I'm not sure if it's relevant, but my tablesnap-uploaded files in S3 have -listdir.json files only for directories with files in them--directories with only other subdirectories do not have these files.

JeremyGrosser commented 9 years ago

@thekad any ideas?

thekad commented 9 years ago

Can you paste the tableslurp invocation?

msakrejda commented 9 years ago

Sure, something that works for a single sstable from the sample schema in the DataStax docs is:

tableslurp --owner my-owner --group my-group --name backups/foo/bar my-bucket /cassandra-data/data/my-keyspace/songs-e9789df04ddf11e59b7ff5170a7d912e /restore-path

The credentials are in the environment. If I cut the "origin" argument path off at a higher level to try to get a full keyspace restore (or a full node restore--is that possible?), I get the "Cannot find anything to restore..." message.

thekad commented 9 years ago

Hrm, I don't think that matches the usage from tableslurp --help; try the following first:

export TDEBUG=1 # for extra debugging output
tableslurp --owner <local uid to chmod to> --group <local gid to chgrp to> --name <fqdn> <bucket name> <prefix path> <target directory>

You can try dropping the --owner and --group first then chmod it later, for simplicity. Explaining the elements:

--name = the hostname you used when uploading stuff to s3; because tablesnap namespaces the backups based on the hostname, this must match whatever hostname the sstables got uploaded from
<bucket name> = the name of the bucket you are restoring from
<prefix path> = the path inside the namespaced directory in the bucket you want to restore from
<target directory> = your local directory to download sstables to

So, if you want to restore the most recent tablesnapped backup from hostname gloop.work.com from your bucket named com.work.cassandra-backups to the target directory /mnt/cassandra/lib and the original sstables lived under /var/lib/cassandra you should use:

tableslurp --name gloop.work.com com.work.cassandra-backups /var/lib/cassandra /mnt/cassandra/lib

The above should match any files in s3://com.work.cassandra-backups/gloop.work.com:/var/lib/cassandra
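
If you want to sanity-check what tableslurp will try to match, you can list the keys under that prefix directly. A rough sketch with s3cmd (assuming it's installed and configured with the same credentials; the AWS CLI works just as well):

# list every key tablesnap uploaded for this host under the original data path
s3cmd ls --recursive s3://com.work.cassandra-backups/gloop.work.com:/var/lib/cassandra/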

The tool obviously needs work if the usage is convoluted, but let's try this first.

thekad commented 9 years ago

After taking another look... I think your only mistake was --name

msakrejda commented 9 years ago

Thanks, but I think I actually redacted it a little too eagerly. I'm calling tablesnap with --prefix backups/foo/ --name node-name and tableslurp with --name backups/foo/node-name, since as far as I can tell tableslurp doesn't support --prefix. (What you are calling "prefix path" above I believe is called "origin" in the help blurb, and it behaves differently: it comes after the node name in the S3 key, rather than before it like tablesnap's --prefix.)

With this, tableslurp seems to be looking at the right node paths, since as far as I can tell tablesnap just prepends the prefix to the node name when determining S3 paths. I can definitely get tableslurp to restore individual tables if I pass it a full path directly to the table data in S3 without changing anything else, and I can confirm it's hitting the endpoints I expect by inspecting my bucket via the S3 console.

Incidentally, I'm using --prefix with tablesnap because I'm backing up multiple clusters to the same bucket, and I want to organize backups by cluster as well as by node.
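
Concretely, the pairing I'm describing looks roughly like this (bucket, prefix, and paths are placeholders; --exclude dropped for brevity):

# back up; as far as I can tell the resulting keys look like backups/foo/node-name:/cassandra-data/...
tablesnap --recursive --auto-add --backup --with-sse --name node-name --prefix backups/foo/ my-bucket /cassandra-data

# restore one table; tableslurp has no --prefix, so the prefix gets folded into --name
tableslurp --name backups/foo/node-name my-bucket /cassandra-data/data/my-keyspace/songs-e9789df04ddf11e59b7ff5170a7d912e /restore-path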

I've tried to follow the source on how restores iterate over all the relevant files, but it seems to look for -listdir.json files directly in the "origin" path to start with, and my tablesnap backup has none at the keyspace level or above--they only seem to show up inside directories that actually contain other files.

I've tried removing the user and group flags to simplify things as you'd suggested, but still get the same error. I've also tried TDEBUG=1, but there's no additional information (it does print

tableslurp [2015-09-07 22:16:21,698] DEBUG Connecting to s3
tableslurp [2015-09-07 22:16:21,750] DEBUG Connected to s3

before it fails as before, but that's the only DEBUG-level output).

thekad commented 9 years ago

I see, yeah I believe the prefix support in tablesnap came after tableslurp got written, and that support hasn't been added. Anyway, yeah I believe tableslurp does require the directory you're trying to pull to have a -listdir.json file in it... Have you tried passing --file /some/path/to/a/sstable-listdir.json to tableslurp? as per https://github.com/JeremyGrosser/tablesnap/blob/master/tableslurp#L117

msakrejda commented 9 years ago

I see, yeah I believe the prefix support in tablesnap came after tableslurp got written, and that support hasn't been added.

Ah, got it.

Have you tried passing --file /some/path/to/a/sstable-listdir.json to tableslurp? as per https://github.com/JeremyGrosser/tablesnap/blob/master/tableslurp#L117

I have, but I only have -listdir.json files in the table-level S3 directories, so I'd have to recover each table in each keyspace individually, which seems like a lot of work to orchestrate a full recovery... Am I doing something wrong in my tablesnap invocation such that I don't have any keyspace-level (or above) -listdir.json files? Or is this a limitation of the tool? And if the latter, what's the recommended way of doing a full node restore (if any)?

jwojcik-zz commented 9 years ago

I have exactly the same question.

How would one do a full restore of all column families from a point in time? I don't think that is supported at the moment.

Given a path to the keyspace and a time, tableslurp could then find, in the directories below it, all of the json files that were created immediately prior to that time and use those files to restore the data.

Default behavior would be for time=now. The most recent json file for each column family would be used in this case.
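
A rough sketch of that selection step done outside tableslurp with s3cmd (untested; bucket, hostname, and paths are placeholders):

# list every -listdir.json key under one keyspace, then keep only the newest
# one per column-family directory (i.e. time=now); filtering on the date/time
# columns before the awk step would give a point-in-time variant
s3cmd ls --recursive s3://my-bucket/node-name:/var/lib/cassandra/data/my-keyspace/ \
  | grep -F -- '-listdir.json' \
  | sort -k1,2 \
  | awk '{ n = split($4, p, "/"); latest[p[n-1]] = $4 } END { for (d in latest) print latest[d] }'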

Good work on tablesnap btw. We are excited to put it into production.

thekad commented 9 years ago

So the -listdir.json files store a copy of the entire contents of the directory at that point in time, meaning that if you point tableslurp at a single file, it will pull all the files from that keyspace that existed at the moment the sstable got snapshotted. @uhoh-itsmaciek can you check whether the contents of your -listdir.json files match that description?

msakrejda commented 9 years ago

@thekad all my -listdir.json files (well, based on a sample) only have entries for files in the same directory. Since my keyspace directories contain only subdirectories and no other files, those directories don't have any -listdir.json files at all.

msakrejda commented 9 years ago

And if I'm reading this correctly, that seems to be intended? I believe that code only looks in the directory containing each new SSTable file, right?

tamsky commented 9 years ago

I'm also confused as to the cleanest way to back up all CFs using tablesnap and then restore them all using tableslurp (also unknown are the exact mechanics for loading them once slurped to disk and getting C* to pick them up).

Can someone outline a successful tried-and-true recipe that includes both, since tableslurp would seem to depend on how tablesnap is invoked?

I'm not sure if I should be using snapshots at all, and if I should, the mechanics are not clear.

jwojcik-zz commented 9 years ago

"Can someone outline a successful tried-and-true recipe that includes both, since tableslurp would seem to depend on how tablesnap is invoked?"

Backing up all using tablesnap is easy enough. Just give the path to the keyspace on the command line. Tablesnap will look at everything below.

As far as a full restore goes, you could probably just use s3cmd and copy everything back from s3 to the local machine, then run a repair on that node. I'd think Cassandra would just ignore any old files that it didn't need. To clean up the old files, you would delete any files not listed in the most recent tablesnap-created json file in each CF directory. Of course, I'd test this to see if it works.
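
Roughly something like this, reusing the bucket/hostname layout from the example earlier in the thread (untested sketch, paths are placeholders):

# pull everything tablesnap uploaded for this node back into the data directory
s3cmd sync s3://com.work.cassandra-backups/gloop.work.com:/var/lib/cassandra/data/ /var/lib/cassandra/data/
# then let the node reconcile with the rest of the cluster
nodetool repair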

For snapshots - You should use snapshots if you would like a few days of snapshots on the filesystem. This way you won't have to go to s3 after a restore. Right now, I am not using them.

But if you do use snapshots, remember to --exclude the snapshot and backups directories so that these don't get copied up to s3. I use:

--exclude '/snapshots/|/backups/|-tmp$'

Your snapshots should be ignored by tablesnap as it backs up the sstables as they are created. Since the snapshot is just a hardlink to some previously created sstable, by the time the hardlink gets created tablesnap would have already copied that same file to s3. If you don't --exclude the snapshots, tablesnap will copy them S3 and you'll have the same sstable in S3 twice.

But if you do decide to use snapshots with --exclude in addition to tablesnap, you'll need to remember to clear out the old snapshots so you don't run out of space on your filesystem.
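
For example, with standard Cassandra tooling (nothing tablesnap-specific; run on the node whose snapshots you want to drop):

# remove all on-disk snapshots for this node; a snapshot tag or keyspace
# can also be passed to limit what gets cleared
nodetool clearsnapshot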

I think the --backup flag is for when tablesnap starts - it will back up any files that are under the given path.

tamsky commented 9 years ago

@jwojcik thanks for your reply

Please correct me if I'm reading it wrong, but it seems like after a certain point in the text, opposite day begins:

So, more or less your snapshots should be ignored by table~~slurp~~snap as it backs up the sstables as they are created. Since the snapshot is just a hardlink to some previously created sstable, by the time the hardlink gets created table~~slurp~~snap would have already copied it to s3.

But if you do decide to use snapshots in addition to table~~slurp~~snap, you need to remember to clear out the old snapshots so your don't run out of space on your filesystem.

I think the --backup flag is for when table~~slurp~~snap starts - it will backup any files that are under the given path.

tamsky commented 9 years ago

The OP's original question remains unanswered:

I'm having a hard time restoring more than one table at time via tableslurp. Is this possible?

Can someone shed some light on it?

msakrejda commented 9 years ago

Just to follow up for the others who've joined in seeking the same answer--I'm still very much interested but I am not actively investigating this by myself right now. I can provide additional info or try out experimental patches, though.

The proposed alternative s3cmd-based restore mechanism is clever, but it could be dramatically slower unless the purging interval is really aggressive, since there will be a lot of extra (irrelevant) data to download.

jwojcik-zz commented 9 years ago

I made the edits you suggested and clarified a few points. Thanks for letting me know.

On Thu, Sep 17, 2015 at 7:21 PM, Marc Tamsky notifications@github.com wrote:

@jwojcik https://github.com/jwojcik thanks for your reply https://github.com/JeremyGrosser/tablesnap/issues/57#issuecomment-141254900

Please correct me if I'm reading it wrong, but it seems like after a certain point in the text, opposite day begins:

So, more or less your snapshots should be ignored by tableslurpsnap as it backs up the sstables as they are created. Since the snapshot is just a hardlink to some previously created sstable, by the time the hardlink gets created tableslurpsnap would have already copied it to s3.

But if you do decide to use snapshots in addition to tableslurpsnap, you need to remember to clear out the old snapshots so your don't run out of space on your filesystem.

I think the --backup flag is for when tableslurpsnap starts - it will backup any files that are under the given path.

— Reply to this email directly or view it on GitHub https://github.com/JeremyGrosser/tablesnap/issues/57#issuecomment-141271175 .

thekad commented 9 years ago

Fwiw I do agree this is a problem, and I had a bit of a tableslurp rewrite going at some point; I should probably dust it off.

jwojcik-zz commented 9 years ago

"I'm having a hard time restoring more than one table at time via tableslurp. Is this possible?"

This does not appear to be current functionality. But with a list of tables, it is easy enough to create a bash one-liner to do it.

ls /var/lib/cassandra/data/keyspace/ | xargs -t -I {} tableslurp -k AWS_KEY -s AWS_SECRET bucketname /var/lib/cassandra/data/keyspace/{} /var/lib/cassandra/data/keyspace/{}

thekad commented 9 years ago

That should be correct. This is because (currently) tablesnap only uploads -listdir.json files per sstable, so only directories with sstables can be... explored? by tableslurp.

"I'm having a hard time restoring more than one table at time via tableslurp. Is this possible?"

This does not appear to be current functionality. But with a list of tables, it is easy enough to create a bash one liner to do it.

ls /var/lib/cassandra/data/keyspace/ | xargs -t -I {} tableslurp -k AWS_KEY -s AWS_SECRET bucketname /var/lib/cassandra/data/keyspace/{} /var/lib/cassandra/data/keyspace/{}

— Reply to this email directly or view it on GitHub https://github.com/JeremyGrosser/tablesnap/issues/57#issuecomment-141470331 .

tightly-clutched commented 8 years ago

I can monitor and back up a whole keyspace with tablesnap using the -r option. Is there any effort to make tableslurp act recursively to restore a complete backup created with tablesnap? I'm guessing it's low priority, since this issue is still open :) and I don't think it's going to be easy either, now that Cassandra appends UUIDs to the folder names.

juiceblender commented 7 years ago

I created a PR https://github.com/JeremyGrosser/tablesnap/pull/86 though I'm not sure if you will still need it 1.5 years down...

tightly-clutched commented 7 years ago

I've been using DataStax OpsCenter to back up my cluster, and it's not satisfactory. I can try tablesnap/tableslurp again on a dev cluster.