thelastpickle / cassandra-medusa

Apache Cassandra Backup and Restore Tool
Apache License 2.0

Restoring a single table with sstableloader option also restores system_distributed #314

Open baileygm opened 3 years ago

baileygm commented 3 years ago

MEDUSA_VERSION=0.9.1

When using the restore-node option to restore to a new cluster with sstableloader and a single specified non-system table, I notice that the system_distributed keyspace is always restored in addition to the specified table.

The command line being used has the following format: medusa --fqdn restore-node --backup-name --use-sstableloader --remote --keep-auth --table

Is there an option to prevent the restoration of system tables?

baileygm commented 3 years ago

I see that there is an --ignore-system-keyspaces option for the "medusa download" command.

Is it possible to perform a node restore in two steps?

  1. Use restore-node download with the --ignore-system-keyspaces option set to false to download the S3 data to the node
  2. Use restore-node --use-sstableloader to restore the downloaded data to the node

This way I could prevent the restoration of the system_distributed table, but I cannot see an option to prevent the system tables from being downloaded during the restore-node operation.

adejanovski commented 3 years ago

I think we'd rather include system_distributed and system_traces in the list of keyspaces that should be explicitly specified if you restore select keyspaces/tables as they're not mandatory. That's something that should be easy to do, I'll see if I can send a PR quickly.
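The behaviour proposed here could look roughly like the sketch below. This is only an illustration under assumptions, not Medusa's actual code: the names `OPTIONAL_SYSTEM_KEYSPACES` and `drop_unrequested_optional` are hypothetical, and any other keyspace filtering is assumed to happen elsewhere.

```python
# Hypothetical sketch, not Medusa's implementation.
# Keyspaces that are not mandatory and so must be named explicitly
# when only selected keyspaces/tables are being restored.
OPTIONAL_SYSTEM_KEYSPACES = {"system_distributed", "system_traces"}


def drop_unrequested_optional(keyspaces, requested):
    """Remove optional system keyspaces that were not explicitly requested.

    keyspaces: keyspace names found in the backup.
    requested: set of keyspace names the user asked for; empty means
               a full restore, in which case nothing is dropped.
    Filtering of other keyspaces is left to the caller.
    """
    if not requested:
        return list(keyspaces)
    return [
        ks for ks in keyspaces
        if ks not in OPTIONAL_SYSTEM_KEYSPACES or ks in requested
    ]
```

With this shape, a restore that names only an application keyspace would leave system_distributed and system_traces out, while naming one of them explicitly would still restore it.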

baileygm commented 3 years ago

That would be excellent. My concern is that if I need to restore a single table or a single keyspace to my production system using the sstableloader option, I would not want any other non-specified system or non-system tables to be restored as well.

baileygm commented 3 years ago

I've implemented a temporary fix by simply replacing line 227 in restore_node.py:

    if keyspace == 'system' or keyspace == 'system_schema':

with:

    if keyspace == 'system' or keyspace == 'system_schema' or keyspace == 'system_distributed' or keyspace == 'system_traces':

Not sure if this is what you had in mind, but it fixes the problem for me.
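For reference, the same check can be written more compactly as a membership test. This is a minimal sketch that assumes the condition is what causes a keyspace to be skipped; `SKIPPED_KEYSPACES` and `is_skipped` are hypothetical names, not identifiers from restore_node.py.

```python
# Hypothetical names; equivalent to the chained `or` comparison in the workaround.
SKIPPED_KEYSPACES = ("system", "system_schema", "system_distributed", "system_traces")


def is_skipped(keyspace):
    """Return True when the keyspace matches one of the listed system keyspaces."""
    return keyspace in SKIPPED_KEYSPACES
```

Note that a check like this matches system_distributed and system_traces unconditionally, so they would be affected even if named explicitly on the command line.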