[Open] vladzcloudius opened this issue 2 years ago
@mmatczuk @slivne @eliransin FYI
Unless I'm missing something, and unless there is an easy workaround, this is critical and means that the backup restoration procedure described in our docs has been broken since the introduction of download-files.
According to @tgrabiec using the CQL statements does not work in all conditions. And system_schema shall be the last thing to snapshot. It should be perhaps restored with upload dir.
> According to @tgrabiec using the CQL statements does not work in all conditions.

IIRC, you're referring to the fact that restoring schema from CQL loses dropped_columns. So it only works for creating a fresh table, not for importing old data.
> According to @tgrabiec using the CQL statements does not work in all conditions.
>
> IIRC, you're referring to the fact that restoring schema from CQL loses dropped_columns. So it only works for creating a fresh table, not for importing old data.

I'm not sure I'm following, @tgrabiec. Why would we care about dropped_columns when we restore user data?
What about my comment about table IDs? Are you restoring them on the destination node too? What about the IDs of the system_xx tables themselves? You kind of rely on the fact that the data in the backup is older than the data on the node, but what if it's not?
Restoring a schema via CQL has been the standard procedure for both Scylla and Cassandra from day one, and it's written all over our and Cassandra's documentation. If there is an issue with this procedure I'd like to know: https://docs.scylladb.com/operating-scylla/procedures/backup-restore/restore/#procedure
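For reference, a minimal sketch of that documented CQL-based flow (host names and the file path below are placeholders):

```bash
# On any node of the source cluster: dump the full schema to a file.
cqlsh SOURCE_NODE_IP -e "DESC SCHEMA" > /tmp/schema.cql

# On the destination cluster: recreate all keyspaces and tables
# before any sstables are loaded.
cqlsh DEST_NODE_IP -f /tmp/schema.cql
```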
Pushing sstables of system_schema tables was always a last-resort hack when nothing else worked, and we always did it only inside the same cluster. It feels very uncomfortable that we are making it standard now.
> According to @tgrabiec using the CQL statements does not work in all conditions. And system_schema shall be the last thing to snapshot. It should be perhaps restored with upload dir.

Everything should be restored via the upload dir - this is our recommended way of uploading sstables into a cluster.
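For context, a minimal sketch of the upload-dir flow for one user table, assuming the schema already exists on the destination node (keyspace/table names, the backup path and the table UUID are placeholders):

```bash
# Copy the backed-up sstables into the table's upload directory on the destination node.
# Note that the directory name carries the table ID as known to the destination cluster.
cp /path/to/backup/keyspace2/table2/* \
   /var/lib/scylla/data/keyspace2/table2-<table-uuid>/upload/

# Ask Scylla to pick up the newly placed sstables.
nodetool refresh keyspace2 table2
```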
This is the error that I get when I load all system_schema tables from the upload dir first and then try to load a user table's data from the upload dir:
TASK [Load system_schema tables data from the upload directory] ****************************************************************************************************************************************************************************************
changed: [35.196.92.133] => (item= system_schema.tables )
changed: [35.237.179.49] => (item= system_schema.keyspaces )
changed: [35.196.92.133] => (item= system_schema.columns )
changed: [35.237.179.49] => (item= system_schema.tables )
changed: [35.196.92.133] => (item= system_schema.scylla_tables )
changed: [35.237.179.49] => (item= system_schema.columns )
changed: [35.196.92.133] => (item= system_schema.keyspaces )
changed: [35.237.179.49] => (item= system_schema.scylla_tables )
changed: [35.196.92.133] => (item= system_schema.views )
changed: [35.237.179.49] => (item= system_schema.aggregates )
changed: [35.196.92.133] => (item= system_schema.functions )
changed: [35.237.179.49] => (item= system_schema.computed_columns )
changed: [35.196.92.133] => (item= system_schema.aggregates )
changed: [35.237.179.49] => (item= system_schema.dropped_columns )
changed: [35.196.92.133] => (item= system_schema.view_virtual_columns )
changed: [35.237.179.49] => (item= system_schema.functions )
changed: [35.196.92.133] => (item= system_schema.types )
changed: [35.237.179.49] => (item= system_schema.indexes )
changed: [35.196.92.133] => (item= system_schema.indexes )
changed: [35.237.179.49] => (item= system_schema.triggers )
changed: [35.196.92.133] => (item= system_schema.triggers )
changed: [35.196.92.133] => (item= system_schema.dropped_columns )
changed: [35.237.179.49] => (item= system_schema.types )
changed: [35.196.92.133] => (item= system_schema.computed_columns )
changed: [35.237.179.49] => (item= system_schema.view_virtual_columns )
changed: [35.237.179.49] => (item= system_schema.views )
TASK [Load the rest of tables data from the upload directory] ******************************************************************************************************************************************************************************************
failed: [35.196.92.133] (item= keyspace2.table2 ) => {"ansible_loop_var": "item", "changed": true, "cmd": "nodetool refresh keyspace2 table2 \n", "delta": "0:00:00.805716", "end": "2022-02-03 14:05:18.614227", "item": " keyspace2.table2 ", "msg": "non-zero return code", "rc": 1, "start": "2022-02-03 14:05:17.808511", "stderr": "", "stderr_lines": [], "stdout": "Using /etc/scylla/scylla.yaml as the config file\nnodetool: Scylla API server HTTP POST to URL '/storage_service/sstables/keyspace2' failed: Keyspace keyspace2 Does not exist\nSee 'nodetool help' or 'nodetool help <command>'.", "stdout_lines": ["Using /etc/scylla/scylla.yaml as the config file", "nodetool: Scylla API server HTTP POST to URL '/storage_service/sstables/keyspace2' failed: Keyspace keyspace2 Does not exist", "See 'nodetool help' or 'nodetool help <command>'."]}
skipping: [35.196.92.133] => (item= system_schema.tables )
skipping: [35.196.92.133] => (item= system_schema.columns )
Which means that Scylla doesn't pick up the new schema right away, and I'm not surprised.
@mmatczuk I believe we need to use a standard procedure when we restore schema and stay away from hacks.
I see that the backup snapshot already backs up the schema. I couldn't find a way to fetch it using the SM or SM-agent APIs, however. Is there a way?
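For what it's worth, as far as I know Scylla writes a schema.cql file into each table's snapshot directory when the snapshot is taken, so one way to grab the schema (outside of the SM APIs) is straight from the snapshot files; the data path below is the default:

```bash
# Locate the schema files that the snapshots themselves carry (if your version writes them).
find /var/lib/scylla/data -path '*/snapshots/*/schema.cql'
```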
> Why would we care about dropped_columns when we restore user data?
The sstable reader will fail to read sstables with unknown columns, unless they are marked as dropped in the schema.
So if you're restoring the backup into a fresh cluster, you need to restore schema tables.
If you're just rolling back to a previous snapshot, and you're fine with using the latest schema, you don't do anything with the schema tables, nor with the CQL.
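To make the dropped_columns point concrete, this is the metadata that a plain CQL re-creation of the schema would not carry over (the keyspace name is a placeholder):

```bash
# Columns that were dropped from a table are remembered here. A table recreated from a
# DESC SCHEMA dump knows nothing about them, so old sstables that still contain those
# columns can no longer be read.
cqlsh -e "SELECT * FROM system_schema.dropped_columns WHERE keyspace_name = 'keyspace2';"
```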
> What about my comment about table IDs? Are you restoring them on the destination node too? What about the IDs of the system_xx tables themselves? You kind of rely on the fact that the data in the backup is older than the data on the node, but what if it's not?
I don't understand your comment about IDs.
Tables don't change IDs, ever. That includes system_xx tables.
Of course, you either restore schema tables, or use CQL, not both.
> I don't understand your comment about IDs.
> Tables don't change IDs, ever. That includes system_xx tables.
When you create a fresh cluster which you will eventually upload sstables to, it is going to create all kinds of system_xx/system tables, and each is going to get its own ID. And AFAIK these are going to be unique and different from the IDs of the same tables in the source cluster. And these IDs are going to be stored in the system_schema tables.
Am I missing something, @tgrabiec? And if not, I hope this makes more sense now. And that's why I don't understand how uploading system_schema sstables from one cluster into a different cluster can be seen as a safe procedure in the general case.
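To illustrate the point, a quick check one can run on a node (assuming the default data directory): the table ID shows up both in the on-disk directory name and in system_schema.tables, and the two have to agree for the node to find the data.

```bash
# The suffix after the '-' in the directory name is the table ID this node was created with...
ls -d /var/lib/scylla/data/system_auth/roles-*

# ...and this is the ID that the (possibly restored) schema claims the table has.
cqlsh -e "SELECT id FROM system_schema.tables WHERE keyspace_name = 'system_auth' AND table_name = 'roles';"
```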
Hmmm... I see that the IDs of all system_xx tables I checked are the same on my local cluster and on other installations.

@tgrabiec Could you, please, remind me what the input for the ID generator is for KS, CF, and columns?
Local (not distributed) system tables have static IDs, which are calculated as name-based UUIDs (so they depend on the name only).
Distributed tables have IDs assigned during creation as new, unique, time-based UUIDs.
Keyspaces are identified by name.
Columns are identified by name.
Uploading system_schema is safe if you want the same set of tables in the target cluster as in the source cluster. It's not safe if you have some other tables in the target cluster.
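A quick way to see the difference is to run the query below on both the source and the destination cluster and compare the id column; tables in the distributed system keyspaces get a fresh time-based UUID at creation, so their IDs will generally differ between the two clusters.

```bash
# IDs of tables in the distributed system keyspaces are minted at cluster creation time,
# so they differ from cluster to cluster (unlike local system tables, whose IDs are name-based).
cqlsh -e "SELECT keyspace_name, table_name, id FROM system_schema.tables WHERE keyspace_name IN ('system_auth', 'system_distributed', 'system_traces');"
```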
Thanks, Tomek. One question though:

> Distributed tables have IDs assigned during creation as new, unique, time-based UUIDs.

This means that tables in the system_distributed, system_auth, system_traces and system_distributed_everywhere keyspaces are going to have different IDs from those at the source. How can pushing system_schema data from the source be safe in their context, @tgrabiec?
> How can pushing system_schema data from the source be safe in their context, @tgrabiec?
How is it not safe, if the intent is to restore the whole cluster's state?
@eliransin @slivne I think this is critical
> How is it not safe, if the intent is to restore the whole cluster's state?
Because system_auth and the other distributed system keyspaces' table IDs are going to be different, and these IDs are encoded in the names of the corresponding data directories.

Please read the Ansible playbook in question to see the full procedure. In a gist, we start with the following (all on the destination cluster):

1) Shut the cluster down.
2) Wipe it clean and bootstrap with the same token ring as in the source cluster (see the sketch after this comment). At this point system_auth, system_distributed and the other keyspaces I mentioned above are going to be created with IDs that are different from those in the source cluster.
3) Upload the data, including system_schema, which in particular contains the old IDs of the tables above - BUM!!!
I hope it makes more sense now, @tgrabiec.
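For completeness, a rough sketch of how step 2 above pins a destination node to the source node's token ring; this is the generic initial_token approach rather than the playbook's exact code, and the parsing of nodetool's output is best-effort:

```bash
# On the source node: collect the tokens this node owns into a comma-separated list.
nodetool info -T | grep '^Token' | awk '{print $3}' | paste -sd, - > /tmp/tokens.txt

# On the corresponding destination node, before its first boot: pin the same tokens.
echo "initial_token: $(cat /tmp/tokens.txt)" | sudo tee -a /etc/scylla/scylla.yaml
```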
We don't want to invest our effort into the Ansible script, as the restore is supposed to be done with Manager since the 3.1 release.
> We don't want to invest our effort into the Ansible script, as the restore is supposed to be done with Manager since the 3.1 release.

Not sure how closing this issue is related to Ansible, @karol-kokoszka. Could you, please, clarify? In particular, how do you plan to restore the schema in SM 3.1?
> Could you, please, clarify? In particular, how do you plan to restore the schema in SM 3.1?

SM 3.1 has the sctool restore --restore-schema command, which restores the schema from backed-up SSTables.
It does so by uploading all backed-up SSTables from all backed-up nodes into each node in the restored cluster. This creates a lot of data duplication, but because schema SSTables are small, this shouldn't be an issue. The reason for doing this is that it also simulates a repair, so every node should end up with the correct, identical schema, and we don't need to worry about edge scenarios in terms of compaction/gc_grace_seconds.
This procedure requires the user to restart the whole cluster after the restore, so that nodes can pick up the restored schema.
As for SSTable ID problems: when downloading schema SSTables from all backed-up nodes into a given node, SM renames them (by changing their ID) so that we can avoid name conflicts. SM uses load and stream for uploading the data into the cluster, so it takes care of the rest of the problems associated with SSTable IDs.
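For the record, the flow as described looks roughly like this from the operator's side (cluster name, backup location and snapshot tag are placeholders; the exact flags should be checked against sctool restore --help):

```bash
# Restore the schema from the backup, then perform a rolling restart of the cluster
# so that nodes pick up the restored schema.
sctool restore -c my-cluster -L s3:my-backup-bucket -T sm_20230101123456UTC --restore-schema

# Once the cluster is back up, restore the user data.
sctool restore -c my-cluster -L s3:my-backup-bucket -T sm_20230101123456UTC --restore-tables
```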
> As for SSTable ID problems: when downloading schema SSTables from all backed-up nodes into a given node, SM renames them (by changing their ID) so that we can avoid name conflicts. SM uses load and stream for uploading the data into the cluster, so it takes care of the rest of the problems associated with SSTable IDs.

@karol-kokoszka You are confusing SSTable IDs with table IDs. This issue is all about the latter and has nothing to do with the former.
The algorithm you have described doesn't solve the issue in question at all, because the table ID is part of the content of the system_schema tables.
Please re-read the opening message and let me know if you still need clarifications on the matter.
Version: 2.6

Description
According to the --dry-run output, an agent is downloading not only user tables but also the system_xxx keyspaces, and in particular system_schema.

Overwriting the original system_schema doesn't seem to be something that is safe to do, in particular because the old tables' UUIDs (including those of the system_xxx tables) are stored in system_schema.tables.

Instead, the restoration procedure should include a restoration of the original schema using CQL CREATE ... commands (by simply running the output of DESC SCHEMA, which is supposed to be a part of a backup).
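For reference, the --dry-run listing referred to above comes from the agent's download-files command, roughly like this (the location and snapshot tag are placeholders, and the exact flag spelling is an assumption to be verified with scylla-manager-agent download-files --help):

```bash
# Show what the agent would download for the given snapshot without writing anything;
# in 2.6 this listing includes the system_xxx keyspaces, system_schema among them.
scylla-manager-agent download-files -L s3:my-backup-bucket -T sm_20220101123456UTC --dry-run
```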