scylladb / scylla-tools-java

Apache Cassandra, supplying tools for Scylla
Apache License 2.0

backport uuid identifier related changes to support uuid based sstable identifiers #333

Open tchaikov opened 1 year ago

tchaikov commented 1 year ago

the uuid-based sstable identifier was introduced by Cassandra upstream in https://github.com/apache/cassandra/commit/0040fea3797ea3e497691e9d1e2660711c60ac4d, but our fork is based on 3.x. so to support this feature, we need to merge the upstream changes or rebase on top of them. or, as put by Avi in https://github.com/scylladb/scylladb/pull/13932#issuecomment-1582277800, quoted here as well:

We should probably drop sstableloader. We might develop a standalone format converter from Cassandra formats to our format, based on the Cassandra codebase. So migration would use either the Spark migrator, or (Cassandra sstables -> Scylla sstables) -> load'n'stream. This gives us one less reason to keep having tools/java/.

Note that Cassandra is now at "oa" format: https://github.com/apache/cassandra/commit/f16fb6765b8a3ff8f49accf61c908791520c0d6e.

without this change, the following tests would fail after enabling the uuid_sstable_identifier_enabled option introduced by https://github.com/scylladb/scylladb/pull/13932

the tests above are only a subset. the following tools should be updated
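to illustrate why tools written against integer generations may trip over the new names, here is a minimal Python sketch that tells the two identifier styles apart. it is not taken from any of the tools discussed here. the filename layout (`<version>-<identifier>-<format>-<component>`), the base36 `xxxx_xxxx_<18 chars>` shape of the uuid-based identifier, the sample filenames, and the names `parse_sstable_name` and `SSTableName` are all assumptions made for illustration.

```python
import re
from typing import NamedTuple, Union

# ASSUMPTION: uuid-based identifiers look like "3fw2_0tj4_46w3k2cpidnirvjy7k",
# i.e. three base36 groups of 4, 4 and 18 characters joined by underscores.
_UUID_ID = r"[0-9a-z]{4}_[0-9a-z]{4}_[0-9a-z]{18}"

# ASSUMPTION: file names follow <version>-<identifier>-<format>-<component>,
# e.g. "me-42-big-Data.db" (integer generation) or
# "me-3fw2_0tj4_46w3k2cpidnirvjy7k-big-Data.db" (uuid-based identifier).
_NAME = re.compile(
    rf"^(?P<version>[a-z]{{2}})-(?P<ident>\d+|{_UUID_ID})"
    r"-(?P<format>[a-z]+)-(?P<component>.+)$"
)


class SSTableName(NamedTuple):
    version: str
    ident: Union[int, str]  # int for a generation, str for a uuid-based id
    format: str
    component: str


def parse_sstable_name(filename: str) -> SSTableName:
    """Split an sstable file name, accepting both identifier styles."""
    m = _NAME.match(filename)
    if m is None:
        raise ValueError(f"unrecognized sstable file name: {filename!r}")
    raw = m["ident"]
    # A parser that unconditionally calls int(raw) would raise on a
    # uuid-based identifier; keep those as opaque strings instead.
    ident = int(raw) if raw.isdigit() else raw
    return SSTableName(m["version"], ident, m["format"], m["component"])


if __name__ == "__main__":
    print(parse_sstable_name("me-42-big-Data.db"))
    print(parse_sstable_name("me-3fw2_0tj4_46w3k2cpidnirvjy7k-big-Data.db"))
```

keeping the uuid-based identifier as an opaque string is a deliberate simplification: the sketch only recognizes the form and does not decode the underlying timeuuid.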

tchaikov commented 1 year ago

see also https://github.com/scylladb/scylladb/pull/13932

denesb commented 1 year ago

I want to patch out sstabledump from all dtests. But I don't know when I will get around to doing that, and I certainly don't want to block the uuid migration on this.

tchaikov commented 1 year ago

@denesb hi Botond, thank you for the remarks! the recent changes of

can enable us to proceed without being blocked by this issue or your initiative to ditch sstabledump. and your change will allow us to revert some of these dtest changes.

tchaikov commented 1 year ago

@denesb hi Botond, i am going to add a wrapper around scylla dump-data to replace sstabledump, what do you think? i think once we have https://github.com/scylladb/scylladb/pull/14726, we will be able to drop sstabledump.

juliayakovlev commented 1 year ago

sstablemetadata also fails on uuid-based sstable identifier

tchaikov commented 1 year ago

sstablemetadata also fails on uuid-based sstable identifier

thank you. added to the list.

tchaikov commented 1 year ago

an alternative to backporting the uuid identifier changes is to implement the support right in scylla. see https://github.com/scylladb/scylladb/issues/14856

fruch commented 1 year ago

@tchaikov

this is breaking the rolling upgrade tests as well: https://github.com/scylladb/scylla-cluster-tests/blob/6e3d34cd1b9fc6533a4e5a3874fd36ad401b9a81/upgrade_test.py#L698

and a test for GC of tombstones: https://github.com/scylladb/scylla-cluster-tests/blob/6e3d34cd1b9fc6533a4e5a3874fd36ad401b9a81/longevity_tombstone_gc_test.py#L42

mykaul commented 10 months ago

@denesb hi Botond, i am going to add a wrapper around scylla dump-data to replace sstabledump, what do you think? i think once we have scylladb/scylladb#14726 . we will be able to drop sstabledump.

@tchaikov - is this still the plan?

tchaikov commented 10 months ago

@mykaul no. in an offline discussion with Botond, he warned me that the output format of scylla dump-data is different from that of sstabledump. so unless the wrapper i imagined translates the former into the latter, it's a no-go. instead, we are actively replacing sstabledump in our tests. we will mark it deprecated, and then plan to remove it in favor of scylla sstable after a grace period.

mykaul commented 10 months ago

@mykaul no, in an offline discussion with Botond, he warned me that the output format of scylla dump-data was different from that of sstabledump. so unless we translate the former into the latter in the wrapper i imagined, it's a no-go. so we are actively replacing sstabledump in our tests. and mark it deprecated, and then plan to remove it in favor of scylla sstable after a grace period.

I don't mind deprecating it, but it has docs implications. See https://opensource.docs.scylladb.com/stable/operating-scylla/admin-tools/sstabledump.html for example.

tchaikov commented 10 months ago

https://opensource.docs.scylladb.com/stable/operating-scylla/admin-tools/sstabledump.html

@mykaul hi Yaniv, please take a look at the master version https://opensource.docs.scylladb.com/master/operating-scylla/admin-tools/sstabledump.html . we are deprecating it.