instaclustr / cassandra-ttl-remover

Tool for rewriting SSTables to not contain TTLs
https://instaclustr.com

TTL Remover not working in Cassandra 4. #13

Open kmrmanish23 opened 1 year ago

kmrmanish23 commented 1 year ago

Run.sh for Cassandra 4:

    # For Cassandra 3 and 4.0
    CLASSPATH=$CLASSPATH./impl/target/ttl-remover.jar:./cassandra-4/target/ttl-remover-cassandra-4.jar

    # change versions of jars on classpath to target 3 or 4
    # change --cassandra-version if necessary
    java -javaagent:/opt/cassandra-ttl-remover/buddy-agent/target/byte-buddy-agent.jar \
        -cp "/opt/cassandra-ttl-remover/impl/target/ttl-remover.jar:/opt/cassandra-ttl-remover/cassandra-4/target/ttl-remover-cassandra-4.jar" \
        $JVM_OPTS \
        com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI \
        --cassandra-version=4 \
        --sstables \
        /var/lib/cassandra/data/cycling \
        --output-path \
        /var/lib/cassandra/data/cycling/stripped \
        --cql \
        'CREATE TABLE IF NOT EXISTS test.test (id uuid, name text, surname text, PRIMARY KEY (id)) WITH default_time_to_live = 10;'

=============================================================================

    Connected to Test Cluster at 127.0.0.1:9042
    [cqlsh 6.0.0 | Cassandra 4.0.7 | CQL spec 3.4.5 | Native protocol v5]
    Use HELP for help.
    cqlsh> show version;
    [cqlsh 6.0.0 | Cassandra 4.0.7 | CQL spec 3.4.5 | Native protocol v5]
    cqlsh>

Below is the error log from running run.sh.

    root@:/opt/cassandra-ttl-remover# sh run.sh
    SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
    SLF4J: Defaulting to no-operation (NOP) logger implementation
    SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
    Exception in thread "main" java.util.ServiceConfigurationError: com.instaclustr.cassandra.ttl.SSTableTTLRemover: com.instaclustr.cassandra.ttl.Cassandra4TTLRemover Unable to get public no-arg constructor
        at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:582)
        at java.base/java.util.ServiceLoader.getConstructor(ServiceLoader.java:673)
        at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNextService(ServiceLoader.java:1233)
        at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNext(ServiceLoader.java:1265)
        at java.base/java.util.ServiceLoader$2.hasNext(ServiceLoader.java:1300)
        at java.base/java.util.ServiceLoader$3.hasNext(ServiceLoader.java:1385)
        at java.base/java.util.Iterator.forEachRemaining(Iterator.java:132)
        at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
        at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
        at com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI.getTTLRemover(TTLRemoverCLI.java:134)
        at com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI.run(TTLRemoverCLI.java:100)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1919)
        at picocli.CommandLine.access$1100(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2332)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2326)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2291)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2159)
        at picocli.CommandLine.execute(CommandLine.java:2058)
        at com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI.execute(TTLRemoverCLI.java:119)
        at com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI.main(TTLRemoverCLI.java:77)
        at com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI.main(TTLRemoverCLI.java:73)
    Caused by: java.lang.NoClassDefFoundError: org/apache/cassandra/db/lifecycle/ILifecycleTransaction
        at java.base/java.lang.Class.getDeclaredConstructors0(Native Method)
        at java.base/java.lang.Class.privateGetDeclaredConstructors(Class.java:3137)
        at java.base/java.lang.Class.getConstructor0(Class.java:3342)
        at java.base/java.lang.Class.getConstructor(Class.java:2151)
        at java.base/java.util.ServiceLoader$1.run(ServiceLoader.java:660)
        at java.base/java.util.ServiceLoader$1.run(ServiceLoader.java:657)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at java.base/java.util.ServiceLoader.getConstructor(ServiceLoader.java:668)
        ... 23 more
    Caused by: java.lang.ClassNotFoundException: org.apache.cassandra.db.lifecycle.ILifecycleTransaction
        at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
        at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)

=================================================================

    echo $JAVA_HOME
    /opt/zulu11.60.19-ca-jdk11.0.17-linux_x64
    root@# echo $CASSANDRA_HOME

    root@:/opt/cassandra-ttl-remover# echo $CLASSPATH
    /opt/cassandra-ttl-remover-1.1.2/impl/target/ttl-remover.jar:/opt/cassandra-ttl-remover-1.1.2/cassandra-4/target/ttl-remover-cassandra-4.jar

kmrmanish23 commented 1 year ago

To reproduce the issue, follow the steps below.

1) git clone https://github.com/instaclustr/cassandra-ttl-remover
2) mvn clean install -DskipTests
3) Modify run.sh as attached: run.sh.txt

I get the error below after running it:

    root@:/opt/cassandra-ttl-remover# sh run.sh
    SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
    SLF4J: Defaulting to no-operation (NOP) logger implementation
    SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
    ............
    Caused by: java.lang.ClassNotFoundException: org.apache.cassandra.db.lifecycle.ILifecycleTransaction.

Kindly suggest if I missed something.

smiklosovic commented 1 year ago

Hi @kmrmanish23 ,

it seems like you are not propagating $CLASSPATH like here:

https://github.com/instaclustr/cassandra-ttl-remover/blob/master/run.sh#L72

You have only this:

    -cp "/opt/cassandra-ttl-remover/impl....... \

where is $CLASSPATH?
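For illustration, a minimal sketch of what propagating the classpath could look like. The `build_classpath` helper and the `$CASSANDRA_HOME/lib` layout are assumptions made for this example, not the contents of the repository's run.sh. The point is only that Cassandra's own jars have to end up on `-cp` next to the two ttl-remover jars; otherwise classes such as `org.apache.cassandra.db.lifecycle.ILifecycleTransaction` cannot be loaded.

```shell
#!/bin/sh
# Hypothetical helper: join every jar in a directory into a ':'-terminated
# classpath prefix, so it can be prepended as "$CLASSPATH./impl/...".
build_classpath() {
    lib_dir=$1
    cp=""
    for jar in "$lib_dir"/*.jar; do
        # Skip the unexpanded glob when the directory contains no jars.
        [ -e "$jar" ] || continue
        cp="$cp$jar:"
    done
    printf '%s' "$cp"
}

# Assumed layout: Cassandra's jars live in $CASSANDRA_HOME/lib.
# CLASSPATH=$(build_classpath "$CASSANDRA_HOME/lib")
# java -cp "$CLASSPATH./impl/target/ttl-remover.jar:./cassandra-4/target/ttl-remover-cassandra-4.jar" ...
```

Because the prefix ends in `:`, it can be followed directly by `./impl/target/ttl-remover.jar`, which matches the `$CLASSPATH./impl/...` form used in the script.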

kmrmanish23 commented 1 year ago

Thanks for the quick response. I modified the run.sh file. PFA the same: modified run.sh.txt

Running run.sh now creates a new directory, shown below, containing the system.log and debug.log files:

    drwxr-xr-x 2 root root 4096 Jan 31 08:55 cassandra.logdir_IS_UNDEFINED

Below is the content of the debug.log and system.log files. However, no action is taken on the SSTables.

Kindly suggest if I missed something.

    root@:/opt/cassandra-ttl-remover# cd cassandra.logdir_IS_UNDEFINED/
    root@:/opt/cassandra-ttl-remover/cassandra.logdir_IS_UNDEFINED# cat system.log
    INFO [main] 2023-01-31 08:48:12,600 JarManifestVersionProvider.java:57 - ttl-remove version: ttl-remove development build, Build time: unknown, Git commit: unknown
    INFO [main] 2023-01-31 08:52:37,201 JarManifestVersionProvider.java:57 - ttl-remove version: ttl-remove development build, Build time: unknown, Git commit: unknown
    INFO [main] 2023-01-31 08:54:50,229 JarManifestVersionProvider.java:57 - ttl-remove version: ttl-remove development build, Build time: unknown, Git commit: unknown
    root@:/opt/cassandra-ttl-remover/cassandra.logdir_IS_UNDEFINED# cat debug.log
    INFO [main] 2023-01-31 08:48:12,600 JarManifestVersionProvider.java:57 - ttl-remove version: ttl-remove development build, Build time: unknown, Git commit: unknown
    INFO [main] 2023-01-31 08:52:37,201 JarManifestVersionProvider.java:57 - ttl-remove version: ttl-remove development build, Build time: unknown, Git commit: unknown
    INFO [main] 2023-01-31 08:54:50,229 JarManifestVersionProvider.java:57 - ttl-remove version: ttl-remove development build, Build time: unknown, Git commit: unknown
    root@:/opt/cassandra-ttl-remover/cassandra.logdir_IS_UNDEFINED#

smiklosovic commented 1 year ago

You are most probably hitting this issue; I cannot do anything about it.

https://issues.apache.org/jira/browse/CASSANDRA-17773

What is "PFA the same"?

What do you mean by "no action is taken"? What is in /var/lib/cassandra/data/cycling/stripped?
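The directory name `cassandra.logdir_IS_UNDEFINED` is what Logback produces when its configuration refers to a `cassandra.logdir` property that has not been set. A minimal sketch of defining that property before the tool starts, assuming the script passes `$JVM_OPTS` to `java` as in the command quoted earlier in this thread; the `with_logdir` helper and the `/var/log/cassandra` path are illustrative only and are not part of run.sh:

```shell
# Hypothetical helper: append -Dcassandra.logdir=<dir> to a JVM options
# string unless a cassandra.logdir value is already present.
with_logdir() {
    opts=$1
    dir=$2
    case "$opts" in
        *-Dcassandra.logdir=*) printf '%s' "$opts" ;;
        "")                    printf '%s' "-Dcassandra.logdir=$dir" ;;
        *)                     printf '%s' "$opts -Dcassandra.logdir=$dir" ;;
    esac
}

# Usage (the path is an example):
# JVM_OPTS=$(with_logdir "$JVM_OPTS" /var/log/cassandra)
```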

kmrmanish23 commented 1 year ago

Really sorry for asking such a basic question regarding the CLASSPATH.

After running "mvn clean install -DskipTests", all the related jars are generated in the respective "target" folders under the "impl", "buddy-agent" and "cassandra-X" folders.
Below are the extracted files and folders.

    root@localhost:/opt/cassandra-ttl-remover# pwd
    /opt/cassandra-ttl-remover
    root@qolsysjci-Latitude-5530:/opt/cassandra-ttl-remover# ls -lrt
    total 56
    -rw-r--r-- 1 root root 7749 Jan 30 12:37 README.adoc
    -rw-r--r-- 1 root root 4854 Jan 30 12:37 pom.xml
    -rwxr-xr-x 1 root root 7112 Jan 30 12:37 mvnw
    drwxr-xr-x 4 root root 4096 Jan 30 12:38 buddy-agent
    drwxr-xr-x 4 root root 4096 Jan 30 12:38 cassandra-2
    drwxr-xr-x 4 root root 4096 Jan 30 12:38 cassandra-3
    drwxr-xr-x 4 root root 4096 Jan 30 12:38 cassandra-4
    drwxr-xr-x 4 root root 4096 Jan 30 12:38 cassandra-4.1
    drwxr-xr-x 4 root root 4096 Jan 31 13:20 impl
    drwxr-xr-x 2 root root 4096 Jan 31 14:06 cassandra.logdir_IS_UNDEFINED
    -rwxr-xr-x 1 root root 2732 Jan 31 14:58 run.sh
    root@qolsysjci-Latitude-5530:/opt/cassandra-ttl-remover#

My environment:

    CASSANDRA_HOME=/home/manish/cassandra/apache-cassandra-4.0.7
    JAVA_HOME=/opt/zulu11.60.19-ca-jdk11.0.17-linux_x64

What should my $CLASSPATH value be?

Currently it is set to:

    echo $CLASSPATH
    /opt/cassandra-ttl-remover-1.1.2/impl/target/ttl-remover.jar:/opt/cassandra-ttl-remover-1.1.2/cassandra-4/target/ttl-remover-cassandra-4.jar

kmrmanish23 commented 1 year ago

> You are most probably hitting this issue; I cannot do anything about it.
>
> https://issues.apache.org/jira/browse/CASSANDRA-17773

This link is not working.

> What is "PFA the same"?

"Please find attached". I have attached the modified run.sh.

> What do you mean by "no action is taken"? What is in /var/lib/cassandra/data/cycling/stripped?

This folder is empty. It is meant to hold the copies of the SSTables created as output.

kmrmanish23 commented 1 year ago

Based on the issue reported in https://issues.apache.org/jira/browse/CASSANDRA-17773, I changed CASSANDRA_LOG_DIR to point to "/var/log/cassandra". The rest looks fine.

I am now getting the error below:

    root@0:/var/log/cassandra# cd /opt/cassandra-ttl-remover/
    root@:/opt/cassandra-ttl-remover# sh run.sh
    INFO [main] 2023-01-31 17:14:13,099 JarManifestVersionProvider.java:57 - ttl-remove version: ttl-remove development build, Build time: unknown, Git commit: unknown
    Exception in thread "main" java.lang.ExceptionInInitializerError
        at org.apache.cassandra.io.sstable.Descriptor.fromFilenameWithComponent(Descriptor.java:301)
        at org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:227)
        at org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:205)
        at com.instaclustr.cassandra.ttl.Cassandra4TTLRemover.executeRemoval(Cassandra4TTLRemover.java:45)
        at com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI.run(TTLRemoverCLI.java:101)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1919)
        at picocli.CommandLine.access$1100(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2332)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2326)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2291)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2159)
        at picocli.CommandLine.execute(CommandLine.java:2058)
        at com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI.execute(TTLRemoverCLI.java:119)
        at com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI.main(TTLRemoverCLI.java:77)
        at com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI.main(TTLRemoverCLI.java:73)
    Caused by: java.lang.NullPointerException
        at org.apache.cassandra.config.DatabaseDescriptor.getNonLocalSystemKeyspacesDataFileLocations(DatabaseDescriptor.java:1920)
        at org.apache.cassandra.db.Directories.(Directories.java:96)
        ... 15 more

smiklosovic commented 1 year ago

Hi @kmrmanish23 ,

why don't you use the run.sh script as it is and modify only what you need?

All you need to change is this:

https://github.com/instaclustr/cassandra-ttl-remover/blob/master/run.sh#L61

and this

https://github.com/instaclustr/cassandra-ttl-remover/blob/master/run.sh#L63

Why do you care what CLASSPATH is set to? That is taken care of automatically based on your CASSANDRA_HOME. You only need to set CASSANDRA_HOME, and CLASSPATH will be derived from it.

I tried the script on a local installation of Cassandra. My Cassandra is in /tmp/c/cassandra and my stripped SSTables will be saved in /tmp/stripped.

So the command looks like this (BUT YOU NEED TO EXECUTE WHOLE run.sh SCRIPT!!!!)

java -Dcassandra.storagedir=$CASSANDRA_HOME/data -Dcassandra.config=file:///$CASSANDRA_HOME/conf/cassandra.yaml \
    -cp "$CLASSPATH./impl/target/ttl-remover.jar:./cassandra-4.1/target/ttl-remover-cassandra-4.1.jar" \
    $JVM_OPTS \
    com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI \
    --cassandra-version=4 \
    --sstables \
    /tmp/c/cassandra/data/data/test/test-cfb929d0a16911eda1626bfeab53946b \
    --output-path \
    /tmp/stripped \
    --cql \
    'CREATE TABLE IF NOT EXISTS test.test (id uuid, name text, surname text, PRIMARY KEY (id)) WITH default_time_to_live = 10;'

Notice --sstables and --output-path. That is all I changed! The run.sh script takes care of the rest based on what your CASSANDRA_HOME is.

I am using Cassandra 4.2-SNAPSHOT (4.1 is compatible with that).

If you use Cassandra 4.0.7, you need to comment out the java command for 4.1 and uncomment this:

https://github.com/instaclustr/cassandra-ttl-remover/blob/master/run.sh#L69-L81

Then you need to change --sstables and --output-path as I showed above.

If you run Cassandra 4, then you need to modify this line

https://github.com/instaclustr/cassandra-ttl-remover/blob/master/run.sh#L72

to reflect Cassandra 4, that is, change the jars from 3 to 4.
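The version-to-jar choice described above can be sketched as a small helper. This is an illustration, not part of run.sh: the `module_for_version` and `tool_jars` functions are hypothetical, and the jar names for Cassandra 2 and 3 are assumed to follow the same pattern as the `cassandra-4` and `cassandra-4.1` jars shown in this thread.

```shell
# Hypothetical helper: map a Cassandra release to the module directory
# whose jar goes on the classpath next to impl/target/ttl-remover.jar.
module_for_version() {
    case "$1" in
        4.1*|4.2*) echo "cassandra-4.1" ;;  # 4.2-SNAPSHOT was run with the 4.1 jar above
        4.0*|4)    echo "cassandra-4" ;;
        3*)        echo "cassandra-3" ;;    # assumed naming
        2*)        echo "cassandra-2" ;;    # assumed naming
        *)         echo "unsupported Cassandra version: $1" >&2; return 1 ;;
    esac
}

# Build the tool's part of the classpath for a given release.
tool_jars() {
    module=$(module_for_version "$1") || return 1
    echo "./impl/target/ttl-remover.jar:./$module/target/ttl-remover-$module.jar"
}
```

For example, `tool_jars 4.0.7` yields the two jars used in the Cassandra 4 variant of the command, relative to the repository root.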

kmrmanish23 commented 1 year ago

Thanks for the update.

For Cassandra 3 and 4, there is no option to set CASSANDRA_HOME inside run.sh; that option exists only for versions 4.1 and 2. If I run the script without setting CLASSPATH, it throws an error asking me to set it.

See the code below from run.sh:

    if [ -z "$CLASSPATH" ]; then
        echo "You must set the CLASSPATH var" >&2
        exit 1
    fi

Let me try this again.

kmrmanish23 commented 1 year ago

I re-installed Cassandra from scratch with version 4.1 and used the same run.sh file.

The script now exits after reading the cassandra.yaml file. Please find the logs below.

    INFO [main] 2023-02-01 11:06:20,430 DatabaseDescriptor.java:460 - DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
    INFO [main] 2023-02-01 11:06:20,431 DatabaseDescriptor.java:514 - Global memtable on-heap threshold is enabled at 1987MiB
    INFO [main] 2023-02-01 11:06:20,431 DatabaseDescriptor.java:518 - Global memtable off-heap threshold is enabled at 1987MiB
    INFO [main] 2023-02-01 11:06:20,431 DatabaseDescriptor.java:585 - Native transport rate-limiting disabled.
    INFO [main] 2023-02-01 11:06:20,596 FBUtilities.java:166 - InetAddress.getLocalHost() was used to resolve listen_address to qolsysjci-Latitude-5530/127.0.1.1, double check this is correct. Please check your node's config and set the listen_address in cassandra.yaml accordingly if applicable.

Please note that there is no error message in the log. It simply exited without creating any files in the "/tmp/stripped" folder.

kmrmanish23 commented 1 year ago

Please note that listen_address is set to localhost in the cassandra.yaml file.

Do I need to change it?

smiklosovic commented 1 year ago

I ran run.sh as described against Cassandra 4.1 and everything went fine; the SSTables were stripped of TTLs.

My modified run.sh is here:

https://gist.github.com/smiklosovic/358bc53ed7de6b6599f5a336228e93e8

CASSANDRA_HOME is set to /tmp/c/cassandra by export CASSANDRA_HOME=/tmp/c/cassandra in the shell.

The script is meant to be run from the repository's root directory.
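A minimal preflight sketch for the two requirements just mentioned: `CASSANDRA_HOME` is exported, and the script is started from the repository root. The `preflight` function is hypothetical and is neither in the gist nor in run.sh; it only looks for `impl/target/ttl-remover.jar`, the relative path used on the classpath earlier in this thread.

```shell
# Hypothetical check to run before invoking run.sh.
preflight() {
    if [ -z "${CASSANDRA_HOME:-}" ]; then
        echo "CASSANDRA_HOME is not set" >&2
        return 1
    fi
    if [ ! -d "$CASSANDRA_HOME" ]; then
        echo "CASSANDRA_HOME does not exist: $CASSANDRA_HOME" >&2
        return 1
    fi
    # Relative jar paths in the java command only resolve from the repo root.
    if [ ! -f ./impl/target/ttl-remover.jar ]; then
        echo "run this from the repository root after building" >&2
        return 1
    fi
}

# Usage:
# export CASSANDRA_HOME=/tmp/c/cassandra
# preflight && sh run.sh
```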

kmrmanish23 commented 1 year ago

Hi Štefan Miklošovič,

Thanks a lot for your guidance. The tool is finally working for me.

Coming to the tool's functionality: I have a single table in Cassandra spread across 10 nodes. So do I have to run this against the SSTables on each of the 10 nodes individually?

Thanks, Manish

smiklosovic commented 1 year ago

:+1: Yes, obviously.