scylladb / scylla-tools-java

Apache Cassandra, supplying tools for Scylla
Apache License 2.0

Separate cassandra-stress to its own build, package, container #370

Closed mykaul closed 3 months ago

mykaul commented 10 months ago

As we depart from Java-based tooling, slowly but surely, we do not need to keep carrying this Java overhead with us everywhere (a specific example: cassandra-stress is not used in a Dockerized Scylla, so why have Java there?).

We should split it off and if we wish to, provide it separately.

CC @yaronkaikov

avikivity commented 10 months ago

We do. It could have its own release schedule too (more or less when we want to update the driver)

fruch commented 9 months ago

FYI,

the dockerfile we are using in SCT, when we need a specific version of c-s: https://github.com/scylladb/scylla-cluster-tests/blob/master/docker/cassandra-stress/Dockerfile-src

mykaul commented 8 months ago

CC @roydahan for awareness.

roydahan commented 8 months ago

I'm aware of it :) It's in the current sprint plan.

The question is whether it's the last part needed to take Java out of scylla-tools, or whether we need to prioritize other tasks as well.

mykaul commented 8 months ago

We did not complete yet the move to the native tool. Therefore, we cannot remove Java just yet. But we can certainly begin by moving away c-s to its own RPM, not packaged by the rest of the scylla tools RPM, and not installed by default (which was always odd anyway)

fruch commented 8 months ago

> We did not complete yet the move to the native tool. Therefore, we cannot remove Java just yet. But we can certainly begin by moving away c-s to its own RPM, not packaged by the rest of the scylla tools RPM, and not installed by default (which was always odd anyway)

Do we want to move c-s to its own repo, like we did with cqlsh? Or leave it in place and just split the packaging so that it's a separate package?

mykaul commented 8 months ago

> > We did not complete yet the move to the native tool. Therefore, we cannot remove Java just yet. But we can certainly begin by moving away c-s to its own RPM, not packaged by the rest of the scylla tools RPM, and not installed by default (which was always odd anyway)

> Do we want to move c-s to its own repo, like we did with cqlsh? Or leave it in place and just split the packaging so that it's a separate package?

Whatever is easier. I don't see the value in moving to its own repo, but perhaps there could be reasons for it.

roydahan commented 8 months ago

See the mail with the subject "Retiring tools/java and tools/jmx". c-s can remain in scylla-tools-java, but we need to take care of building it once it's separated from scylla, and add documentation on how to install and run it.

syuu1228 commented 8 months ago

Well, it's difficult to separate cassandra-stress from the Java-based tools. (The Java-based tools include: nodetool-java, sstabledump, sstablelevelreset, sstableloader, sstablemetadata, sstablerepairedset.)

Because all Java-based commands implemented by upstream (Cassandra) are intended to ship as a single package, all commands are invoked through a single shell script (cassandra.in.sh), all .jar files are stored in a single directory, and the shell script loads all of them into CLASSPATH when executing a command. cassandra-stress also has a .jar dependency on apache-cassandra.jar, so it seems difficult to split it out completely into a different package.
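To illustrate the coupling described above, here is a minimal Python sketch of a launcher that puts every .jar from one shared directory onto the classpath. It is not the actual cassandra.in.sh logic (which is a shell script); the function name `build_classpath` and the directory passed to it are assumptions made for illustration only.

```python
import os
import sys
from pathlib import Path


def build_classpath(lib_dir: str) -> str:
    """Join every .jar found directly in lib_dir into one classpath string.

    Hypothetical helper: it mimics the idea of "load all jars in one
    directory", not the real upstream script.
    """
    jars = sorted(str(p) for p in Path(lib_dir).glob("*.jar"))
    return os.pathsep.join(jars)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python classpath_sketch.py /some/lib/dir
    print(build_classpath(sys.argv[1]))
```

Since a launcher of this kind cannot tell which jar belongs to which tool, removing one tool's jars from the shared directory can break another tool that depends on them.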

If we do need to ship cassandra-stress as a separate package, we would probably have to duplicate the files it needs (*.jar, cassandra.in.sh, logback.xml, etc.) and copy them into a new directory, something like /opt/scylladb/cassandra-stress. That way it is not affected even if we drop all Java-based code from the scylla-tools package.

But I guess it's much simpler to split the packaging into the new native tools and the old Java-based tools (which include cassandra-stress).

syuu1228 commented 8 months ago

In my previous post, I described the difficulty of separating cassandra-stress while we still keep the old Java-based tools. But if we can drop all Java-based tools and switch to the native tools now, we can build a simple cassandra-stress-only package from the existing scylla-tools package, which will be easy.

syuu1228 commented 8 months ago

Implemented draft code for "cassandra-stress only" packaging: https://github.com/syuu1228/scylla-tools-java/tree/packaging_just_for_cassandra_stress

It will replace scylla-tools with scylla-cassandra-stress and scylla-tools-core with scylla-cassandra-stress-core, and it contains only the cassandra-stress and cassandra-stressd commands. (The -core subpackage is kept only for compatibility; without it we cannot upgrade packages.)

roydahan commented 8 months ago

According to @mykaul, we can drop the old Java tools from packaging. If we need them, we can take them from previous releases.

mykaul commented 8 months ago

@denesb - please confirm my (optimistic) assessment that we can indeed drop all java based tooling.

syuu1228 commented 8 months ago

Implemented draft code for scylla core repo part: https://github.com/syuu1228/scylla/tree/drop_tools_and_jmx

It drops scylla-tools and scylla-jmx entirely. Also it imports nodetool-wrapper to scylla-server package since we drop scylla-tools.

denesb commented 8 months ago

> @denesb - please confirm my (optimistic) assessment that we can indeed drop all java based tooling.

> Also it imports nodetool-wrapper to scylla-server package since we drop scylla-tools.

This script will be removed soon, once https://github.com/scylladb/scylladb/pull/17168 goes in. Instead, there will be a nodetool script in scylla.git, which simply does exec scylla nodetool $@.
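For readers unfamiliar with this kind of wrapper, the following Python sketch shows the same idea: forward every argument unchanged to `scylla nodetool` and replace the current process. It is only an illustration, not the script from the linked pull request, and the function names are hypothetical. As a side note, in a shell script the quoted form `"$@"` keeps arguments that contain spaces intact, while an unquoted `$@` is split on whitespace.

```python
import os


def nodetool_argv(args):
    """Build the argv for `scylla nodetool`, passing each argument through as-is."""
    return ["scylla", "nodetool", *args]


def exec_nodetool(args):
    """Replace the current process with `scylla nodetool <args>`.

    Like `exec` in a shell, this does not return on success. It assumes a
    `scylla` binary is available on PATH.
    """
    argv = nodetool_argv(args)
    os.execvp(argv[0], argv)
```

A script entry point would call `exec_nodetool(sys.argv[1:])`. Because the arguments travel as a list, no re-quoting is needed along the way.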

denesb commented 8 months ago

> @denesb - please confirm my (optimistic) assessment that we can indeed drop all java based tooling.

Yes. All the important tools (sstable tools, nodetool) have native equivalents. sstableloader is replaced by nodetool refresh -las. There are some minor tools that have no replacement, but these tools are not used (e.g. sstablereset). See https://github.com/scylladb/scylladb/issues/14856

We will have to check dtest for any remaining usage of the Java tools. The best way to find this out is to patch them out of ccm, then run the full dtests and see what breaks.

@tchaikov do you remember how dtest is doing with ditching the Java tools?
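Patching the tools out of ccm and running dtest, as described above, is the authoritative check. As a cheaper first pass, a plain text search can list candidate call sites. The sketch below uses that different, static technique; the function name, the `.py` suffix filter, and the choice of tool names (the sstable tools listed earlier in this thread) are assumptions. A text search can report false positives and will miss command names that are built dynamically.

```python
import os

# sstable tool names taken from the list earlier in this thread.
JAVA_TOOLS = (
    "sstabledump",
    "sstablelevelreset",
    "sstableloader",
    "sstablemetadata",
    "sstablerepairedset",
)


def find_tool_references(root, tools=JAVA_TOOLS, suffix=".py"):
    """Return (path, line_number, tool) for each line that mentions a tool name."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                for lineno, line in enumerate(fh, start=1):
                    for tool in tools:
                        if tool in line:
                            hits.append((path, lineno, tool))
    return hits
```

Calling `find_tool_references("path/to/checkout")` on a local checkout returns a list that can be reviewed by hand before the full test run.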

tchaikov commented 8 months ago

@denesb it's tracked by https://github.com/scylladb/scylla-dtest/issues/3350 which is actually a subtask of https://github.com/scylladb/scylladb/issues/14856 . the only blocker i can see is the 2nd item in https://github.com/scylladb/scylla-dtest/issues/3489

syuu1228 commented 8 months ago

Sent PR at: https://github.com/scylladb/scylla-tools-java/pull/384

syuu1228 commented 8 months ago

Sent scylla-core part as DRAFT PR at: https://github.com/scylladb/scylladb/pull/17969

roydahan commented 4 months ago

@mykaul / @fruch (who is possibly going to own cassandra-stress), the question here is: do we want to separate c-s into its own repo and packaging, or can we keep it with the tools-java packaging? Personally, I don't see a point in releasing c-s on its own, and we can keep it as part of tools-java.

Right now the direction in this PR is wrong IMO: https://github.com/scylladb/scylla-tools-java/pull/384

mykaul commented 4 months ago

There is no tools-java in the future. We deprecate all of them. Cassandra-stress remains by itself, and I don't wish to install it by default. And if we don't have a reason to do so, I don't wish to release it either.

roydahan commented 4 months ago

The missing key word here is "will" in "we will deprecate them". Right now, AFAIU from @avikivity's documents, the plan is to keep releasing scylla-tools-java on its own.

So the packaging change in the PR suggested above is not good. What I suggest is:

  1. Discard the above PR, take care of independent release and packaging of scylla-tools-java (while c-s is part of it).
  2. Long term, separate only the code of c-s to its own repository and build & release it separately (should be done by @fruch team).

mykaul commented 4 months ago

I don't see why in 6.1 or 2024.2 we'll continue to release anything Java related.

roydahan commented 4 months ago

Probably as a backup, in case some tools don't exist yet or don't work as expected. It doesn't matter; we can trigger the packaging and releasing on demand if needed.

roydahan commented 4 months ago

Ok, so after discussing with @fruch, we think that the easiest way would be to do as follows:

  1. @fruch and his team, who are responsible for tools, will take only the relevant code of c-s out of the scylla-tools-java repo and will take care of building, packaging & releasing it.
  2. @yaronkaikov & @syuu1228, scratch the suggested PR that changes the name, and complete the separation of scylla-tools-java from all places (including building the core with it and releasing its packages as part of the core build).

mykaul commented 4 months ago

We do 'support' c-s, btw, and even document it - see https://opensource.docs.scylladb.com/stable/operating-scylla/admin-tools/cassandra-stress.html for example.

roydahan commented 4 months ago

> We do 'support' c-s, btw, and even document it - see https://opensource.docs.scylladb.com/stable/operating-scylla/admin-tools/cassandra-stress.html for example.

It won't change. It will just move to its own repo, with its own code only and the bare minimum of Java dependencies, so that later we will be able to easily get rid of the rest of the things we currently have in scylla-tools-java.

fruch commented 4 months ago

We still need to remove the tools repo, out of core, nowadays in master:

tarzanek commented 2 months ago

just for ref. new repo hosts the tools from 6.0?, definitely from 6.1 https://github.com/scylladb/cassandra-stress/blob/master/Dockerfile

fruch commented 2 months ago

> just for ref. new repo hosts the tools from 6.0?, definitely from 6.1 https://github.com/scylladb/cassandra-stress/blob/master/Dockerfile

It's from a few weeks back, and it's not tied to a release of scylla anymore.

It wasn't yet removed from this repo, and this code is still part of scylla packages.

And docs are not yet updated to point to the new repo / docker.

@tarzanek I understand you used the new docker. Any feedback would help; for example, do you need it in other packaging (i.e. deb/rpm, which we don't yet have for the new repo)?