scylladb / scylla-tools-java

Apache Cassandra, supplying tools for Scylla
Apache License 2.0
53 stars 85 forks source link

build: take care of old libthrift #353

Closed avelanarius closed 1 year ago

avelanarius commented 1 year ago

scylla-tools-java is a fork of Apache Cassandra 3.x source tree. This codebase has a dependency on libthrift 0.9.2. This is a very old version of that library and security scanners (such as Trivy) report that this dependency is vulnerable to CVE-2018-1320 and CVE-2019-0205 ("HIGH" severity). Therefore there is a need to somehow take care of this dependency.

The earliest version of libthrift that is not flagged by security scanners is libthrift 0.15.0. Unfortunately, there are many breaking changes between libthrift 0.9.2 and 0.15.0. Apache Cassandra 3.x (our upstream) has decided that this issue won't be fixed in their repository:

https://issues.apache.org/jira/browse/CASSANDRA-15424

and as of November 8th, recently released and still maintained Apache Cassandra 3.11.16 still uses an old version of libthrift.

Therefore, our fork has to deal with this problem on our own. In the past we contemplated on deprecating and removing Thrift: https://github.com/scylladb/scylladb/issues/3811, so my primary approach was to delete Thrift support from scylla-tools-java and not waste engineering hours on maintaining Thrift.

Two ways of achieving that goal were REJECTED:

1) Disabling Thrift in the build system. The source code that uses Thrift is not contained to a separate module, so it doesn't look like it can be disabled solely by changing build configuration. 2) Removing Thrift code completely. Since it's not possible to just disable Thrift in the project build system, the code has to be deleted manually. Unfortunately trying to do so resulted in having to modify vast portions of the codebase. More worryingly, many non-trivial changes were required in cassandra-stress codebase.

This patchset takes the third approach: actually updating libthrift dependency to a newer version and intelligently dealing with any compile-time breakage.

The goal of dealing with the breakage is not to actually properly migrate from libthrift 0.9.2 to 0.15.0. For all intents and purposes, Thrift functionality in scylla-tools-java should be considered not working and not supported. This approach should be considered a noninvasive shortcut to approach 2), free from any bugs related to extensively modifying cassandra-stress and Cassandra codebase.

The required code modifications to handle breaking API changes in libthrift 0.15.0 turned out to be relatively small (contained in "[PART 2/2]"). In situations were some code would have to be changed significantly, those places are replaced with stubs ("Thrift is not supported in scylla-tools-java").

The "[PART 1/2]" commit contains changes from re-running a newer version of Thrift compiler (Java source files generator) - Thrift Compiler 0.14.0 instead of 0.9.1, because newer libthrift was not compatible with previously automatically generated code.

The "[PART 2/2]" commit contains manual fixes to all breakage related to migrating from libthrift 0.9.2 to 0.15.0.

After those fixes, scylla-tools-java compiles correctly with libthrift 0.15.0. The Trivy security scanner doesn't report any problems related to this dependency.

nodetool, cassandra-stress tools seem to function correctly and in normal operation they shouldn't use any Thrift functionality.

The "[PART 2/2]" commit is massively smaller and less complex than what it would be if approach 2) was pursued.

Fixes #352

avelanarius commented 1 year ago

To reviewers: this change should be a 1 commit (for bisactability all changes are required), but I split it into two to make the review process easier. The first commit consist mostly of rerunning automatic Thrift file generation, while the second commit contains only manual changes.

If this PR is going to be merged, I will squash those two commits.

avelanarius commented 1 year ago

I'm open to any suggestions how to deal with this libthrift problem in a cleaner way. Maybe building with older libthrift and swapping JAR files with newer libthrift while packaging, or maybe it will just work without that libthrift JAR (I doubt that)?

nyh commented 1 year ago

I'm open to any suggestions how to deal with this libthrift problem in a cleaner way. Maybe building with older libthrift and swapping JAR files with newer libthrift while packaging, or maybe it will just work without that libthrift JAR (I doubt that)?

There's something I don't understand here... Yes, "scylla-tools-java" is a a complete copy of Cassandra. But - we don't actually use this code to run Cassandra. In particular, unless I'm missing something, we don't need Thrift at all (not just not a specific version - at all!), because Cassandra only uses Thrift when it runs the server, which we don't. Thrift isn't needed (again, IIRC) when reading sstables, providing nodetool or JMX or any of the other things we do with that Java code.

So I think we can remove the thrift JAR - and probably other JARs as well - entirely.

If I'm wrong or misunderstanding something here, please tell me what am I missing.

nyh commented 1 year ago

There's something I don't understand here... Yes, "scylla-tools-java" is a a complete copy of Cassandra. But - we don't actually use this code to run Cassandra. In particular, unless I'm missing something, we don't need Thrift at all (not just not a specific version - at all!), because Cassandra only uses Thrift when it runs the server, which we don't. Thrift isn't needed (again, IIRC) when reading sstables, providing nodetool or JMX or any of the other things we do with that Java code.

So I think we can remove the thrift JAR - and probably other JARs as well - entirely.

Having read your commit message again, I now understand that you actually agree with me, and initially tried to make the build system not compile anything which depends on Thrift, but ran into big difficulties.

I think in the long run, this (only compiling what we need) is what we'll need to do if we keep having this "scylla-tools-java" code (of course, the even better approach is to drop scylla-tools-java completely and do everything in C++, I think this is @denesb 's effort and I think it's correct) . We'll need to only build the small pieces of Cassandra that we really need for processing sstables and whatever we still need scylla-tools-java for - not compiling the vast majority of Cassandra at all.

I agree that if this is too difficult because of ugly inter-dependencies in Cassandra's code, or something, we can do what you did as a quick workaround. But I think eventually, as the Cassandra stuff gets even older and older, we'll need to get rid of most of the stuff we don't need so we don't need to maintain all that gunk.

avelanarius commented 1 year ago

I'm open to any suggestions how to deal with this libthrift problem in a cleaner way. Maybe building with older libthrift and swapping JAR files with newer libthrift while packaging, or maybe it will just work without that libthrift JAR (I doubt that)?

There's something I don't understand here... Yes, "scylla-tools-java" is a a complete copy of Cassandra. But - we don't actually use this code to run Cassandra. In particular, unless I'm missing something, we don't need Thrift at all (not just not a specific version - at all!), because Cassandra only uses Thrift when it runs the server, which we don't. Thrift isn't needed (again, IIRC) when reading sstables, providing nodetool or JMX or any of the other things we do with that Java code.

So I think we can remove the thrift JAR - and probably other JARs as well - entirely.

In case of nodetool, missing libthrift JAR on a JVM classpath doesn't cause any problems (several nodetool commands I tested still work), but cassandra-stress fails to start:

$ ./tools/bin/cassandra-stress
java.lang.NoClassDefFoundError: org/apache/thrift/TBase
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:473)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    at org.apache.cassandra.stress.Stress.run(Stress.java:80)
    at org.apache.cassandra.stress.Stress.main(Stress.java:62)
Caused by: java.lang.ClassNotFoundException: org.apache.thrift.TBase
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    ... 14 more

I'll look if this is something that could be easily worked around, but generally there's a lot of cassandra-stress files that have a logic for both CQL and Thrift in the same file (so Thrift is needed on the classpath to load the class, even if that code won't get executed in CQL mode).

Note that iin this experiment I removed the libthrift JAR manually after the build process (it's required to build the project).

mykaul commented 1 year ago

This patchset takes the third approach: actually updating libthrift dependency to a newer version and intelligently dealing with any compile-time breakage.

You are braver than I am - I would have removed support altogether.

avelanarius commented 1 year ago

For reference, this is the commit that removed Thrift from Cassandra 4.x: https://github.com/apache/cassandra/commit/4881d9c308ccd6b5ca70925bf6ebedb70e7705fc

Changes in tools/stress directory accurately show what kind of changes would have been required in our fork's cassandra-stress alone.

I doubt that this commit (even parts of it) would cherry pick cleanly, as this commit is from Cassandra 4.x branch.

mykaul commented 1 year ago

For reference, this is the commit that removed Thrift from Cassandra 4.x: apache/cassandra@4881d9c

Changes in tools/stress directory accurately show what kind of changes would have been required in our fork's cassandra-stress alone.

I doubt that this commit (even parts of it) would cherry pick cleanly, as this commit is from Cassandra 4.x branch.

I think it's worth a shot - seem purely mechanical - mostly deleting files, deleting comments, then few changes in reality. I'd give it few hours to try to get it in. I'm much more concerned that our fork is not really up-to-date with upstream, so it's not feasible to even try to resolve conflicts here?

avelanarius commented 1 year ago

I'm much more concerned that our fork is not really up-to-date with upstream, so it's not feasible to even try to resolve conflicts here?

We are relatively up-to-date with 3.11 branch: https://github.com/scylladb/scylla-tools-java/commit/de8289690e12ba3b96168dfb201d01380a9292c3, but not 4.x.

avelanarius commented 1 year ago

I think it's worth a shot - seem purely mechanical - mostly deleting files, deleting comments, then few changes in reality. I'd give it few hours to try to get it in.

This was actually the first approach that I tried, but I was worried that even if I complete it, the resulting PR will be so large and complex that no reviewer will accept it. Cherry picking apache/cassandra@4881d9c results in 104 merge conflicts, so I couldn't present that PR as simply cherry picking commit from Cassandra.

avelanarius commented 1 year ago

But still, I'll try a few approaches (including that apache/cassandra@4881d9c approach)

avelanarius commented 1 year ago

Scylla CI passed with scylla-tools-java that contains this PR: https://github.com/scylladb/scylladb/pull/16009#issuecomment-1804466805

roydahan commented 1 year ago

@scylladb/scylla-tools-maint please review. We need this PR for building new docker without security vulnerabilities

denesb commented 1 year ago

I think in the long run, this (only compiling what we need) is what we'll need to do if we keep having this "scylla-tools-java" code (of course, the even better approach is to drop scylla-tools-java completely and do everything in C++, I think this is @denesb 's effort and I think it's correct) . We'll need to only build the small pieces of Cassandra that we really need for processing sstables and whatever we still need scylla-tools-java for - not compiling the vast majority of Cassandra at all.

We currently use scylla-tools-java for the following tools:

So we are on the way to completely deprecate this repository and all the tools it contains. Therefore we should make the minimum effort only, to keep it afloat.

denesb commented 1 year ago

@avelanarius please rebase on master, there are conflicts in build.xml.

mykaul commented 1 year ago

So we are on the way to completely deprecate this repository and all the tools it contains. Therefore we should make the minimum effort only, to keep it afloat.

Agreed - but these changes are needed for LTS releases that require update of (supposedly) vulnerable deps.

denesb commented 1 year ago

Agreed - but these changes are needed for LTS releases that require update of (supposedly) vulnerable deps.

Yes, I did not refer to this PR. I think this PR is indeed one we need.

avelanarius commented 1 year ago

Rebased on top of latest master.

denesb commented 1 year ago

Submodule update: scylladb/scylladb@323e34e1ed40a8c41a1194c817ec13e38123d1d3

roydahan commented 11 months ago

@denesb looks like this was missed from your backports, we need it to reach to 2023.1

denesb commented 11 months ago

Backports PRs: