Ericsson / CodeCompass

CodeCompass is a software comprehension tool for large scale software written in C/C++ and Java
https://codecompass.net
GNU General Public License v3.0
521 stars 102 forks source link

Remove .jar files from source code #660

Closed wbqpk3 closed 8 months ago

wbqpk3 commented 1 year ago

The OpenSSF security test (https://github.com/Ericsson/CodeCompass/issues/659) detected several .jar files in the source code which is a security issue.

Full report:

"score": 4.3,
  "checks": [
    {
      "details": [
        "Warn: binary detected: lib/java/httpclient-4.5.6.jar:1",
        "Warn: binary detected: lib/java/httpcore-4.4.10.jar:1",
        "Warn: binary detected: lib/java/javax.annotation-api-1.3.2.jar:1",
        "Warn: binary detected: lib/java/libthrift-0.13.0.jar:1",
        "Warn: binary detected: lib/java/log4j-1.2.17.jar:1",
        "Warn: binary detected: lib/java/slf4j-api-1.7.25.jar:1",
        "Warn: binary detected: lib/java/slf4j-log4j12-1.7.25.jar:1",
        "Warn: binary detected: plugins/search/lib/java/lucene-analyzers-common-4.9.0.jar:1",
        "Warn: binary detected: plugins/search/lib/java/lucene-core-4.9.0.jar:1",
        "Warn: binary detected: plugins/search/lib/java/lucene-highlighter-4.9.0.jar:1",
        "Warn: binary detected: plugins/search/lib/java/lucene-memory-4.9.0.jar:1",
        "Warn: binary detected: plugins/search/lib/java/lucene-misc-4.9.0.jar:1",
        "Warn: binary detected: plugins/search/lib/java/lucene-queries-4.9.0.jar:1",
        "Warn: binary detected: plugins/search/lib/java/lucene-queryparser-4.9.0.jar:1",
        "Warn: binary detected: plugins/search/lib/java/lucene-suggest-4.9.0.jar:1",
        "Warn: binary detected: plugins/search/lib/java/simplemagic-1.6.jar:1"
      ],
      "score": 0,
      "reason": "binaries present in source code",
      "name": "Binary-Artifacts",
      "documentation": {
        "url": "https://github.com/ossf/scorecard/blob/main/docs/checks.md#binary-artifacts",
        "short": "Determines if the project has generated executable (binary) artifacts in the source repository."
      }
    }

Possible solution is to use a dependency management tool like Maven or Gradle.

mcserep commented 1 year ago

The purpose of this guideline seems a bit unclear to me. The referenced website states that including generated executables in the source repository increases user risk, because binary artifacts cannot be reviewed, allowing possible obsolete or maliciously subverted executables, etc.

While using a dependency management tool like Maven or Gradle would indeed hide this report, would our project have better security? I doubt that, as the binaries in the Maven Central Repository are not undergoing a mandatory review either, and basically anyone can upload anything there. So instead of using our own few binary resources (which rarely change), we would use binaries from an outer, most likely not reviewed source. Not better.

So while this guideline could be valid for projects, which have many and frequently chaning binary dependencies; specifically for CodeCompass, fulfilling this guideline promises little to zero extra security benefit. Therefore I wouldn't suggest to include either Maven or Gradle as a build depdendency, just to download a dozen of JAR dependencies, which we basically update once in 1-2 years.

wbqpk3 commented 1 year ago

Security wise this isn't really an issue as long as we check the sources of the .jar files before uploading them to GitHub (e.g. with checksums).

Apart from security, it might not be the best practice to store built binaries in a git source tree. Whenever someone clones the repository, it downloads all the previous versions of the binaries which is simply wasting space. Currently, the libraries are small and rarely change, so it's not really a problem. However, if we introduce new libraries in the future, we should also consider using a dependency management tool. Just like NPM is used for Node.js libraries.

What should also be noted is that these older library versions have some vulnerabilities. See: https://mvnrepository.com/artifact/org.apache.thrift/libthrift/0.13.0 https://mvnrepository.com/artifact/org.apache.httpcomponents/httpcore/4.4.10 https://mvnrepository.com/artifact/org.apache.httpcomponents/httpclient/4.5.6 https://mvnrepository.com/artifact/org.slf4j/slf4j-log4j12/1.7.25 https://mvnrepository.com/artifact/log4j/log4j/1.2.17

mcserep commented 1 year ago

Don't get me wrong, in general I am also against storing dependency binaries in a project repo. But in this case we have 1-2MB of binary files required for Lucene and the search plugin in general. It is not likely we will added anything new, maybe update them. They have 2-3 versions currently. Adding Maven / Gradle to the build dependency just for this simply not worth it, I have discussed this with @zporky a few days ago, and we were in agreement.


What should also be noted is that these older library versions have some vulnerabilities.

This is an interesting point though, in this case having a dependency manager with semantic versioning of the dependency requirements could help us. (Although with hash pinning it would still require manual intervention.)

Anyway, security vulnerabilities should be monitored and the dependencies should be updated accordingly if something relevant is found. It is worth to note, that just because a vulnerability was reported for a dependency, it does not mean automatically, that it affects CodeCompass. E.g. org.apache.httpcomponents/httpcore/4.4.10 has a vulnerability related to JUnit and the automated tests for that project, which will surely not affect CodeCompass.

mcserep commented 8 months ago

I will close this, as we are not planning to work on this - for now. Could be reopened later if needed.