Enable OpenSSF Scorecard Github Action

joycebrum commented 2 years ago

Hello, I am Joyce and I'm working on behalf of Google and the Open Source Security Foundation to help essential open-source projects improve their supply-chain security. Given impact that ClickHouse has on many companies of all sizes, the OpenSSF has identified it as one of the 100 most critical open source projects.

Describe the solution you'd like

Would you consider adopting an OpenSSF tool called Scorecards? Scorecards runs dozens of automated security checks to help maintainers better understand their project's supply-chain security posture. It is developed by the OpenSSF, in partnership with GitHub.

To simplify maintainers' lives, the OpenSSF has also developed the Scorecard GitHub Action. It is very lightweight and runs on every change to the repository's main branch. The results of its checks are available on to the project's security dashboard, and include suggestions on how to solve any issues (see examples in the Additional context). The Action does not run or interact with any workflows, but merely parses them to identify possible vulnerabilities. This Action has been adopted by 1800+ projects already.

Considering how critical is the role ClickHouse has on most projects, improving the security of the project following the tips and insights from Scorecard could be a good way to improve the overall security and guarantee that the ClickHouse Github Project is mostly secure from malicious sabotage.

Would you be interested in a PR which adds this Action? Optionally, it can also publish your results to the OpenSSF REST API, which allows a badge with the project's score to be added to its README.

Any doubts or concerns please share them with me.

Additional context

Code scanning dashboard with multiple alerts, including Code-Review and Token-Permissions

Detail of a Token-Permissions alert, indicating the specific file and remediation steps

alexey-milovidov commented 2 years ago

@joycebrum I've read the description and have some concerns about the usage.

For example,

Binary-Artifacts

We use binary artifacts for some tests (examples of some binary data formats) and as part of the fuzzing corpus.

Fuzzing

Does the project use fuzzing tools, e.g. OSS-Fuzz?

We use several custom fuzzers and libfuzzer, but not (yet) OSS-Fuzz.

Pinned-Dependencies

We have even more strict practice - all the dependencies are built from the source code. But our repository also contains an enormous amount of tests with Dockerfiles, where dependencies are not pinned.

I'm not sure if the OpenSSF Scorecard tool will be able to correctly get this information.

Also please check our principles for C++ code:

don't use source codes or libraries from OS or repositories;
build everything from source code;
use fixed versions of the libraries, no automatic updates;
all source code of third-party dependencies is checked out;
all the code can be built with sanitizers and hardening;
no known memory safety issues exist;
reproducible (byte-identical), hermetic builds;
all the code can be cross-compiled;
minimize the supply chain;
minimize runtime dependencies on the files on the target machine;
single-binary distribution;
static linking;
no foreign code or plugins can run in the program address space;

joycebrum commented 2 years ago

Hi @alexey-milovidov, thanks for sharing or concerns, they are very relevant questions

Binary-Artifacts

We use binary artifacts for some tests (examples of some binary data formats) and as part of the fuzzing corpus.

About the Binary Artifacts, it is a known issue that sometimes it is interesting for the maintainers to set up a ignore list to the Binary-Artifacts and the Scorecard Team is already working on to provide this funcionality (https://github.com/ossf/scorecard/issues/1270). Although you can't set the Binary-Artifacts check to disconsider your binaries that are for test only, your project still score extremely high in the Binary-Artifacts check, which means that, until the feature of the ignoring list be released, the binaries won't be a huge problem on the risk tracking at the security dashboard neither to your overall score.

Fuzzing

Does the project use fuzzing tools, e.g. OSS-Fuzz?

We use several custom fuzzers and libfuzzer, but not (yet) OSS-Fuzz.

I've checked here running the Scorecard to the project itself and it correctly idenfied that your project is using fuzzing tools (the Score for this check was 10/10!)

Pinned-Dependencies

We have even more strict practice - all the dependencies are built from the source code. But our repository also contains an enormous amount of tests with Dockerfiles, where dependencies are not pinned.

I'm not sure if the OpenSSF Scorecard tool will be able to correctly get this information.

I'm still looking into a better understanding of this point regarding the specific characteristics of your project. Just for helping me in my analisys, I have a question: would it be possible to update the docker dependencies to be hash-pinned or it would depends on the docker project?

alexey-milovidov commented 2 years ago

Actually, I want to make all Docker dependencies be pinned, including those that are used for tests :)

azeemshaikh38 commented 2 years ago

Hi @alexey-milovidov, Scorecard maintainer here.

We use binary artifacts for some tests (examples of some binary data formats) and as part of the fuzzing corpus.

Like @joycebrum mentioned, we are looking into allowing maintainers to ignore this check for specific files. Good news is as of now, Scorecard only complains about tests/integration/test_catboost_evaluate/model/libcatboostmodel.so in this repository. So shouldn't cause too much pain.

We use several custom fuzzers and libfuzzer, but not (yet) OSS-Fuzz.

Looks like you are already using OSS-Fuzz? - https://github.com/google/oss-fuzz/blob/master/projects/clickhouse/project.yaml

all the dependencies are built from the source code.

This is interesting and I would love to understand this more to help us better our tooling. (1) when you pull in these submodules are you pulling them in from HEAD? (2) I can imagine us adding an exception to such submodules since these are "vendored" dependencies and maintainer does not really have control over these (see discussion). Is that the expected behavior you are looking for?

Actually, I want to make all Docker dependencies be pinned, including those that are used for tests :)

That's great! Scorecard is able to find these unpinned dependencies in tests and suggest a remediation for you :)

santrancisco commented 2 years ago

Hi all, I quickly tested this with a test github account personal access token.. i noticed in the doc that it needs read/write access to repo/public ... that itself is a security concern and I'm not too sure if the information here adds much value.. many of these checks can be done quickly via a few clicks in ui. For vulnerabilities, as per @alexey's comment, we have our own fuzzing, testing done, a bug bounty program and recent pentest. As of right now I see little value vs risk running this on every commit.

CleanShot 2022-09-21 at 08 30 25

regarding repo/public_repo access that mentioned in the doc :

CleanShot 2022-09-21 at 10 27 04

santrancisco commented 2 years ago

also, plz dont get me wrong, what you are doing is great and we can still leverage it for spot check every now and then with short live access token to make sure our settings are inline with industry best pratices 😉 I will play with it a little more and see if we can/should adopt this.

azeemshaikh38 commented 2 years ago

@santrancisco yes, that's true if you run Scorecard locally. The GitHub Action that this issue is proposing can run using GitHub's short-lived workflow tokens. See example.

santrancisco commented 2 years ago

Thanks for clarifying @azeemshaikh38 :)

ClickHouse / ClickHouse