keras-team / keras-core

A multi-backend implementation of the Keras API, with support for TensorFlow, JAX, and PyTorch.
Apache License 2.0
1.27k stars 115 forks source link

Pinned gha #877

Closed pnacht closed 11 months ago

pnacht commented 11 months ago

Fixes https://github.com/keras-team/keras-core/issues/876.

This PR hardens Keras Core's actions.yml workflow by hash-pinning its GitHub Actions. It also sets up dependabot to keep the Actions up-to-date.

kiukchung commented 11 months ago

@pnacht any thoughts on @jaraco's comments (https://github.com/pypa/setuptools/issues/4025#issuecomment-1704339687) (excerpt below)

My concern is that neither dependabot nor the reviewer will have special knowledge about the state of the Action repo when the PR is created. Of course, if the repo is compromised while the dependency is pinned, that will delay the adoption of that tag until the next refresh. If the Action is compromised shortly before the PR is sent, however, dependabot will pin the exploited version (and likely extend the duration of exposure).

Thinking about it statistically, the average expected exposure will be unimproved over the unpinned behavior unless additional investment is put into validating a trusted pin.

Moreover, it introduces an additional attack vector in Dependabot+Github vs. simply trusting Github. Imagine that Dependabot is compromised never to update pins if an exploit has been pinned. Moreover, since an attacker has visibility to dependabot configs, they could easily optimize the timing of their attack to take advantage of the known pinning periods of target projects to amplify their attack.

Seems like a valid point and if so, hash-pinning wouldn't make things "more" secure?

pnacht commented 11 months ago

Ah yes, thanks for bringing that up. This point was also brought up in https://github.com/python/cpython/issues/109110.

My concern is that neither dependabot nor the reviewer will have special knowledge about the state of the Action repo when the PR is created. Of course, if the repo is compromised while the dependency is pinned, that will delay the adoption of that tag until the next refresh. If the Action is compromised shortly before the PR is sent, however, dependabot will pin the exploited version (and likely extend the duration of exposure).

This point is mainly true if the project does not activate dependabot security updates. With security updates, dependabot will immediately send "out-of-season" PRs to fix any dependency with a known vulnerability.

Which reminds me: please activate security updates in Keras if you haven't already, regardless of what's decided here!

  1. Go to Settings > Code security & analysis
  2. Enable "Dependabot Alerts" and "Dependabot Security Updates". Neither of these require a dependabot.yml file, you just have to enable them in the settings.

Now, back to the discussion... due to these "out-of-season" PRs, the window of exposure to the vulnerable version isn't necessarily a month, but just the time it takes to merge that security update PR.

I did not raise this point in that conversation with @jaraco because (even though the discussion is in pypa/setuptools) we were effectively talking about making this change in jaraco/skeleton, a "template" repository used by many Python projects. Making this change in the skeleton could indeed endanger projects that depend on the skeleton but don't activate security updates.

[Moving jaraco's arguments around a bit below...]

Thinking about it statistically, the average expected exposure will be unimproved over the unpinned behavior unless additional investment is put into validating a trusted pin. [...] Moreover, since an attacker has visibility to dependabot configs, they could easily optimize the timing of their attack to take advantage of the known pinning periods of target projects to amplify their attack.

Indeed, if an attacker is able to time their attack to publish a malicious dependency version right before the target projects get their updates, those projects are likely to receive a PR for the malicious version. The projects will then be exposed to that malicious dependency until they receive and merge the security update PR.

But for an attacker to time their attack, they basically need full control of the repository. Or at least sufficient access to add malicious code and create a malicious release at a time of their choosing. Indeed, if this happens, it's basically game over and there really isn't anything a dependent project can do unless they do a security audit before every version update.

However, there are other avenues of attack that don't require taking over a dependency's repository. For example, one might simply submit a PR with a subtle vulnerability (malicious or accidental) that goes undetected by the maintainers. Even if the PR was intentionally malicious, the author has effectively no control over when the vulnerability will impact the target projects since they need to wait until maintainers decide to release a new version.

And even if the attacker can time their attack, Keras may not be their primary target. In which case they may time their attack to maximize the damage on that target instead (I'm sure many crypto projects use actions/checkout and codecov/codecov-action!), not Keras.

It's therefore entirely possible (or even likely, with a monthly update schedule) that the new version will be released long before Keras gets the update. In that time, the vulnerability may be discovered, in which case Keras won't even receive the update PR and therefore won't be exposed at all.

As I see it, hash-pinning really doesn't offer any real benefit in the worst-case situation where a dependency has been taken over by a malicious actor targeting Keras:

However, hash-pinning can have significant benefit in all other cases where the actor doesn't have "full control" of the repository and therefore can't control when the malicious version comes out (or cases that simply involve "innocent" bugs):

Moreover, it introduces an additional attack vector in Dependabot+Github vs. simply trusting Github. Imagine that Dependabot is compromised never to update pins if an exploit has been pinned.

I'll note here that the current attack vector for Keras is currently GitHub+CodeCov+dorny. (Is @dorny a trusted party?)

But yes, this does introduce an additional vector: dependabot (owned by GitHub, but likely managed by a different team than the one responsible for the actions org).

In my opinion, the calculus here comes down to whether you think the risk that dependabot will be compromised exceeds the risk that one of your Actions will have a vulnerability (malicious or accidental).

fchollet commented 11 months ago

Thanks for the detailed analysis. Our conclusion is that we don't need hash pinning at this time.

pnacht commented 11 months ago

Sure thing, thanks for your time! Let me know if you'd like me to set up dependabot to let you know about major-version updates to Actions (for example, actions/checkout has recently released @v4).