slsa-framework / slsa

Supply-chain Levels for Software Artifacts
https://slsa.dev

Unclear difference between dependency and package threats #1039

Open NVolcz opened 3 months ago

NVolcz commented 3 months ago

The difference between "(D): Use of compromised dependency" and "(H): Use of compromised package" is unclear. There is no explanation of what makes using a package as a dependency different from using it in any other way.

I suggest that we either clarify the difference or merge them into one threat.

https://slsa.dev/spec/v1.0/threats-overview

Below are some parts out of a conversation on Slack:

arewm:

The presence of signed provenance, trust in the signing entity, and the provenance's reference to a specific package (i.e. by immutable reference) can all be used, either during consumption of the generated package or during consumption of dependencies for the build.
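
A minimal sketch of the "immutable reference" part of this check, assuming the provenance is an in-toto style statement whose `subject` entries carry a `sha256` digest. The helper name `provenance_covers_artifact` and the sample data are hypothetical, and signature verification and trust in the signer are deliberately left out. The same comparison applies whether the artifact is the package being consumed or a dependency pulled in during a build.

```python
import hashlib


def provenance_covers_artifact(statement: dict, artifact: bytes) -> bool:
    """Return True if any subject in the statement pins this exact artifact.

    Hypothetical helper: it only compares digests. Verifying the signature
    and deciding whether to trust the signer must happen before this step.
    """
    actual = hashlib.sha256(artifact).hexdigest()
    return any(
        subject.get("digest", {}).get("sha256") == actual
        for subject in statement.get("subject", [])
    )


artifact = b"example package contents"
statement = {
    "subject": [
        {
            "name": "example-pkg",  # illustrative name only
            "digest": {"sha256": hashlib.sha256(artifact).hexdigest()},
        }
    ]
}

print(provenance_covers_artifact(statement, artifact))         # True
print(provenance_covers_artifact(statement, artifact + b"!"))  # False
```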

nvolcz:

Thanks! That makes sense! It sounds like it is the same threat but mitigated at different times in the process

arewm:

but mitigated at different times in the process

Yes, exactly. The threat model has to have a specific reference perspective. It is presented from the lifecycle of a single package/artifact including its dependencies and its consumers. The same rules can apply to both, but they are different from the perspective of a given artifact. (H) would always exist unless you generate a package/artifact which is never used. (H) for one artifact is (D) for its consumer.

MarkLodato commented 2 months ago

Agreed on the confusion. Thanks for pointing it out. We should address this.

(D) is really all of the other threats (A)-(H) recursively, from a different frame of reference. It is saying that a given package could be compromised by doing (A)-(H) from one of its dependencies.

Maybe an example would help. Consider event-stream, which was compromised through its dependency flatmap-stream. The attacker snuck code into the NPM release of flatmap-stream that was not present in the source on GitHub. So that's (F) for flatmap-stream and (D) for event-stream. Does that help?
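
The frame-of-reference idea can be sketched in a few lines. This is only an illustration, assuming an acyclic dependency graph; the `Package` class and the `threats_seen_by` function are hypothetical names, not part of the SLSA spec.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Package:
    name: str
    # Threats that hit this package's own source, build, or distribution,
    # labelled with the letters from the threats overview (e.g. "F").
    own_threats: set[str] = field(default_factory=set)
    dependencies: list[Package] = field(default_factory=list)


def threats_seen_by(pkg: Package) -> set[str]:
    """Threat letters from pkg's own frame of reference.

    Any threat affecting a dependency, directly or transitively,
    shows up here as (D). Assumes the dependency graph has no cycles.
    """
    seen = set(pkg.own_threats)
    if any(threats_seen_by(dep) for dep in pkg.dependencies):
        seen.add("D")
    return seen


flatmap_stream = Package("flatmap-stream", own_threats={"F"})
event_stream = Package("event-stream", dependencies=[flatmap_stream])

print(threats_seen_by(flatmap_stream))  # {'F'}
print(threats_seen_by(event_stream))    # {'D'}
```

An application that depends on event-stream would also see (D), since the check recurses through the whole dependency tree.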

MarkLodato commented 2 months ago

I also agree that the name of (H) could be improved. We originally said "Trick user into using bad package", which sounded limited to typosquatting, but we really also intend things like modification in transit (e.g. MitM). The current name "Use compromised package" sounds recursive like (D), but that is not the intention.

laurentsimon commented 2 months ago

I also think some of the build threats are not categorized where they should be. For example, imo F and G should be under a "publication threat" rather than build threat.

unrelated: Why do we have B about compromising a source repo, but we don't have the equivalent to compromising a builder?

MarkLodato commented 2 months ago

For example, imo F and G should be under a "publication threat" rather than build threat.

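I don't think so. The idea is that we have three clusters of threats:

* "Build threats" = the package that you receive does not match the source of truth
* "Source threats" = the source of truth contains unauthorized code
* "Dependency threats" = a dependency is compromised, which in turn compromises this one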

Package threats are not really distinct from build threats. The outcome and solutions are largely the same, so it's helpful to bundle them together.

(I think we should do a better job of explaining this though.)

Why do we have B about compromising a source repo, but we don't have the equivalent to compromising a builder?

We do, see "Compromise build platform admin" under (E).

MarkLodato commented 2 months ago

I just sent out #1046. Please take a look to see if that's on the right track. I don't think it fully addresses this issue since it doesn't touch (H), but I wanted to keep the PR a manageable size.

laurentsimon commented 2 months ago

For example, imo F and G should be under a "publication threat" rather than build threat.

I don't think so. The idea is that we have three clusters of threats:

  • "Build threats" = the package that you receive does not match the source of truth
  • "Source threats" = the source of truth contains unauthorized code
  • "Dependency threats" = a dependency is compromised, which in turn compromises this one

Package threats are not really distinct from build threats. The outcome and solutions are largely the same, so it's helpful to bundle them together.

Not sure I agree with this. The process of publication can change a package, but an artifact is not a package until it's published. The process of publishing binds an artifact to a package URI. A build is just a blob of data with no meaning until it's bound to a package. The binding to me refers to some sort of policy / expectation verification which is somewhat unrelated to the build process itself. There is so much more to publication than build, including delegation of this verification policy (VSA), policy definition, etc...
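
To make the distinction concrete, here is a rough sketch of publication as a separate step that binds an artifact digest to a package URI only after a policy check. Everything in it is hypothetical and not taken from the SLSA spec: the `Registry` class, its `publish` method, the shape of the policy (package URI mapped to an expected builder ID), and the example URIs. In practice the builder ID would come from verified provenance rather than a plain argument.

```python
import hashlib


class Registry:
    """Hypothetical registry: maps a package URI to the digest published under it."""

    def __init__(self, policy: dict[str, str]):
        # policy: package URI -> builder ID expected to produce it (assumed shape)
        self.policy = policy
        self.bindings: dict[str, str] = {}

    def publish(self, package_uri: str, artifact: bytes, builder_id: str) -> str:
        """Bind the artifact to package_uri if the builder matches the policy."""
        if self.policy.get(package_uri) != builder_id:
            raise PermissionError(f"{builder_id!r} may not publish {package_uri!r}")
        digest = hashlib.sha256(artifact).hexdigest()
        self.bindings[package_uri] = digest
        return digest


registry = Registry({"pkg:example/foo": "https://builder.example/trusted"})
blob = b"build output"  # just bytes until it is bound to a package URI

registry.publish("pkg:example/foo", blob, "https://builder.example/trusted")
print("pkg:example/foo" in registry.bindings)  # True
```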

I would find it more intuitive if source threats were addressed by source track, build threats by build track, etc

(I think we should do a better job of explaining this though.)

Why do we have B about compromising a source repo, but we don't have the equivalent to compromising a builder?

We do, see "Compromise build platform admin" under (E).

+1, overlooked this.

NVolcz commented 2 months ago

I don't think so. The idea is that we have three clusters of threats:

* "Build threats" = the package that you receive does not match the source of truth
* "Source threats" = the source of truth contains unauthorized code
* "Dependency threats" = a dependency is compromised, which in turn compromises this one

Your description of the threat clusters is clear and helpful. However, I believe we may need to clarify how to categorize threats that fall at the edges between these clusters.

For instance, consider the threat (C) "Build from modified source." This scenario could be interpreted as a build threat since the package received does not match the source of truth. However, it's not necessarily that the source of truth itself is compromised. So, the question arises: is it a source or a build threat?

Similarly, threats (H) and (D) are positioned at the edge between the group of build threats and consumption of packages. When the download of a package is intercepted, is it a build threat or is it a problem for the consumer of the package?

I believe addressing these edge cases will provide a clearer understanding of how to categorize threats within our framework.

MarkLodato commented 2 months ago

There is so much more to publication than build, including delegation of this verification policy (VSA), policy definition, etc...

Yes, all of this is part of the "build track" and "build threats". Perhaps it's just a naming issue. Would "binary provenance threats" or "build & packaging threats" or "publication threats" resonate better?

I would find it more intuitive if source threats were addressed by source track, build threats by build track, etc

That is the intent. All of these threats are intended to be solved by the build track.

The main challenge is the verification story. Right now the "requirements" for build track are just about publishing provenance and we have this awkward separate page about verification. I want that to all be put together so that the entire track is about end-to-end guarantees. But that's a larger issue.

For instance, consider the threat (C) "Build from modified source." This scenario could be interpreted as a build threat since the package received does not match the source of truth. However, it's not necessarily that the source of truth itself is compromised. So, the question arises: is it a source or a build threat?

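Yes, I completely agree! (C) can go either way:

* We originally had it as part of "build threats" for the reason you said (matching source of truth).

* For v1.0 ([Update threats.md for v1.0 #732](https://github.com/slsa-framework/slsa/pull/732)), we moved it to "source threats" to refocus "build threats" to just be about reporting accurate provenance, and "source threats" kind of being anything about source. But I agree that it's awkward and I don't love it.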

I believe addressing these edge cases will provide a clearer understanding of how to categorize threats within our framework.

👍👍👍

kpk47 commented 2 months ago

There is so much more to publication than build, including delegation of this verification policy (VSA), policy definition, etc...

Yes, all of this is part of the "build track" and "build threats". Perhaps it's just a naming issue. Would "binary provenance threats" or "build & packaging threats" or "publication threats" resonate better?

This comment feels related to #1041. I don't have a good solution since more precise names tend to be longer and a bit awkward. FWIW, I like "binary provenance threats" best of the listed options.

For instance, consider the threat (C) "Build from modified source." This scenario could be interpreted as a build threat since the package received does not match the source of truth. However, it's not necessarily that the source of truth itself is compromised. So, the question arises: is it a source or a build threat?

Yes, I completely agree! (C) can go either way:

* We originally had it as part of "build threats" for the reason you said (matching source of truth).

* For v1.0 ([Update threats.md for v1.0 #732](https://github.com/slsa-framework/slsa/pull/732)), we moved it to "source threats" to refocus "build threats" to just be about reporting accurate provenance, and "source threats" kind of being anything about source. But I agree that it's awkward and I don't love it.

I see (A) as an attacker making unauthorized changes to the official source and (C) as building from unofficial source. I think we're having trouble classifying (C) because we're assuming some notion of official source that isn't a part of the SLSA model.

The shape of my thinking is roughly:

(C) is an attack on that last mapping and only makes sense in a context where that mapping is explicit and maintained. In such a context, I might argue that it's an attack on the publication process rather than the build process or source repo.
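
One way to picture (C) under that framing, as a sketch only: assume an explicit, maintained mapping from package name to its official source repository, and treat a build whose provenance reports any other source as (C). The mapping, the function name `built_from_official_source`, and the repository URLs below are all hypothetical; as noted above, this notion of official source is not part of the SLSA model.

```python
OFFICIAL_SOURCE = {
    # package name -> official source repository (hypothetical entries)
    "example-pkg": "https://git.example/org/example-pkg",
}


def built_from_official_source(package: str, provenance_source: str) -> bool:
    """Return True if the source reported in provenance is the official one.

    A package with no entry in the mapping cannot be checked, so it fails.
    """
    expected = OFFICIAL_SOURCE.get(package)
    return expected is not None and provenance_source == expected


print(built_from_official_source("example-pkg", "https://git.example/org/example-pkg"))  # True
print(built_from_official_source("example-pkg", "https://git.example/attacker/fork"))    # False
```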

It also isn't clear to me how much this classification matters since perhaps this ambiguity is evidence that the source/build/dependencies/etc classification is breaking down.