ossf / scorecard

OpenSSF Scorecard - Security health metrics for Open Source
https://scorecard.dev
Apache License 2.0

Feature: Track SBOM creation and their quality (via sbom-scorecard) #3043

Closed justinabrahms closed 1 year ago

justinabrahms commented 1 year ago

Section: Build Risk Assessment Points:

For our initial version, we will look in GitHub releases for files that match these patterns, as suggested by their communities.
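As a rough sketch of what that release scan could look like: the snippet below filters a list of release asset names against filename globs. The pattern list is a placeholder based on common SPDX and CycloneDX naming conventions, not the list proposed in this issue, and `find_sbom_assets` is a hypothetical helper rather than anything in Scorecard.

```python
from fnmatch import fnmatchcase

# ASSUMPTION: illustrative patterns only, loosely following common SPDX and
# CycloneDX file naming; this is not the list proposed in the issue.
SBOM_PATTERNS = [
    "*.spdx.json",
    "*.spdx",
    "*.cdx.json",
    "*.cdx.xml",
    "bom.json",
    "bom.xml",
]


def find_sbom_assets(asset_names, patterns=SBOM_PATTERNS):
    """Return the asset names that match any pattern, ignoring case."""
    return [
        name
        for name in asset_names
        if any(fnmatchcase(name.lower(), pattern) for pattern in patterns)
    ]


if __name__ == "__main__":
    assets = ["app-1.0.tar.gz", "app-1.0.spdx.json", "BOM.json"]
    print(find_sbom_assets(assets))
```

A filename match like this only says a file that looks like an SBOM was attached to a release; it says nothing about the file's contents or quality.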

Alternatives considered

Context

This issue is intended to replace the work in #2605, because the main poster isn't engaged and I want to be able to edit the issue description.

Known action items

We're going to seek feedback from the SBOM Everywhere and Repos groups.

SBOM Everywhere meets every other Tuesday at 11:05am EST. The invite is available on the OpenSSF Community Calendar. The Repos group meets on Zoom every other Wednesday, alternating between EMEA-friendly (13:00 UTC) and APAC-friendly (22:00 UTC) times.

lucasgonze commented 1 year ago

I have researched SBOM detection for GitHub Actions here: https://gist.github.com/lucasgonze/b84a2c52c697bf8f686a005080c369dd

justinabrahms commented 1 year ago

Open questions for the working groups:

evverx commented 1 year ago

I'm not sure I understand what this check is supposed to accomplish. Why would vendors embedding open-source components into their products, devices and so on use and trust SBOMs generated upstream? What would SBOMs for low-level libraries (usually built and shipped by distributions far from upstream projects) look like? Who is going to be responsible for VEX stuff?

justinabrahms commented 1 year ago

VEX, in my mind, is out of scope for this thread.

The SBOM for a low level library seems identical to a higher level library. In some languages, we may have dependency resolution concerns which will be solved at higher levels, but not all. In that case, I'd expect the SBOM to not include the relevant dependencies if they're not specified.

I don't understand your trust concern. People would trust the SBOM in the same way they trust the source code. SLSA or similar may help with asserting the provenance of the SBOM, but again, that doesn't seem in scope?

What am I missing?

evverx commented 1 year ago

VEX, in my mind, is out of scope for this thread.

I kind of agree. But recently I happened to explain to a project using a certain library at runtime that an upstream bug tracker isn't the best place to dump the output of vulnerability scanners, and that it's generally their responsibility to figure out how CVEs in that dependency affect their particular device. "Bug reports" like that aren't filed very often (because SBOMs aren't widely used), but if SBOMs are somehow kept upstream it becomes much easier to generate noise like that. Anyway, I don't think that SBOMs in a vacuum are particularly useful, so even if something is technically out of scope (but closely related) it should be discussed too.

The SBOM for a low level library seems identical to a higher level library

I think I meant to say "low-level projects" instead of "low-level libraries". Sorry. Upstream projects like that generally have no control over how they are built and distributed, so it isn't clear how scorecard can figure out whether they have SBOMs and how good they are. If they have SBOMs in some places, it doesn't mean that those SBOMs are applicable everywhere, because the projects can be configured, built and shipped in a lot of different ways.

People would trust the SBOM in the same way they trust the source code

That's concerning :-) What I was trying to say is that actual SBOMs describing how certain products are actually assembled don't have to match upstream SBOMs and it would be weird to rely on them. Vendors can't claim that they use some dependencies just because some upstream SBOMs told them so. Their SBOMs should match their products.

evverx commented 1 year ago

I took a look at all the issues related to SBOMs and it seems to me that there is no clear rationale behind this check.

we can scan the SBOMs to enrich the overall scorecard score using 3rd-party packages

I'm not sure I understand what it means.

tells potential users that the developers are trying to help by providing this info

That's nice but it isn't clear how exactly it helps. It's also not clear if there are any actual consumers who have the expertise to actually consume that and what their actual use cases are. If upstream SBOMs don't actually cover them it's effectively useless.

Does the software have an SBOM? (3 points)

It just assumes that it's a good idea.

As far as I can tell, this SBOM movement implicitly targets certain ecosystems with certain registries/package managers. Maybe those implicit assumptions should be spelled out first, because it seems to me that the first thing this check should do is make sure that it only analyzes projects where it makes sense. I think its limitations should be clearly documented too. More generally, before going forward with that, https://github.com/ossf/sbom-everywhere/pull/27/files#diff-84d5c6f22596cd94bba61aea654d191da533d7016ef2682e2805ddcbdfefc397R11 seems like a good idea.

evverx commented 1 year ago

Do we recommend that libraries have sboms? What about if they have unresolved dependencies?

At least JS libraries appear to have been discussed in https://github.com/ossf/sbom-everywhere/issues/24#issuecomment-1396467716. Judging from https://github.com/ossf/sbom-everywhere/issues/24#issuecomment-1397718749, it isn't clear what the point of generating SBOMs upstream is:

So, to repeat back, my packages could add an SBOM stating "nothing is bundled with this library" and that would be an improvement for the ecosystem, versus tooling looking at the bundledDependencies field (which tells you the same information)?
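For context, the "tooling looking at the bundledDependencies field" alternative mentioned in that quote could be as small as the sketch below. It reads a `package.json` document and reports what the package declares as bundled. `bundled_dependencies` is a hypothetical helper name; the handling of both key spellings and of the boolean form reflects my understanding of npm's `package.json` format.

```python
import json


def bundled_dependencies(package_json_text):
    """Return the sorted names of dependencies a package.json declares as bundled."""
    manifest = json.loads(package_json_text)
    # npm accepts both spellings of the key.
    bundled = manifest.get(
        "bundledDependencies", manifest.get("bundleDependencies", [])
    )
    # A boolean true means "bundle everything listed under dependencies".
    if bundled is True:
        return sorted(manifest.get("dependencies", {}))
    return sorted(bundled or [])


if __name__ == "__main__":
    manifest = '{"name": "example", "dependencies": {"left-pad": "^1.3.0"}}'
    # An empty result corresponds to "nothing is bundled with this library".
    print(bundled_dependencies(manifest))
```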

lucasgonze commented 1 year ago

it isn't clear what the point of generating SBOMs upstream is

Generating a good SBOM can take a fair amount of time and trouble when there is vendored code (which often happens with C/C++) or when the package is multi-language. Doing it well - once - is naturally the job of the developer. Otherwise, every downstream user is redoing the work.

It just assumes that (SBOM)'s a good idea.

To figure that out, we need to talk about use cases for SBOMs.

If the use case is evaluating license risk before incorporating a new component, the SBOM needs to be in the release. That's true even for JavaScript libraries on npm.

For other use cases, we need to evaluate them one by one.

evverx commented 1 year ago

Doing it well - once - is naturally the job of the developer

I'm not sure I understand how the developer would know how downstream consumers consume their project. If we're talking about C, for example, it's possible to have a build dependency required for supporting a feature, but since it's loaded at runtime with dlopen, what matters in the end is whether that dependency is actually deployed on an actual device, and nobody apart from the vendors shipping their stuff knows what they do.

Otherwise, every downstream user is redoing the work.

Right. That's what vendors are supposed to do to get SBOMs that actually make sense.

To figure that out, we need to talk about use cases for SBOMs

Agreed. I haven't seen any actual consumers with actual use cases yet though.

If the use case is evaluating license risk before incorporating a new component, the SBOM needs to be in the release

Upstream SBOMs are unreliable in terms of getting the list of actual components. How would anyone evaluate anything based on that?

lucasgonze commented 1 year ago

Agreed. I haven't seen any actual consumers with actual use cases yet though.

Are you generally not persuaded that SBOMs are valuable?

evverx commented 1 year ago

Quite the opposite. I think actual SBOMs covering the actual use cases that actual vendors/manufacturers/... have make sense. I'm trying to figure out how SBOMs in a vacuum (where, for example, VEX stuff isn't important for some reason) can help in practice though.

evverx commented 1 year ago

Just to expand on

I haven't seen any actual consumers with actual use cases yet though.

I meant that I haven't seen actual consumers here. I have seen them in general :-)

justinabrahms commented 1 year ago

Just to expand on

I haven't seen any actual consumers with actual use cases yet though.

I meant that I haven't seen actual consumers here. I have seen them in general :-)

eBay, my previous company, would be an actual consumer. We were building out functionality to import SBOMs into a graph (guac) to do dependency, license, and risk analysis.
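To make "import SBOMs into a graph" a bit more concrete, here is a minimal sketch of the idea. It does not use guac or describe how eBay did it; it assumes a CycloneDX JSON document with a top-level `dependencies` array of `ref`/`dependsOn` entries, and the function names are made up for illustration.

```python
import json


def dependency_graph(cyclonedx_json_text):
    """Build {component ref: set of direct dependency refs} from a CycloneDX JSON BOM."""
    bom = json.loads(cyclonedx_json_text)
    graph = {}
    for entry in bom.get("dependencies", []):
        graph.setdefault(entry["ref"], set()).update(entry.get("dependsOn", []))
    return graph


def transitive_dependencies(graph, root):
    """Return every ref reachable from root by following dependency edges."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for dep in graph.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


if __name__ == "__main__":
    # ASSUMPTION: a tiny hand-written BOM; the refs are placeholders.
    bom_text = json.dumps({
        "bomFormat": "CycloneDX",
        "dependencies": [
            {"ref": "app", "dependsOn": ["lib-a"]},
            {"ref": "lib-a", "dependsOn": ["lib-b"]},
        ],
    })
    graph = dependency_graph(bom_text)
    print(sorted(transitive_dependencies(graph, "app")))
```

A graph built this way is only as accurate as the SBOMs fed into it.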

evverx commented 1 year ago

I wonder what languages/ecosystems were analyzed this way? Why was it decided that that graph reflected reality?

github-actions[bot] commented 1 year ago

Stale issue message