ossf / sbom-everywhere

Improve Software Bill of Materials (SBOM) tooling and training to encourage adoption
Apache License 2.0

SBOM standards for ESM Modules in the JS Ecosystem #24

Open Grunet opened 1 year ago

Grunet commented 1 year ago

There are several different ways to publish a module in the JS ecosystem for re-use elsewhere.

Probably the most common by volume these days is modules published to npm in the CommonJS (CJS) format. These tend to come with a package.json file that can hold structured metadata about the package.

However, the current web standard for publishing modules is the ECMAScript Modules (ESM) format. As far as I know, these don't necessarily come with a dedicated separate file that holds structured metadata about the package (e.g. any package published to Deno's 3rd party package registry at https://deno.land/x).

It would be nice if there were a (web or otherwise) standard way to provide structured metadata about these packages, so that SBOM information could be included.

Grunet commented 1 year ago

*Wasn't super sure what to write, so let me know if anything needs more detail.

stevespringett commented 1 year ago

To my knowledge, ESM doesn't have a packaging format; it's just a plain JavaScript file with exports.

Wouldn't something like pkg:generic/react@17.0.2?download_url=https://esm.sh/react@17.0.2 cover these cases?
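For illustration, here is a minimal TypeScript sketch of assembling that kind of identifier. `buildGenericPurl` is a hypothetical helper, not an API from any purl library, and it leaves the download_url value unencoded so the result matches the string written above; a strict purl implementation may percent-encode qualifier values.

```typescript
// Hypothetical helper (not from any purl library): builds a "generic"
// package URL that records where the module was downloaded from.
function buildGenericPurl(name: string, version: string, downloadUrl: string): string {
  // Name and version are percent-encoded. The qualifier value is kept
  // as-is, which is an assumption made to mirror the example in the thread.
  const encodedName = encodeURIComponent(name);
  const encodedVersion = encodeURIComponent(version);
  return `pkg:generic/${encodedName}@${encodedVersion}?download_url=${downloadUrl}`;
}

const reactPurl = buildGenericPurl("react", "17.0.2", "https://esm.sh/react@17.0.2");
console.log(reactPurl);
// prints "pkg:generic/react@17.0.2?download_url=https://esm.sh/react@17.0.2"
```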

Grunet commented 1 year ago

Yeah for anything using a "trusted" ESM converter CDN like esm.sh, jspm.io, or skypack.dev, I think that would work for anything they're converting that already has structured metadata.

But at least in Deno's case you can publish modules totally independent of those. Like I made this tiny one https://deno.land/x/sendbeacon_polyfill@0.0.1 without publishing it to npm or anywhere else (source is at https://github.com/Grunet/sendbeacon-polyfill) and then directly used that in a different project.

I don't really know where I would've put provenance metadata or things like that, or how my other project could've checked on those.

stevespringett commented 1 year ago

Support for Deno modules may make sense. Perhaps something like:

pkg:deno/sendbeacon_polyfill@0.0.1
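A rough sketch of how tooling might map such an identifier back to the registry URL mentioned earlier. The `deno` purl type is only the proposal floated here, not a registered type, and `denoRegistryUrl` is a hypothetical helper that handles just the simple name@version form.

```typescript
// Hypothetical: maps a proposed "pkg:deno/<name>@<version>" identifier to
// the deno.land/x URL format used earlier in this thread.
function denoRegistryUrl(purl: string): string {
  const prefix = "pkg:deno/";
  if (!purl.startsWith(prefix)) {
    throw new Error(`Not a pkg:deno identifier: ${purl}`);
  }
  const rest = purl.slice(prefix.length);
  // The version follows the last "@"; both name and version must be non-empty.
  const at = rest.lastIndexOf("@");
  if (at <= 0 || at === rest.length - 1) {
    throw new Error(`Expected <name>@<version> in: ${purl}`);
  }
  const name = rest.slice(0, at);
  const version = rest.slice(at + 1);
  return `https://deno.land/x/${name}@${version}`;
}

const polyfillUrl = denoRegistryUrl("pkg:deno/sendbeacon_polyfill@0.0.1");
console.log(polyfillUrl);
// prints "https://deno.land/x/sendbeacon_polyfill@0.0.1"
```

Note that this only recovers the registry location. It says nothing about the GitHub repository the registry cached the module from, which is the provenance concern raised below.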

However, I'm really puzzled about their design decision here, specifically:

"It caches releases of open-source modules stored on GitHub and serves them at an easy-to-remember domain."

"Deno can import modules from any location on the web, like GitHub, a personal webserver, or a CDN like esm.sh, Skypack, jspm.io or jsDelivr."

"Name squatting is not allowed on the deno.land/x/. If you feel that a module is not currently usable, has not been legitimately under development for more than 90 days, and you have a concrete proposal to publish a well-maintained module in its place, please contact support."

Deno therefore hides provenance and potentially allows a named package to change.

I'd like to get some additional community feedback.

ljharb commented 1 year ago

ESM absolutely has a packaging format - normal npm packages can contain CJS (which can be required or imported), or ESM (which can only be imported), or both.

Other than that, and "a URL that serves a JS file", what other packaging formats with fixed contents exist?

Grunet commented 1 year ago

Admittedly I haven't really thought it through, but I thought it would be nice if there was some dedicated file that tools could look for to find the extra metadata.

Deno already sort of does that when looking for type declaration files for a JS-only ESM module (per https://deno.land/manual@v1.29.4/advanced/typescript/types#providing-types-when-importing)

It seems like if there was some general analog to this, it could be used for SBOM data too.
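To make the idea concrete, here is a purely hypothetical sketch of what such an analog could look like, loosely modeled on how Deno can learn a module's type declaration location from a response header. Nothing here exists today: the `x-sbom-location` header, the `resolveSbomUrl` helper, and the `mod.ts` and `sbom.cdx.json` file names are all assumptions made up for illustration.

```typescript
// Hypothetical header name; no runtime or registry defines this today.
const SBOM_HEADER = "x-sbom-location";

// Given the URL a module was fetched from and the response headers that
// came with it, return the absolute URL of its SBOM, if one is advertised.
function resolveSbomUrl(
  moduleUrl: string,
  responseHeaders: Record<string, string>,
): string | undefined {
  for (const [key, value] of Object.entries(responseHeaders)) {
    // Header names are case-insensitive, so compare in lower case.
    if (key.toLowerCase() === SBOM_HEADER) {
      // Relative locations are resolved against the module's own URL.
      return new URL(value, moduleUrl).href;
    }
  }
  return undefined;
}

const sbomUrl = resolveSbomUrl(
  "https://deno.land/x/sendbeacon_polyfill@0.0.1/mod.ts", // assumed entry file
  { "X-SBOM-Location": "./sbom.cdx.json" }, // assumed header and file name
);
console.log(sbomUrl);
```

A fixed well-known file name next to the module would be another option; the header approach is shown only because it mirrors the type declaration lookup mentioned above.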

ljharb commented 1 year ago

I think the most important ecosystem by scale - across all languages - is npm. If you cover npm, you've covered many, many nines of JS code, whether it's CJS or ESM, and I'd bet most URL-accessed ES Modules are served out of an npm package anyway.

fwiw I maintain a significant percentage of npm's downloads, and I'd be happy to add SBOMs to my packages if that would add value - but I'm still waiting for both a demonstration of that value, and, a (JS-based) tool that makes it easy to validate and create the required metadata.

stevespringett commented 1 year ago

@ljharb Historically, most SBOM adopters have been generating them using https://www.npmjs.com/package/@cyclonedx/bom for NPM projects. That project is mostly deprecated and is superseded by https://www.npmjs.com/package/@cyclonedx/cyclonedx-npm. There are also webpack plugins available, and we're working on yarn and pnpm as well. https://www.npmjs.com/package/@appthreat/cdxgen is also an option.

To your question on value... It really depends on the type of analysis someone wants to perform on the SBOM. For projects that use one of your libraries, and look at the SBOM, the dependencies listed may or may not be applicable to their application. When they pull in one of your libraries, there's no guarantee they will get the same versions of the same transitive components. IMO, the biggest advantage for NPM is being able to describe bundled dependencies. If one of your libraries bundles another library, this is a perfect reason why someone would value an SBOM from NPM. Likewise, if your libraries do not bundle additional libraries, that is just as important and just as valuable in an SBOM.

ljharb commented 1 year ago

My projects don't bundle any deps (virtually nobody does this in a package but npm itself, I'm pretty sure), and they don't ship a lockfile (also something virtually no packages do, since it's user-hostile by preventing deduping and free-flowing security/bug updates). What would an SBOM add in this case, where the lack of guarantees there is a very, very intentional feature of the ecosystem?

stevespringett commented 1 year ago

"What would an SBOM add in this case, where the lack of guarantees there is a very, very intentional feature of the ecosystem?"

Not much. From a dependency resolution and inventory perspective, SBOM for many libraries does not make a ton of sense.

However, being able to state in an SBOM that a library either includes or does not include other libraries provides a lot of value. The NTIA Minimum Elements allow vendors to specify known unknowns. In other words, a vendor is able to state "I do not know what else is in this thing." For NPM libraries that produce SBOMs, those same vendors would be able to state either a) that nothing else is in that library, or b) that something is in that library, both of which are better than stating that they do not know.
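As a rough sketch of what cases a) and b) could look like in practice, the TypeScript below derives a minimal CycloneDX-style JSON object from a package manifest. `toMinimalBom`, `bundledNames`, and the `example-lib` and `left-helper` packages are hypothetical, the output is not validated against the CycloneDX schema, and the use of a `compositions` entry with `aggregate: "complete"` to express "nothing else is in here" is my reading of that element rather than a confirmed recipe.

```typescript
// Subset of package.json fields relevant here. npm honors both spellings
// of the bundle field, and it may be a list of names or a boolean.
interface PackageManifest {
  name: string;
  version: string;
  dependencies?: Record<string, string>;
  bundleDependencies?: string[] | boolean;
  bundledDependencies?: string[] | boolean;
}

// Names of the dependencies that ship inside the published tarball.
function bundledNames(manifest: PackageManifest): string[] {
  const field = manifest.bundleDependencies ?? manifest.bundledDependencies ?? false;
  if (field === true) return Object.keys(manifest.dependencies ?? {}); // bundle all
  if (field === false) return []; // bundle none
  return [...field];
}

// Hypothetical helper: an empty components list plus a "complete"
// composition is meant to say "nothing is bundled" (case a); a non-empty
// list names what is bundled (case b). Unscoped package names only.
function toMinimalBom(manifest: PackageManifest) {
  const rootRef = `pkg:npm/${manifest.name}@${manifest.version}`;
  return {
    bomFormat: "CycloneDX",
    specVersion: "1.4",
    version: 1,
    metadata: {
      component: { type: "library", "bom-ref": rootRef, name: manifest.name, version: manifest.version },
    },
    components: bundledNames(manifest).map((name) => ({ type: "library", name })),
    compositions: [{ aggregate: "complete", assemblies: [rootRef] }],
  };
}

// Case a): no bundle field, so nothing ships inside the package.
const plainBom = toMinimalBom({ name: "example-lib", version: "1.0.0" });
// Case b): one dependency is bundled into the tarball.
const bundledBom = toMinimalBom({
  name: "example-lib",
  version: "1.0.0",
  bundledDependencies: ["left-helper"],
});
console.log(JSON.stringify(plainBom, null, 2));
console.log(bundledBom.components.map((c) => c.name));
```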

ljharb commented 1 year ago

So, to repeat back, my packages could add an SBOM stating "nothing is bundled with this library" and that would be an improvement for the ecosystem, versus tooling looking at the bundledDependencies field (which tells you the same information)?