package-url / purl-spec

A minimal specification for purl aka. a package "mostly universal" URL, join the discussion at https://gitter.im/package-url/Lobby
https://github.com/package-url/purl-spec

[Proposal] deno, jsr, and esm.sh #302

Open prabhu opened 6 months ago

prabhu commented 6 months ago

In the Deno runtime, it is possible to import packages directly from an HTTPS URL.

Examples:

https://deno.land/x/fresh@1.6.1/dev.ts
https://esm.sh/twind@0.16.19
https://esm.sh/axios@1.3.2?target=es2022
https://esm.sh/*@preact/signals-core@1.5.0
jsr:@std/toml
https://deno.land/std@0.177.0
https://unpkg.com/xyz-lib@v0.9.0/lib.ts

I think these warrant new types deno, jsr, esmsh, and unpkg, since the transformed code is often different from the versions published on npm.

Purls for the above examples might look like this:

pkg:deno/fresh@1.6.1#dev.ts
pkg:esmsh/twind@0.16.19
pkg:esmsh/axios@1.3.2?target=es2022
pkg:esmsh/%40preact/signals-core@1.5.0 (*@ becomes @, which is percent-encoded as %40)
pkg:jsr/%40std/toml
pkg:deno/std@0.177.0 (deno/std and jsr/@std are treated differently even though they could be the same code)
pkg:unpkg/xyz-lib@v0.9.0#lib.ts
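
The mapping above can be sketched in code. This is only an illustration of how the proposed purls might be assembled from their parts: the `PurlParts` interface and `formatPurl` function are hypothetical names, and the encoding is simplified (each segment goes through `encodeURIComponent`), which may not match the purl spec's per-component rules in every case.

```typescript
// Hypothetical shape for the parts of a purl; field names follow the
// component names used by purl (type, namespace, name, version, ...).
interface PurlParts {
  type: string;
  namespace?: string;
  name: string;
  version?: string;
  qualifiers?: Record<string, string>;
  subpath?: string;
}

// Assemble a purl string. Segments are encoded one at a time, so the "@" of a
// scope such as "@preact" becomes "%40" while "/" separators are preserved.
function formatPurl(p: PurlParts): string {
  const encodePath = (s: string): string =>
    s.split("/").map(encodeURIComponent).join("/");

  let out = `pkg:${p.type}/`;
  if (p.namespace) out += `${encodePath(p.namespace)}/`;
  out += encodeURIComponent(p.name);
  if (p.version) out += `@${encodeURIComponent(p.version)}`;

  // Qualifiers are emitted sorted by key so the output is deterministic.
  const q = p.qualifiers ?? {};
  const keys = Object.keys(q).sort();
  if (keys.length > 0) {
    out += "?" + keys.map((k) => `${k}=${encodeURIComponent(q[k])}`).join("&");
  }

  if (p.subpath) out += `#${encodePath(p.subpath)}`;
  return out;
}

// Two of the examples from the list above.
console.log(formatPurl({ type: "deno", name: "fresh", version: "1.6.1", subpath: "dev.ts" }));
// pkg:deno/fresh@1.6.1#dev.ts
console.log(formatPurl({ type: "esmsh", namespace: "@preact", name: "signals-core", version: "1.5.0" }));
// pkg:esmsh/%40preact/signals-core@1.5.0
```

The type names deno, esmsh, jsr, and unpkg are the ones proposed in this comment, not registered purl types.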

matt-phylum commented 6 months ago

Is there a difference between pkg:deno and pkg:esmsh? Deno can potentially load from any host, and none of the host names here are valid PURL package types. In the documentation, they show importing from x.nest.land, which isn't representable by this scheme. It'd probably be better to do pkg:deno/twind@0.16.19?repository_url=https://esm.sh?

pkg:unpkg is particularly problematic because unpkg is a well-known JavaScript CDN and here it means something else.

prabhu commented 6 months ago

esm.sh is a special service that seems to transform the input with tree-shaking, bundling, etc. So I would recommend treating it as a dedicated package type to keep things easy.

Example:

Given the code below:

import React from "https://esm.sh/react@18.2.0";

the esm.sh service seems to serve the below:

/* esm.sh - react@18.2.0 */
export * from "https://esm.sh/stable/react@18.2.0/es2022/react.mjs";
export { default } from "https://esm.sh/stable/react@18.2.0/es2022/react.mjs";

Upon inspecting the first url, it is clear that it is a custom transformed version of the npm package that starts with the following comment representing the bundling tool and settings used.

/* esm.sh - esbuild bundle(react@18.2.0) es2022 production */

matt-phylum commented 6 months ago

Is esm.sh any different in terms of usage? It seems like the dependencies are expressed and resolved and used the same as far as the runtime is concerned. It has the same thing as unpkg going on where it just reexposes mutated versions of packages that are better known by another name on npm.

Maybe pkg:deno is wrong. esm.sh is just ecmascript modules being loaded over HTTP, and so is deno.land. Both are usable from a browser as well as Deno, although if you import deno.land modules they're likely to depend on Deno and not work in a browser environment. In which case, it's probably better to use pkg:esm.

There's another problem here: this defines a package-oriented ecmascript module specification system when there is no such package-oriented ecmascript module loading system. Tools like cdxgen would need to have an understanding of every package source to reverse engineer packages from the URLs. For example, https://deno.land/x/fresh@1.6.1/dev.ts uses the pattern https://deno.land/x/<PACKAGE>@<VERSION>/<SUBPATH> with an /x/ component, and https://cdn.skypack.dev/pin/clipboard@v2.0.6-eJjsV6JYCJOnOsq3YfEc/min/clipboard.js uses the pattern https://cdn.skypack.dev/pin/<PACKAGE>@v<VERSION>-<HASH>/min/<SUBPATH> (there are at least two more patterns for skypack). Modules that are not contained in packages on well-known CDNs would have to be ignored.
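
To make the maintenance cost concrete, here is a rough sketch of the per-host pattern table a tool would need. It covers only the two patterns quoted above. The names `ModuleRef`, `PATTERNS`, and `parseModuleUrl` are hypothetical, the skypack pattern assumes the trailing hash never contains a hyphen, and scoped package names are not handled.

```typescript
// Hypothetical result type: what a tool could recover from a module URL.
interface ModuleRef {
  host: string;
  name: string;
  version: string;
  subpath: string;
}

// One path pattern per known host. Capture groups: 1 = package,
// 2 = version, 3 = subpath.
const PATTERNS: Record<string, RegExp> = {
  // https://deno.land/x/<PACKAGE>@<VERSION>/<SUBPATH>
  "deno.land": /^\/x\/([^@\/]+)@([^\/]+)\/(.+)$/,
  // https://cdn.skypack.dev/pin/<PACKAGE>@v<VERSION>-<HASH>/min/<SUBPATH>
  // Assumption: the hash is the text after the last "-" in that segment.
  "cdn.skypack.dev": /^\/pin\/([^@\/]+)@v([^\/]+)-[^\/-]+\/min\/(.+)$/,
};

// Returns null for anything that is not a URL on a known host in a known
// layout, which is the "would have to be ignored" case described above.
function parseModuleUrl(specifier: string): ModuleRef | null {
  let url: URL;
  try {
    url = new URL(specifier);
  } catch {
    return null; // not an absolute URL
  }
  const pattern = PATTERNS[url.hostname];
  if (!pattern) return null;
  const m = pattern.exec(url.pathname);
  if (!m) return null;
  return { host: url.hostname, name: m[1], version: m[2], subpath: m[3] };
}

console.log(parseModuleUrl("https://deno.land/x/fresh@1.6.1/dev.ts"));
console.log(
  parseModuleUrl(
    "https://cdn.skypack.dev/pin/clipboard@v2.0.6-eJjsV6JYCJOnOsq3YfEc/min/clipboard.js",
  ),
);
```

Every additional CDN, and every additional URL layout on an existing CDN, means another entry in the table, and any URL that matches no entry yields no package at all.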

prabhu commented 6 months ago

Perhaps we could use npm with a URL qualifier for non-mutating, CDN-only services? Services like esm.sh, where we definitely know that the input package got modified, could get their own strong types based on parts of their domain name. The CycloneDX specification supports a range of external references, such as distribution-intake and distribution. With these reference types, we can capture the source and CDN URLs and leave the parsing to the downstream tools for any unknown CDN services.

matt-phylum commented 6 months ago

Creating more types seems like it just creates more types. What is the value of pkg:esmsh, pkg:skypack, pkg:deno, pkg:jsr, pkg:unpkg, pkg:cdnjs, and pkg:jsdelivr?

prabhu commented 6 months ago

The JavaScript ecosystem is messy. I am only a messenger.

matt-phylum commented 6 months ago

Isn't it basically the same as Go? Go packages can be hosted on different services but hosting on GitHub or GitLab or cs.opensource.google doesn't change the package type.

prabhu commented 6 months ago

It is not. deno and jsr can host both TypeScript and JavaScript code. npm can host only JavaScript, with TypeScript code usually transpiled before publishing. esm.sh takes an input package and transforms it on the fly to create a new package.

If a vulnerability is filed against a deno.land package, it may or may not affect the npm version. If the bundler tool used by esm.sh injects some vulnerable code, it may not affect the original deno/npm/jsr package.

matt-phylum commented 6 months ago

NPM can host typescript code just fine. NPM doesn't care if the package is runnable at all. I've seen NPM packages that contain only video files.

The difference between "deno" esm imports (which includes browsers) and NPM is how the code is obtained. If you use an es module HTTP import, the file is loaded from an arbitrary HTTP location. If you use a path-based es module import or require statement, the code has to have already gotten there somehow, and if it's third-party code it will have likely come from NPM. In either case, you can use a bundler to build the code into your application.

> If a vulnerability is filed against a deno.land package, it may or may not affect the npm version. If the bundler tool used by esm.sh injects some vulnerable code, it may not affect the original deno/npm/jsr package.

deno.land and npm are not the same packages. There is no concept of an npm version. deno.land packages are tagged Git commits, similar to Go, except that only GitHub is supported.

esm.sh, skypack, jspm.io, jsdelivr, jsr, unpkg, and cdnjs all can and likely do modify the files from the original npm package before sending them. Each is a distinct distribution of a package, but they are not unique packages and they are not unique registries, nor are they retrieved and loaded differently, so they should not be unique package types.

There may be value in using pkg:esm for generic esm imports (still problematic because esm is not package-oriented) and in detecting when pkg:npm is applicable because of a CDN (risky and expensive in terms of accuracy, consistency, and maintainability, since there are so many CDNs and they change over time). In that case the purl would carry a qualifier that identifies the CDN service that is modifying the package, in addition to the file path that is actually being retrieved. However, there may be issues with using the standard URL qualifiers, since you can't normally just retrieve the package from one of these services and the subpath is important.

adamgreg commented 1 month ago

I currently have to run my own script to build SBOM content for my Deno dependencies. I can trace imports from https://esm.sh back to an npm purl. I can trace imports from https://deno.land/x back to a github purl after querying https://apiland.deno.dev .
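
For illustration, the esm.sh half of that tracing might look roughly like the sketch below. This is not my actual script: `esmShToNpmPurl` is a hypothetical name, it only understands URLs shaped like the examples earlier in this thread (an optional `*` marker, an optional scope, then name@version and an optional subpath), and it drops esm.sh build options such as ?target=es2022. The deno.land/x lookup is omitted because it depends on an API query.

```typescript
// Map an esm.sh import URL to an npm purl, or return null when the URL does
// not look like https://esm.sh/[*][@scope/]name@version[/subpath].
function esmShToNpmPurl(specifier: string): string | null {
  let url: URL;
  try {
    url = new URL(specifier);
  } catch {
    return null; // not an absolute URL
  }
  if (url.hostname !== "esm.sh") return null;

  // Strip the leading "/" and the optional "*" marker before the package name.
  const path = url.pathname.replace(/^\/\*?/, "");

  // Groups: 1 = scope (optional), 2 = name, 3 = version, 4 = subpath (optional).
  const m = /^(?:(@[^\/]+)\/)?([^@\/]+)@([^\/]+)(?:\/(.+))?$/.exec(path);
  if (!m) return null;
  const [, scope, name, version, subpath] = m;

  // The query string (for example ?target=es2022) is an esm.sh build option,
  // so it is deliberately not carried over to the npm purl.
  let purl = "pkg:npm/";
  if (scope) purl += `${encodeURIComponent(scope)}/`;
  purl += `${name}@${version}`;
  if (subpath) purl += `#${subpath}`;
  return purl;
}

console.log(esmShToNpmPurl("https://esm.sh/axios@1.3.2?target=es2022"));
// pkg:npm/axios@1.3.2
console.log(esmShToNpmPurl("https://esm.sh/*@preact/signals-core@1.5.0"));
// pkg:npm/%40preact/signals-core@1.5.0
```

The resulting purl names the npm package that esm.sh took as input; as discussed above, the code esm.sh actually serves may have been transformed.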

However, I don't have a good solution for dependencies from https://jsr.io , which is a completely separate registry. Where there is provenance in Sigstore Rekor I can link back to a github purl, but I would be keen on a jsr type.