package-url / purl-spec

A minimal specification for purl aka. a package "mostly universal" URL, join the discussion at https://gitter.im/package-url/Lobby
https://github.com/package-url/purl-spec

Debian Repository URL Format should be clarified #306

Open captn3m0 opened 4 months ago

captn3m0 commented 4 months ago

It is unclear what the repository_url should be for a Debian Package.

Debian sources use a special repository format (https://wiki.debian.org/DebianRepository/Format), which uses:

deb uri distribution [component1] [component2] [...]

For example:

deb https://deb.debian.org/debian stable main contrib non-free

This internally translates to a Release file, which lists several Packages.(gz|xz) files for various architectures.
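The one-line format above can be split into its fields roughly as follows. This is a minimal sketch, not apt's own parser: `AptSource` and `parse_source_line` are hypothetical names, and only the `deb` / `deb-src` entry types with an optional bracketed option block are handled.

```python
from dataclasses import dataclass, field


@dataclass
class AptSource:
    """One parsed one-line-style source entry (hypothetical helper type)."""
    kind: str                     # "deb" (binary) or "deb-src" (source)
    uri: str
    distribution: str
    components: list = field(default_factory=list)
    options: dict = field(default_factory=dict)


def parse_source_line(line: str) -> AptSource:
    """Split `deb [options] uri distribution [component ...]` into fields."""
    tokens = line.split()
    if not tokens or tokens[0] not in ("deb", "deb-src"):
        raise ValueError(f"unsupported source line: {line!r}")
    kind = tokens.pop(0)

    # Options such as [arch=amd64] sit between the entry type and the URI.
    options = {}
    if tokens and tokens[0].startswith("["):
        while tokens:
            token = tokens.pop(0)
            closing = token.endswith("]")
            token = token.strip("[]")
            if token:
                key, _, value = token.partition("=")
                options[key] = value
            if closing:
                break

    if len(tokens) < 2:
        raise ValueError("expected at least a URI and a distribution")
    uri, distribution, *components = tokens
    return AptSource(kind, uri, distribution, components, options)


source = parse_source_line(
    "deb https://deb.debian.org/debian stable main contrib non-free"
)
print(source.uri, source.distribution, source.components)
```

A single entry can name several components, which is part of why one URL does not map cleanly onto one package index.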

It is unclear which of these URLs a PURL should use.

https://deb.debian.org/debian alone is ambiguous. The repository_url should carry enough detail to let the user reconstruct the actual Packages.gz file used for the install. It needs to include:

  1. Source vs. binary package
  2. Architecture
  3. Enough URL information to reconstruct the index location.

But there's no standard representation for such a URL in Debian.
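To make the three pieces of information concrete, here is a rough sketch of how an index URL could be rebuilt from them. It assumes the conventional `dists/` layout described on the wiki page linked above (flat repositories do not follow it), and the function names are hypothetical. In practice the Release file is what actually lists the available indexes and their compression formats.

```python
def release_url(base_url: str, distribution: str) -> str:
    """URL of the Release file for one distribution (suite or codename)."""
    return f"{base_url.rstrip('/')}/dists/{distribution}/Release"


def index_url(base_url: str, distribution: str, component: str,
              architecture: str = "", is_source: bool = False,
              compression: str = "gz") -> str:
    """URL of the Packages (binary) or Sources (source) index.

    Assumes the conventional dists/<distribution>/<component>/ layout.
    """
    root = f"{base_url.rstrip('/')}/dists/{distribution}/{component}"
    if is_source:
        return f"{root}/source/Sources.{compression}"
    if not architecture:
        raise ValueError("binary indexes need an architecture")
    return f"{root}/binary-{architecture}/Packages.{compression}"


print(release_url("https://deb.debian.org/debian", "stable"))
print(index_url("https://deb.debian.org/debian", "stable", "main", "amd64"))
print(index_url("https://deb.debian.org/debian", "stable", "main", is_source=True))
```

Note that the base URL alone is not enough: the distribution, component, and architecture (or the source flag) all appear in the final path.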

matt-phylum commented 4 months ago

repository_url itself doesn't need to have enough information to reconstruct the link. There is already a qualifier for the architecture, which has a special value "source" for indicating that it's a source package instead of a binary package. But it looks like the qualifiers for release (stable) and repository (main or contrib or non-free) are missing.

captn3m0 commented 4 months ago

Good point on arch/source being covered already.

"But it looks like the qualifiers for release (stable) and repository (main or contrib or non-free) are missing."

Debian calls them "distribution" and "component".

A line like deb [arch=amd64] https://apt.bell-sw.com/ stable main results in a discovery for the following URLs:

The last two also list the relevant components and suites/codenames. However, since a Release file can cover multiple components, these URLs aren't enough on their own. We could perhaps use https://apt.bell-sw.com/dists/stable/main/?

How about using a separate package_source qualifier that just carries a minimal source line? The limitations would be that there can only be a single "component", and that there should be no arch, as per https://wiki.debian.org/Multiarch/HOWTO#Setting_up_apt_sources.

package_source=deb https://deb.debian.org/debian stable main
package_source=https://apt.bell-sw.com/ stable main

(with spaces encoded properly as %20 in the final PURL).
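The encoding step could look something like the sketch below. `package_source` is only the qualifier name proposed in this comment, not an existing purl qualifier, and leaving `:` and `/` unescaped is an assumption made to match how URLs are written elsewhere in this thread. The helper names are hypothetical.

```python
from urllib.parse import quote, unquote


def encode_package_source(source_line: str) -> str:
    """Render a source line as a `package_source=...` qualifier.

    quote() turns spaces into %20; ':' and '/' are left as-is here
    (an assumption, chosen to keep the URL readable).
    """
    return "package_source=" + quote(source_line, safe=":/")


def decode_package_source(qualifier: str) -> str:
    """Recover the original source line from the qualifier string."""
    key, _, value = qualifier.partition("=")
    if key != "package_source":
        raise ValueError(f"unexpected qualifier key: {key!r}")
    return unquote(value)


encoded = encode_package_source("deb https://deb.debian.org/debian stable main")
print(encoded)
# package_source=deb%20https://deb.debian.org/debian%20stable%20main
print(decode_package_source(encoded))
```

A consumer would still have to parse the decoded line to get at the distribution and component.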

matt-phylum commented 4 months ago

If resolving packages from deb [arch=amd64] https://apt.bell-sw.com/ stable main results in requests to https://apt.bell-sw.com/dists/stable/Release, I don't think it makes sense for the repository_url to be https://apt.bell-sw.com/dists/stable/main/. It could maybe be https://apt.bell-sw.com/dists/stable/, but that seems like it would be weird since the native tools use https://apt.bell-sw.com/.

Why not something like pkg:deb/whatever?architecture=amd64&release=stable&repository=main&repository_url=https://deb.debian.org/debian/? Using package_source would require tools to implement parsers and formatters for apt source lines.
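With separate qualifiers, a consumer only needs ordinary query-string handling to get back to an apt source line. A minimal sketch, assuming the `release` and `repository` qualifier names suggested above (they are a proposal, not part of the spec) and a hypothetical helper name:

```python
from urllib.parse import parse_qsl


def apt_line_from_purl(purl: str) -> str:
    """Build a one-line apt source entry from a purl's qualifiers.

    Assumes the proposed `release` and `repository` qualifiers are present
    alongside `repository_url`.
    """
    _, _, rest = purl.partition("?")
    query, _, _subpath = rest.partition("#")   # drop any #subpath
    qualifiers = dict(parse_qsl(query))

    # The special architecture value "source" marks a source package.
    kind = "deb-src" if qualifiers.get("architecture") == "source" else "deb"
    return " ".join([
        kind,
        qualifiers["repository_url"],
        qualifiers["release"],
        qualifiers["repository"],
    ])


purl = ("pkg:deb/whatever?architecture=amd64&release=stable"
        "&repository=main&repository_url=https://deb.debian.org/debian/")
print(apt_line_from_purl(purl))
# deb https://deb.debian.org/debian/ stable main
```

One caveat: `parse_qsl` decodes `+` as a space, which may not match purl's own decoding rules, so a real implementation would probably want a purl library instead.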

captn3m0 commented 4 months ago

I'd prefer using distribution+component (the same wording Debian uses) instead of release+repository:

pkg:deb/whatever?architecture=amd64&distribution=stable&component=main&repository_url=https://deb.debian.org/debian/

There is obviously some overlap and potential confusion between the distro and distribution keys, but even in default installations there are cases where both are needed:

http://deb.debian.org/debian buster-updates main means a package could have distro=buster&distribution=buster-updates
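As a rough illustration of what such a PURL would look like once assembled, here is a small sketch. The qualifier keys are taken verbatim from the comments above (`distribution` and `component` are proposals), `add_qualifiers` is a hypothetical helper, and leaving `:` and `/` unescaped in values is an assumption that mirrors the examples in this thread. Keys are sorted and empty values are dropped.

```python
from urllib.parse import quote


def add_qualifiers(base_purl: str, **qualifiers: str) -> str:
    """Append qualifiers to a purl, sorted by key, skipping empty values."""
    pairs = [
        f"{key}={quote(value, safe=':/')}"
        for key, value in sorted(qualifiers.items())
        if value
    ]
    return base_purl + ("?" + "&".join(pairs) if pairs else "")


print(add_qualifiers(
    "pkg:deb/whatever",
    architecture="amd64",
    distro="buster",
    distribution="buster-updates",
    component="main",
    repository_url="http://deb.debian.org/debian",
))
# pkg:deb/whatever?architecture=amd64&component=main&distribution=buster-updates&distro=buster&repository_url=http://deb.debian.org/debian
```

The output shows `distro` and `distribution` sitting side by side with different values, which is the case described above.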