package-url / purl-spec

A minimal specification for purl aka. a package "mostly universal" URL, join the discussion at https://gitter.im/package-url/Lobby
https://github.com/package-url/purl-spec

Expected PURL for Conan - specifically namespace field #94

Open steve-thousand opened 3 years ago

steve-thousand commented 3 years ago

I am looking for a suggested format for Conan package urls.

Specifically, I am trying to work out what to put in the namespace field of a purl. There are at least 2 major repos for Conan, https://bintray.com/conan/conan-center and https://bintray.com/bincrafters/public-conan. Conan also has an optional pair of attributes called user and channel that indicate "a forked recipe from the community with changes specific for your company". I figure all of these are significant information for identifying a package, and they all may qualify as "namespace" information.

One example package is 7zip/19.00@_/_. _/_ seems to be a stand-in for no given user/channel. And given that _/_ are the default user/channel and conan-center is the default repo, maybe they don't need to be included in the purl?

pkg:conan/7zip@19.00

That may be simple enough, but another - more complicated - example is cctz/2.3@bincrafters/stable from the bincrafters repo. bincrafters/stable seems to be the user and channel information, and since it appears to be tied to company info, maybe it belongs in the namespace field?

//user/channel in qualifier string?
pkg:conan/cctz@2.2?user=bincrafters&channel=stable

//user/channel as namespace?
pkg:conan/bincrafters%2Fstable/cctz@2.2

Also it is not entirely clear to me if the bincrafters user is a 100% accurate indication that the repo this came from was bincrafters. Is it possible that a reference with @bincrafters/stable could live in either conan-center or some other third party repo? In which case including bincrafters repo information in the purl might be helpful.

//in the namespace? as either a repo shorthand name or the full url?
pkg:conan/bincrafters/cctz@2.2?user=bincrafters&channel=stable
pkg:conan/https%3A%2F%2Fbintray.com%2Fbincrafters%2Fpublic-conan/cctz@2.2?user=bincrafters&channel=stable

//or in the qualifier string?
pkg:conan/cctz@2.2?user=bincrafters&channel=stable&repo=https%3A%2F%2Fbintray.com%2Fbincrafters%2Fpublic-conan
pkg:conan/cctz@2.2?user=bincrafters&channel=stable&repo=bincrafters
pkg:conan/bincrafters%2Fstable/cctz@2.2?repo=https%3A%2F%2Fbintray.com%2Fbincrafters%2Fpublic-conan
pkg:conan/bincrafters%2Fstable/cctz@2.2?repo=bincrafters

It's worth noting as well that there are a few other important dimensions of package identity, such as recipe revision Id, package Id, package revision Id, but all of those seem like the sort of thing that belongs in the qualifier string, though I wouldn't mind a ruling on that either.

stevespringett commented 3 years ago

The type definition for conan hasn't been defined yet. The namespace isn't confined to a single label. It can have '/'.

So it's possible a namespace could include the user and channel separated by a '/' followed by the name of the component.

Also, a namespace segment cannot contain an encoded '/', and the namespace cannot be a URL. So the namespace example with https://.... would be invalid.

Namespaces are defined as:

  • The optional namespace contains zero or more segments, separated by slash '/'
  • Leading and trailing slashes '/' are not significant and should be stripped in the canonical form. They are not part of the namespace
  • Each namespace segment must be a percent-encoded string
  • When percent-decoded, a segment:
    • must not contain a '/'
    • must not be empty
  • A URL host or Authority must NOT be used as a namespace. Use instead a repository_url qualifier. Note however that for some types, the namespace may look like a host.

So one option would be to have a purl such as:

pkg:conan/bincrafters/stable/cctz@2.2

In this example, the namespace bincrafters/stable has two segments, the user and channel.
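The namespace rules quoted above can be checked mechanically. What follows is only a minimal sketch, not code from the purl spec or any purl library; the function name `canonical_namespace` is made up for illustration. It strips leading and trailing slashes, then rejects segments that are empty or that decode to a string containing '/'. The rule about URL hosts is a matter of intent and is not checked here, although a percent-encoded URL is still rejected because it decodes to something containing '/'.

```python
from urllib.parse import unquote


def canonical_namespace(raw: str) -> str:
    """Validate a percent-encoded purl namespace (hypothetical helper).

    Returns the namespace with leading/trailing slashes stripped, or
    raises ValueError if a segment breaks the rules quoted above.
    """
    stripped = raw.strip("/")
    if not stripped:
        return ""  # the namespace is optional, so empty is allowed
    for segment in stripped.split("/"):
        decoded = unquote(segment)
        if decoded == "":
            raise ValueError("empty namespace segment")
        if "/" in decoded:
            raise ValueError(
                f"segment {segment!r} contains '/' once percent-decoded"
            )
    return stripped
```

With this sketch, `canonical_namespace("bincrafters/stable")` is accepted as two segments, while `canonical_namespace("bincrafters%2Fstable")` raises `ValueError` because the single segment contains a '/' once decoded.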

Anyway, this is just one example. We obviously need to account for all possible scenarios for Conan which, honestly, I'm not too familiar with.

steve-thousand commented 3 years ago

I have an update on this, related to some feedback I got on a somewhat similar conan github issue.

Short answer is: repository probably should not be part of the conan PURL.

It seems that conan does not record the repository that a package came from in its lockfiles, and that is intentional because conan aims to be decentralized. For a given constraint or package ref, the client iterates over a list of remote repos defined by the user until it finds one that can provide the specified package. To guarantee that you keep getting the same package as defined in the lockfile, conan has a "revision" feature that gives each recipe and package a unique hash based on its contents. As long as a build with the same binary exists in one of the remote repositories, it is expected to be the desired dependency, and conan doesn't care which repo it comes from.
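To make the lookup described above concrete, here is a toy model of it. This is not Conan's client code: the function `resolve`, the `contents` mapping and the remote names are all assumptions made up for illustration. It only shows why a revision-pinned reference identifies the package regardless of which remote ends up serving it.

```python
from typing import Dict, List, Optional, Set


def resolve(ref: str, remotes: List[str],
            contents: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first remote, in configured order, that provides `ref`.

    `contents` is a stand-in for what each remote hosts; a real client
    would query the remotes over the network instead.
    """
    for remote in remotes:
        if ref in contents.get(remote, set()):
            return remote
    return None  # no configured remote provides this reference


# Hypothetical setup: the same pinned reference is hosted on two remotes.
remotes = ["conan-center", "bincrafters"]
contents = {
    "conan-center": {"lib/1.0@conan/stable#RREV"},
    "bincrafters": {"lib/1.0@conan/stable#RREV", "cctz/2.2@bincrafters/stable"},
}
```

In this model the pinned reference resolves to whichever remote is listed first, and swapping the order of `remotes` changes the remote but not the identity of the package.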

And so, given the decentralized nature of conan, it seems that a conan package-url should probably never have a repository. Given their guide on how package revisions are referenced, I think that the following package-urls would probably be desired:

// lib/1.0@conan/stable (name=lib, version=1.0, user=conan, channel=stable)
// refers to the latest revision
pkg:conan/conan/stable/lib@1.0

// lib/1.0@conan/stable#RREV (for specific recipe revision)
pkg:conan/conan/stable/lib@1.0?recipe_revision=RREV

// lib/1.0@conan/stable#RREV:PACKAGE_ID (for specific recipe revision and package ID)
// FYI package_id is another dimension: a hash of the system/build requirements, e.g. windows/linux, x86/x64...
pkg:conan/conan/stable/lib@1.0?recipe_revision=RREV&package_id=PACKAGE_ID

// lib/1.0@conan/stable#RREV:PACKAGE_ID#PREV (for specific recipe revision and package ID and package revision)
pkg:conan/conan/stable/lib@1.0?recipe_revision=RREV&package_id=PACKAGE_ID&package_revision=PREV

It is also worth mentioning that in conan 1.X revisions are turned off by default. In that case RREV and PREV can both be 0, which I believe means the latest revision is inferred:

// lib/1.0@conan/stable#0:PACKAGE_ID#0
pkg:conan/conan/stable/lib@1.0?recipe_revision=0&package_id=PACKAGE_ID&package_revision=0

My understanding of conan is still evolving though, so this is TBD.
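The mapping proposed in this comment could be sketched roughly as follows. This is an illustration only: the function name `conan_ref_to_purl` and the regular expression are assumptions, and the qualifier names `recipe_revision`, `package_id` and `package_revision` are the ones suggested in this thread, not a ratified conan type definition. The sketch also drops the default `_/_` user/channel, as floated in the first comment, and emits qualifiers in the order used in the examples above, whereas the canonical purl form sorts qualifiers by key.

```python
import re
from urllib.parse import quote

# name/version[@user/channel][#RREV[:PACKAGE_ID[#PREV]]]
_REF = re.compile(
    r"^(?P<name>[^/@#:]+)/(?P<version>[^/@#:]+)"
    r"(?:@(?P<user>[^/@#:]+)/(?P<channel>[^/@#:]+))?"
    r"(?:#(?P<rrev>[^#:]+)"
    r"(?::(?P<package_id>[^#:]+)"
    r"(?:#(?P<prev>[^#:]+))?)?)?$"
)


def conan_ref_to_purl(ref: str) -> str:
    """Convert a Conan reference to the purl form proposed in this thread."""
    match = _REF.match(ref)
    if match is None:
        raise ValueError(f"not a recognised Conan reference: {ref!r}")
    parts = match.groupdict()

    # user/channel become two namespace segments; the default _/_ is omitted.
    namespace = ""
    if parts["user"] and (parts["user"], parts["channel"]) != ("_", "_"):
        namespace = (
            f"{quote(parts['user'], safe='')}/"
            f"{quote(parts['channel'], safe='')}/"
        )

    purl = (
        f"pkg:conan/{namespace}{quote(parts['name'], safe='')}"
        f"@{quote(parts['version'], safe='')}"
    )

    # Qualifier names as proposed above (assumed, not standardised).
    qualifiers = [
        (key, parts[group])
        for key, group in (
            ("recipe_revision", "rrev"),
            ("package_id", "package_id"),
            ("package_revision", "prev"),
        )
        if parts[group]
    ]
    if qualifiers:
        purl += "?" + "&".join(
            f"{key}={quote(value, safe='')}" for key, value in qualifiers
        )
    return purl
```

For example, `conan_ref_to_purl("lib/1.0@conan/stable#RREV")` returns `pkg:conan/conan/stable/lib@1.0?recipe_revision=RREV`, and `conan_ref_to_purl("7zip/19.00@_/_")` returns `pkg:conan/7zip@19.00`.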