Open giterlizzi opened 1 year ago
Hei!
I've been mulling over this ticket for a while now, and here are a couple of thoughts for your consideration. Please note that some of this is written from memory, so I may be mistaken on some points – please correct me if you find something wrong! (thank you :grin:)
cpanm pkg:cpan/SRI/Mojolicious
distribution name – SRI/Mojolicious-9.35.tar.gz
module name – Mojo::Base
A distribution contains one or more modules, but not necessarily in the same namespace as indicated by the distribution name.
module in the namespace part of the PURL. This is to avoid namespace collision between CPAN ids and module names: pkg:cpan/module/Foo::Bar
02packages.details.txt files on your mirror.
cpan, cpanm, cpm and cpanp. Some tooling uses these indirectly, e.g. carton, carmel and dh-make-perl. Or even from CPAN mirror software like Pinto or CPAN::Mini or App::opan.
.packlist and perllocal.pod files throughout the designated installation tree. These are less than ideal for figuring out the pedigree of an installed module.
(Updated 2024-01-19)
Related, NIST has published a Software Identification Ecosystem Option Analysis where they talk a little about the contexts where PackageURLs may be used. Very useful reflections, and recommended reading.
They specifically look for something they call "Grouping", which they for some reason claim is a "missing feature" in purls. (I may have misunderstood something here).
Not sure of its relevance for this module either, but the idea is out there, so it may be worth considering.
Having thought a little more about this, I'm currently considering the following proposals....
--prefer-cpan parameter to have the tool prioritize downloading dependencies from CPAN, instead of shelling out to apt install libfoo-bar-perl on a Debian system
I guess I'm pretty much echoing what you have already proposed, with the difference of explicitly adding "module" (in lowercase) to the PURL, to make it easily distinguishable from distribution names, which have to be in uppercase, and making a point of having separate API methods that produce each of these explicitly.
So, with this I've been trying to think about it from an "independent" starting point, and basically ended up where you and @mrdvt92 in #2 have arrived.
So for whatever it's worth, I'm happy to stand behind what's here, plus the perspectives in #2. :smiley_cat:
@giterlizzi, I just learned that the PackageURL spec author is working on getting it registered as an ECMA standard. Maybe it's time to get the CPAN bits included?
source: https://youtu.be/B2bVaaeqpAk?si=c7cdfDZCEJkucOic&t=623
By the way!
When it comes to specifying (pre-resolution) dependencies, there's a version-range spec for purl. Should we adopt this at the same time, while we're at it?
https://github.com/package-url/purl-spec/tree/version-range-spec
> Maybe it's time to get the CPAN bits included?
Yes, I think we can start validating the specification described in the first comment (Components and Qualifiers) and open a PR to include it in https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst
Apparently, there's a pull request open already at https://github.com/package-url/purl-spec/pull/155 - maybe worth updating?
Also, I expect to meet the purl author, Philippe Ombredanne, in Brussels tomorrow. If you want, I can ask him what's needed to get this PR merged?
> Apparently, there's a pull request open already at package-url/purl-spec#155 - maybe worth updating?
If you agree I would modify it like this:
cpan for CPAN Perl packages:

- The default repository is https://www.cpan.org/.
- To search CPAN it is recommended to use https://metacpan.org.
- The namespace is optional; it may be used to specify the author name and it must be uppercased.
- The name is the module or distribution name and is case sensitive.
- The version is the module or distribution version.
- Optional qualifiers may include:

  - repository_url: CPAN/MetaCPAN/BackPAN/DarkPAN repository base URL (default is https://www.cpan.org)
  - download_url: URL of package or distribution
  - vcs_url: extra URL for a package version control system
  - ext: file extension (default is tar.gz)

Examples::

    pkg:cpan/Perl-Version@1.013
    pkg:cpan/DateTime@1.55
    pkg:cpan/GDT/URI-PackageURL@2.04
    pkg:cpan/LWP::UserAgent@6.76
    pkg:cpan/OALDERS/libwww-perl@6.76
> Also, I expect to meet the purl author, Philippe Ombredanne, in Brussels tomorrow. If you want, I can ask him what's needed to get this PR merged?
It would be great. Thank you!
> The name is the module or distribution name and is case sensitive.
The more I think about it, I believe only CPAN distributions should be supported and not modules or packages.
A module (.pm file) does not have a 1:1 relationship to a package. A module is a single file with zero or more packages inside it.
I propose to only use /dist/ to match the MetaCPAN URL, e.g. https://metacpan.org/dist/Perl-Version
pkg:cpan/dist/Perl-Version@1.013
If we really must use modules, does each module in a distribution need to be specified?
Since modules don't really have versions, is a checksum=sha:XXXXXX signature mandatory?
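> - The namespace is optional; it may be used to specify the author name and it must be uppercased.

Aaah, no, let's NOT word it like this. Instead, I propose this -

* To refer to a CPAN distribution name, the namespace MUST be present. In this case, the namespace is the CPAN id of the author/publisher. It MUST be written uppercase, followed by '/' and then followed by the distribution name. A distribution name may NEVER contain the string '::'.
* To refer to a CPAN module, the namespace MUST be absent. The module name MAY contain zero or more '::' strings, and the module name MUST NOT contain a '-'.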
Correct examples:
pkg:cpan/Perl::Version@1.013
pkg:cpan/DROLSKY/DateTime@1.55 (distribution name)
pkg:cpan/DateTime@1.55 (module name)
pkg:cpan/GDT/URI-PackageURL
pkg:cpan/LWP::UserAgent
pkg:cpan/OALDERS/libwww-perl@6.76
pkg:cpan/URI (module name)
Incorrect syntax examples:
pkg:cpan/Perl-Version@1.013
pkg:cpan/DateTime@1.55
pkg:cpan/GDT/URI::PackageURL
pkg:cpan/LWP-UserAgent
pkg:cpan/OALDERS/
> pkg:cpan/dist/Perl-Version@1.013

> If we really must use modules, does each module in a distribution need to be specified?
Modules do have versions (see https://www.cpan.org/modules/02packages.details.txt for documentation)
When using a PackageURL to refer to a module, the intention is for an ecosystem-specific tool to resolve which distribution a specific module belongs to. This is already what happens when running cpanm Foo::Bar – the tool downloads 02packages.details.txt and does a lookup there to figure out which distribution to download. This lookup works with packages (defined as namespaces, of which you may have one or more inside a .pm file), with modules (defined as a .pm file with a single package namespace matching the file name), and with distributions (a tarball containing one or more modules or packages).
Note also that a distribution name MUST contain the author's CPAN id to be valid! That's why I'm insisting that a PackageURL referring to a dist must also live up to this. (The reason is that several authors can make releases of the same distribution, and users must later be able to say which of them they want.)
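For illustration, the module-to-distribution lookup described above could be sketched roughly like this in Python. This is only a sketch under assumptions: the index lines in SAMPLE_INDEX are illustrative sample data in the 02packages.details.txt layout (header, blank line, then package name, version, path), and parse_index and module_to_dist_purl are hypothetical helper names, not part of cpanm or any other existing tool.

```python
# Illustrative sample in the 02packages.details.txt layout (not real index data).
SAMPLE_INDEX = """\
File:         02packages.details.txt
Columns:      package name, version, path

DateTime                1.55   D/DR/DROLSKY/DateTime-1.55.tar.gz
LWP::UserAgent          6.76   O/OA/OALDERS/libwww-perl-6.76.tar.gz
Mojo::Base              undef  S/SR/SRI/Mojolicious-9.35.tar.gz
"""

ARCHIVE_SUFFIXES = (".tar.gz", ".tar.bz2", ".tgz", ".zip")


def parse_index(text):
    """Map each package name to (version, distribution path)."""
    # The header block is separated from the entries by one blank line.
    _header, _, body = text.partition("\n\n")
    index = {}
    for line in body.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not "package version path"
        package, version, path = fields
        index[package] = (version, path)
    return index


def module_to_dist_purl(module, index):
    """Resolve a module name to a distribution purl (hypothetical helper)."""
    _version, path = index[module]
    parts = path.split("/")          # e.g. ["S", "SR", "SRI", "Mojolicious-9.35.tar.gz"]
    author, filename = parts[2], parts[-1]
    for suffix in ARCHIVE_SUFFIXES:
        if filename.endswith(suffix):
            filename = filename[: -len(suffix)]
            break
    # Assumes the usual "Name-Version" file naming; real releases may deviate.
    dist_name, _, dist_version = filename.rpartition("-")
    return f"pkg:cpan/{author}/{dist_name}@{dist_version}"


index = parse_index(SAMPLE_INDEX)
print(module_to_dist_purl("Mojo::Base", index))
```

Note that the module purl (pkg:cpan/Mojo::Base) carries no author, while the resolved distribution purl does, which is the distinction argued for above.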
> > - The namespace is optional; it may be used to specify the author name and it must be uppercased.
>
> Aaah, no, let's NOT word it like this. Instead, I propose this -
>
> * To refer to a CPAN distribution name, the namespace MUST be present. In this case, the namespace is the CPAN id of the author/publisher. It MUST be written uppercase, followed by '/' and then followed by the distribution name. A distribution name may NEVER contain the string '::'.
> * To refer to a CPAN module, the namespace MUST be absent. The module name MAY contain zero or more '::' strings, and the module name MUST NOT contain a '-'.
>
> Correct examples:
>
> pkg:cpan/Perl::Version@1.013
> pkg:cpan/DROLSKY/DateTime@1.55 (distribution name)
> pkg:cpan/DateTime@1.55 (module name)
> pkg:cpan/GDT/URI-PackageURL
> pkg:cpan/LWP::UserAgent
> pkg:cpan/OALDERS/libwww-perl@6.76
> pkg:cpan/URI (module name)
>
> Incorrect syntax examples:
>
> pkg:cpan/Perl-Version@1.013
> pkg:cpan/DateTime@1.55
> pkg:cpan/GDT/URI::PackageURL
> pkg:cpan/LWP-UserAgent
> pkg:cpan/OALDERS/
I agree!
@sjn I have added an initial check for the "cpan" purl type
purl-tool pkg:cpan/GDT/URI::PackageURL
ERROR: Invalid Package URL: CPAN 'name' must have the distribution name
purl-tool pkg:cpan/URI-PackageURL
ERROR: Invalid Package URL: CPAN 'name' must have the module name
purl-tool pkg:cpan/G::DT/URI::PackageURL
ERROR: Invalid Package URL: CPAN 'namespace' must have the distribution author
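To make the rules behind these checks concrete, here is a minimal Python sketch of the distribution-versus-module distinction proposed in this thread. It is not how purl-tool is implemented: classify_cpan_purl is a hypothetical name, the error texts are my own, and the CPAN_ID pattern is an assumption (the thread only requires the author id to be uppercase).

```python
import re

# Assumed shape of a CPAN author id; the thread only says "uppercase".
CPAN_ID = re.compile(r"[A-Z0-9-]+")


def classify_cpan_purl(purl: str) -> str:
    """Return 'distribution' or 'module', or raise ValueError if invalid."""
    prefix = "pkg:cpan/"
    if not purl.startswith(prefix):
        raise ValueError("not a cpan purl")
    # Drop qualifiers and subpath, then the optional @version suffix.
    path = re.split(r"[?#]", purl[len(prefix):], maxsplit=1)[0]
    if "@" in path:
        path = path.rpartition("@")[0]
    namespace, _, name = path.rpartition("/")
    if not name:
        raise ValueError("name is required")
    if namespace:
        # Namespace present: this must be AUTHOR/Distribution-Name.
        if not CPAN_ID.fullmatch(namespace):
            raise ValueError("namespace must be an uppercase CPAN id")
        if "::" in name:
            raise ValueError("a distribution name must not contain '::'")
        return "distribution"
    # Namespace absent: this must be a module name such as Foo::Bar.
    if "-" in name:
        raise ValueError("a module name must not contain '-'")
    return "module"


for candidate in ("pkg:cpan/GDT/URI-PackageURL", "pkg:cpan/LWP::UserAgent"):
    print(candidate, "->", classify_cpan_purl(candidate))
```

With these rules, the three purl-tool invocations above would each be rejected as well, although with different messages.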
If we can get a purl-spec PR for this made, we can have it merged lunchtime today! 🤩
> If we can get a purl-spec PR for this made, we can have it merged lunchtime today! 🤩
:smiley:
Changed the specification.
cpan for CPAN Perl packages:

- The default repository is https://www.cpan.org/.
- To search CPAN it is recommended to use https://metacpan.org.
- The namespace:

  - To refer to a CPAN distribution name, the namespace MUST be present. In this case, the namespace is the CPAN id of the author/publisher. It MUST be written uppercase, followed by the distribution name in the name component. A distribution name may NEVER contain the string ::.
  - To refer to a CPAN module, the namespace MUST be absent. The module name MAY contain zero or more :: strings, and the module name MUST NOT contain a -

- The name is the module or distribution name and is case sensitive.
- The version is the module or distribution version.
- Optional qualifiers may include:

  - repository_url: CPAN/MetaCPAN/BackPAN/DarkPAN repository base URL (default is https://www.cpan.org)
  - download_url: URL of package or distribution
  - vcs_url: extra URL for a package version control system
  - ext: file extension (default is tar.gz)

Examples::

    pkg:cpan/Perl::Version@1.013
    pkg:cpan/DROLSKY/DateTime@1.55
    pkg:cpan/DateTime@1.55
    pkg:cpan/GDT/URI-PackageURL
    pkg:cpan/LWP::UserAgent
    pkg:cpan/OALDERS/libwww-perl@6.76
    pkg:cpan/URI
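As a rough illustration of how the qualifiers and their defaults might be used, the following Python sketch derives a download location from a distribution purl. It rests on assumptions that are not part of the spec text above: that the tarball sits directly under the conventional authors/id/A/AU/AUTHOR/ directory and is named Name-Version.ext, and that download_url, when given, simply wins. cpan_download_url is a hypothetical helper name, and darkpan.example.org is a placeholder host.

```python
from urllib.parse import parse_qsl

DEFAULT_REPOSITORY = "https://www.cpan.org"  # default repository_url per the text above
DEFAULT_EXT = "tar.gz"                       # default ext per the text above


def cpan_download_url(purl: str) -> str:
    """Guess the tarball URL for a versioned distribution purl (sketch only)."""
    prefix = "pkg:cpan/"
    if not purl.startswith(prefix):
        raise ValueError("not a cpan purl")
    rest = purl[len(prefix):].split("#", 1)[0]   # drop any subpath
    path, _, query = rest.partition("?")
    qualifiers = dict(parse_qsl(query))          # also percent-decodes values
    if "download_url" in qualifiers:
        return qualifiers["download_url"]
    if "@" not in path:
        raise ValueError("a version is needed to name the tarball")
    path, _, version = path.rpartition("@")
    author, _, name = path.rpartition("/")
    if not author or not name:
        raise ValueError("a distribution purl needs AUTHOR/Name")
    repo = qualifiers.get("repository_url", DEFAULT_REPOSITORY).rstrip("/")
    ext = qualifiers.get("ext", DEFAULT_EXT)
    # Assumed CPAN layout: authors/id/<A>/<AU>/<AUTHOR>/<Name>-<Version>.<ext>
    return (
        f"{repo}/authors/id/{author[0]}/{author[:2]}/{author}/"
        f"{name}-{version}.{ext}"
    )


print(cpan_download_url("pkg:cpan/OALDERS/libwww-perl@6.76"))
```

A module purl such as pkg:cpan/LWP::UserAgent is rejected here on purpose: it first has to be resolved to a distribution through the package index.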
Great! Do you have a PR link I can refer to?
This is the new PR https://github.com/package-url/purl-spec/pull/288
One question;
Is it really necessary to mention MetaCPAN at all?
> One question;
> Is it really necessary to mention MetaCPAN at all?

You mean this?

> To search CPAN it is recommended to use https://metacpan.org.
Congratulations on getting this merged into the spec! :-D
Now the work starts with getting purls supported in other parts of the Perl/CPAN toolchain!
(btw, I've tried to reach out to you on twitter/x; are there better channels for reaching you?)
Package URL

A Package URL (aka "purl") is a URL string used to identify and locate a software package in a mostly universal and uniform way across programming languages, package managers, packaging conventions, tools, APIs and databases.

https://github.com/package-url/purl-spec

A purl is a URL composed of seven components:

scheme:type/namespace/name@version?qualifiers#subpath

Components are separated by a specific character for unambiguous parsing.

The definition of each component is:

- scheme: this is the URL scheme with the constant value of "pkg". One of the primary reasons for this single scheme is to facilitate the future official registration of the "pkg" scheme for package URLs. Required.
- type: the package "type" or package "protocol" such as maven, npm, nuget, gem, pypi, etc. Required.
- namespace: some name prefix such as a Maven groupid, a Docker image owner, a GitHub user or organization. Optional and type-specific.
- name: the name of the package. Required.
- version: the version of the package. Optional.
- qualifiers: extra qualifying data for a package such as an OS, architecture, a distro, etc. Optional and type-specific.
- subpath: extra subpath within a package, relative to the package root. Optional.

Package URL for CPAN Packages
Components

Minimal components:

- type for CPAN Perl packages and distributions is cpan
- name is the module or distribution name and is case sensitive

Optional (but advised) components:

- namespace is the author name. It must be uppercased
- version is the package or distribution version

Qualifiers

Optional qualifiers may include:

- repository_url, CPAN/MetaCPAN/BackPAN/DarkPAN repository base URL (default is https://www.cpan.org)
- download_url, URL of package or distribution
- vcs_url, extra URL for a package version control system
- ext, file extension (default is tar.gz)

Extras

- https://www.cpan.org
- https://metacpan.org
Examples

Minimal "purl" string:

"purl" string with namespace (author) component:

"purl" string with repository_url qualifier:

"purl" string with vcs_url qualifier: