Open TG1999 opened 8 months ago
I think the v is part of the version in Go and needs to be present. https://github.com/golang/go/issues/32945 If purls were written without the v and then Go started doing something different, it would break all purl implementations.
@TG1999 @matt-phylum go is a mess in this domain. :smiling_imp:
I'm inclined to accept the versions as they are with their v prefix, but then these are no longer the semver versions that Go modules promised, short of stripping the leading v.
So we need to agree on a canonical way (and document this preferred canonical way in the types doc) and also accept that, unfortunately, tools will have to deal with both prefixed and unprefixed versions, and will need to strip the prefix to compare versions properly in all cases and also to query some databases.
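To make the "strip the prefix to compare" step concrete, here is a minimal sketch of what a tool might do. The helper names strip_v and release_key are hypothetical (they are not part of any purl library), and the sort key assumes plain MAJOR.MINOR.PATCH versions with no pre-release or build metadata.

```python
from typing import Tuple


def strip_v(version: str) -> str:
    """Drop one leading 'v' when a digit follows it; otherwise return the input unchanged."""
    if len(version) > 1 and version[0] == "v" and version[1].isdigit():
        return version[1:]
    return version


def release_key(version: str) -> Tuple[int, int, int]:
    """Sort key for plain MAJOR.MINOR.PATCH versions, with or without the prefix.

    Assumption: no pre-release or build metadata; those need full semver rules.
    """
    major, minor, patch = strip_v(version).split(".")
    return int(major), int(minor), int(patch)
```

With this, v4.0.1 and 4.0.1 yield the same key, so a prefixed PURL version can be matched against a database that stores the unprefixed form.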
This is done here for instance https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#golang
The Go section of the spec is in dire need of updates. The version and subpath stuff there implies that it's talking about Go packages and Go modules aren't supported. However, if the version specification were relaxed to a Git reference instead of a commit ID (truncated to an unspecified length), then the v must be included because the v is part of the tag name and is significant to Git. Commit IDs and other Git references are not versions and cannot be compared, but in Go if a tag begins with v and contains a valid version number then it is a comparable version.
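As a rough illustration of the distinction between comparable versions and other Git references, a check along these lines could be used. This is a simplified sketch, not Go's own implementation: the pattern only accepts canonical vMAJOR.MINOR.PATCH with optional pre-release and build parts, and the name is_comparable_go_version is made up for this example.

```python
import re

# Simplified canonical semver with a mandatory leading "v".
# Assumption: this approximates, but does not replicate, Go's own validation.
GO_VERSION_RE = re.compile(
    r"v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?"
)


def is_comparable_go_version(ref: str) -> bool:
    """Return True if a Git reference looks like a v-prefixed semver tag."""
    return GO_VERSION_RE.fullmatch(ref) is not None
```

Under this check v0.2.0 counts as a version, while 0.2.0, a bare commit ID such as 234fd47e07d1, or a branch name do not.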
It looks like since Go started using modules, the examples need to be updated:

pkg:golang/github.com/gorilla/context@v0.0.0-??????????????-234fd47e07d1
This example is currently invalid because the commit 234fd47e07d1004f0aed9c does not exist in the repository. The question marks are supposed to be the commit timestamp.

pkg:golang/google.golang.org/genproto/googleapis/api#annotations
googleapis/api is part of the module name and cannot be in the subpath.

pkg:golang/github.com/gorilla/context@v0.0.0-??????????????-234fd47e07d1#api
This example is invalid because the commit does not exist and neither does the specified subpath.

Proper examples with versions:

pkg:golang/v.io@v0.2.0
This example trips up implementations that incorrectly handle namespace+name. The namespace is optional because not all Go module names contain slashes.

pkg:golang/golang.org/x/net@v0.0.0-20180925071336-cf3bd585ca2a#context
This example refers to a version that predates modules. It would have previously been given as something like pkg:golang/golang.org/x/net@cf3bd585ca2a#context
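
For reference, the timestamp and commit parts of a pseudo-version like the one above can be pulled apart with something like the following. This is only a sketch: parse_pseudo_version is a hypothetical helper, the pattern is a loose approximation of the pseudo-version forms, and it assumes a 14-digit UTC timestamp followed by a 12-character lowercase hex commit prefix.

```python
import re
from datetime import datetime, timezone
from typing import Optional, Tuple

# Loose pattern: base version, optional pre-release prefix ending in ".",
# then a 14-digit timestamp and a 12-character commit prefix.
PSEUDO_RE = re.compile(r"v\d+\.\d+\.\d+-(?:.*\.)?(\d{14})-([0-9a-f]{12})")


def parse_pseudo_version(version: str) -> Optional[Tuple[datetime, str]]:
    """Return (commit time, commit prefix), or None if this is not a pseudo-version."""
    match = PSEUDO_RE.fullmatch(version)
    if match is None:
        return None
    try:
        stamp = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    except ValueError:
        # Fourteen digits that do not form a real date and time.
        return None
    return stamp.replace(tzinfo=timezone.utc), match.group(2)
```

The placeholder version with question marks in the current spec examples does not parse, which is one way a validator could flag it.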
Another confusion for Go: is it all a name, or does it have a namespace?
The spec is clear that Go packages have PURL namespaces, even if the concept does not exist in Go. What's missing is that Go packages only sometimes have PURL namespaces because not all Go package IDs contain slashes.
I guess the problem with Go (and NPM) packages is that even if your PURL implementation is correct, it's up to the application to correctly handle this namespace/name split and join translation, and users are unlikely to read the spec when they have the library to handle that for them. Maybe slashes in the names of Go packages should be forbidden to stop users from unknowingly doing the wrong thing and because of the way the names work it shouldn't be possible for slashes in the name to get some other meaning where they would need to be accepted later.
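The split and join translation that applications currently have to get right looks roughly like this. It is a sketch with hypothetical helper names (split_go_path, join_go_path), assuming the namespace is everything before the last slash and is absent when there is no slash.

```python
from typing import Optional, Tuple


def split_go_path(path: str) -> Tuple[Optional[str], str]:
    """Split a Go module or package path into (PURL namespace, PURL name)."""
    namespace, sep, name = path.rpartition("/")
    # No slash at all means there is no namespace, e.g. "v.io".
    return (namespace if sep else None), name


def join_go_path(namespace: Optional[str], name: str) -> str:
    """Inverse of split_go_path: rebuild the full Go path."""
    return f"{namespace}/{name}" if namespace else name
```

For github.com/gorilla/context this gives namespace github.com/gorilla and name context, while v.io has no namespace. Treating an empty namespace as an error is the kind of mistake the v.io example is meant to catch.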
@matt-phylum, we should fix the purl spec for go IMHO. Go was the only team with some reservations during the last IETF submission if my memory is correct.
Go would change from segments [0,n-1) of the Go package ID split by / in the PURL namespace and segment n-1 in the PURL name, to the PURL name and the Go name being equal. Ergonomics are improved because the user no longer needs to split and join.
NPM would change from the NPM namespace in the PURL namespace and the NPM name in the PURL name to the full package name in the PURL name. NPM does have namespaces, but most of the time you don't need to be aware of them and just use the full package name, and it would be possible to do the same with PURL. Ergonomics are improved because the user no longer needs to split and join.
Maven would change from the Maven group ID in the PURL namespace and the Maven artifact ID in the PURL name to "
Rewriting the spec this way shouldn't change the representation of any packages, so even though it would be a breaking API change for libraries, it wouldn't be a breaking change for the ecosystem and we wouldn't need to migrate everything to an incompatible PURL2 or deal with Go PURLs that are full of %2F escapes.
¹ Is this alone okay? For URL, the path segments are tricky. If you use a normal URL parser and ask for the full path of the URL, it needs to give you the path without fully percent decoding it in case / vs %2F is a meaningful distinction (e.g. it's a route parameter character, not a path segment delimiter). Separating the segments is supposed to happen before decoding. For PURL, as long as none of the existing package types have valid packages where the current name-without-namespace field is expected to contain a slash, and we don't expect package types to add such a requirement later, it should be safe for the library to return a single decoded name string.
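To show why the order of splitting and decoding matters, here is a small sketch using Python's urllib.parse.unquote. The two function names and the example path github.com/foo%2Fbar/baz are made up for illustration.

```python
from typing import List
from urllib.parse import unquote


def segments_then_decode(raw_path: str) -> List[str]:
    """Split on literal '/' first, then percent-decode each segment."""
    return [unquote(segment) for segment in raw_path.split("/")]


def decode_then_segments(raw_path: str) -> List[str]:
    """Percent-decode the whole path first, then split (loses the / vs %2F distinction)."""
    return unquote(raw_path).split("/")
```

For github.com/foo%2Fbar/baz the first function returns three segments with foo/bar kept as one, while the second returns four. When no segment contains an encoded slash, both give the same result, which is the condition under which returning a single decoded name string is safe.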
If we look at go packages like these https://github.com/go-jose/go-jose/archive/refs/tags/v4.0.1.zip https://pkg.go.dev/github.com/go-jose/go-jose/v3?tab=versions, they have a leading v in them. Whereas if we look in osv.dev they are stored without any leading v https://osv.dev/vulnerability/GHSA-c5q2-7r4c-mv6g.
So how should we store this as a purl?
pkg:golang/github.com/go-jose/go-jose/v4@v4.0.1
or
pkg:golang/github.com/go-jose/go-jose/v4@4.0.1