google / deps.dev

Resources for the deps.dev API
https://deps.dev
Apache License 2.0

PurlLookup does not work for Go module names containing uppercase letters #93

Open pxp928 opened 3 weeks ago

pxp928 commented 3 weeks ago

Querying for: pkg:golang/github.com/antlr/antlr4/runtime/Go/antlr@v0.0.0-20220418222510-f25a4f6275ed fails and returns nothing with the error:

{"level":"info","ts":1718224007.210796,"caller":"deps_dev/deps_dev.go:410","msg":"failed to lookup purl: pkg:golang/github.com/antlr/antlr4/runtime/Go/antlr@v0.0.0-20220418222510-f25a4f6275ed, error: rpc error: code = NotFound desc = version not found","guac-version":"v0.0.1-custom"}

I did the query via the new deps.dev/api/v3alpha and v3 and they both return the same error.

But the version does exist on the deps.dev website: https://deps.dev/go/github.com%2Fantlr%2Fantlr4%2Fruntime%2FGo%2Fantlr/v0.0.0-20220418222510-f25a4f6275ed

Is there a reason for this discrepancy?

sarnesjo commented 3 weeks ago

Hi @pxp928! Just to clarify, you said you tried this in both the v3 and v3alpha APIs, but purls are only supported in v3alpha, via the PurlLookup and PurlLookupBatch methods. I'll assume you're using one of them.

Unfortunately, the issue you report isn't one deps.dev can fix, at least not without a bit of help, as it's due to a mismatch between the purl spec and the actual behavior of the Go ecosystem. Here's what's going on:

The purl spec for the golang type states:

The namespace and name must be lowercased.

We use the canonical Go implementation, which follows this, meaning that it parses the purl in your request, pkg:golang/github.com/antlr/antlr4/runtime/Go/antlr@v0.0.0-20220418222510-f25a4f6275ed, with a namespace and name of github.com/antlr/antlr4/runtime/go and antlr. Note the lowercase g in /go, which is the reason it is not found.

The purl spec is incorrect in this regard; Go module names are case-sensitive. For example:

go get github.com/antlr/antlr4/runtime/go/antlr # does not work
go get github.com/antlr/antlr4/runtime/Go/antlr # works

This is also reflected by other tooling in the Go ecosystem, such as pkg.go.dev.

More information can be found in the documentation for the protocol used by the Go module proxy. It's also mentioned in this issue.
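As a rough illustration of why case matters to the module proxy: its protocol case-encodes module paths by replacing each uppercase letter with an exclamation mark followed by the lowercase letter, so that paths differing only in case stay distinct on case-insensitive file systems. The sketch below is a simplified stand-in written for this thread; `escapeModulePath` is a hypothetical name, and real code would more likely call `module.EscapePath` from golang.org/x/mod, which also validates the path.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeModulePath is a hypothetical, simplified version of the proxy's
// case-encoding: every ASCII uppercase letter becomes '!' followed by its
// lowercase form. It performs no validation of the module path.
func escapeModulePath(path string) string {
	var b strings.Builder
	for _, r := range path {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	// The two spellings from this issue encode to different proxy paths.
	fmt.Println(escapeModulePath("github.com/antlr/antlr4/runtime/Go/antlr"))
	fmt.Println(escapeModulePath("github.com/antlr/antlr4/runtime/go/antlr"))
}
```

The first path encodes to one containing `!go`, the second is left unchanged, so the proxy treats them as two different modules.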

So, to fix the issue you report, the purl spec and implementation both need to be updated. I've added a comment on this issue.

Meanwhile, as an option, you can instead use the GetVersion method of the deps.dev API (available in v3 and v3alpha), which takes a version key instead of a purl.
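For anyone calling the API over HTTP rather than gRPC, here is a minimal sketch of building a GetVersion request URL that preserves the module name's case. The URL pattern (`/v3/systems/{system}/packages/{name}/versions/{version}`) is an assumption based on the public deps.dev HTTP documentation, and `getVersionURL` is a hypothetical helper, so check both against the current API reference.

```go
package main

import (
	"fmt"
	"net/url"
)

// getVersionURL is a hypothetical helper. It percent-encodes each part of
// the version key as a single path segment, so slashes in the module name
// become %2F while the original letter case is kept.
// The URL layout is assumed from the deps.dev HTTP docs, not from this thread.
func getVersionURL(system, name, version string) string {
	return fmt.Sprintf("https://api.deps.dev/v3/systems/%s/packages/%s/versions/%s",
		url.PathEscape(system), url.PathEscape(name), url.PathEscape(version))
}

func main() {
	fmt.Println(getVersionURL(
		"go",
		"github.com/antlr/antlr4/runtime/Go/antlr", // capital G, as the module is actually named
		"v0.0.0-20220418222510-f25a4f6275ed",
	))
}
```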

pxp928 commented 3 weeks ago

Hello @sarnesjo! Thank you for the response.

Sorry, yes for v3 we are using the GetVersion method:

*v3.VersionKey {state: google.golang.org/protobuf/internal/impl.MessageState {NoUnkeyedLiterals: google.golang.org/protobuf/internal/pragma.NoUnkeyedLiterals {}, DoNotCompare: google.golang.org/protobuf/internal/pragma.DoNotCompare [], DoNotCopy: google.golang.org/protobuf/internal/pragma.DoNotCopy [], atomicMessageInfo: *(*"google.golang.org/protobuf/internal/impl.MessageInfo")(0x14000453950)}, sizeCache: 0, unknownFields: []uint8 len: 0, cap: 0, nil, System: System_GO (1), Name: "github.com/antlr/antlr4/runtime/go/antlr", Version: "v0.0.0-20220418222510-f25a4f6275ed"}

and we were getting a similar error: "version not found".

Retrying with a capital G in Go:

v3.VersionKey {state: google.golang.org/protobuf/internal/impl.MessageState {NoUnkeyedLiterals: google.golang.org/protobuf/internal/pragma.NoUnkeyedLiterals {}, DoNotCompare: google.golang.org/protobuf/internal/pragma.DoNotCompare [], DoNotCopy: google.golang.org/protobuf/internal/pragma.DoNotCopy [], atomicMessageInfo: *(*"google.golang.org/protobuf/internal/impl.MessageInfo")(0x14000421950)}, sizeCache: 80, unknownFields: []uint8 len: 0, cap: 0, nil, System: System_GO (1), Name: "github.com/antlr/antlr4/runtime/Go/antlr", Version: "v0.0.0-20220418222510-f25a4f6275ed"}

works properly and does not return an error.

It seems like both versions of the API are facing the same case sensitivity issue.

lumjjb commented 3 weeks ago

Yeah, it looks like it expects a different convention when querying by VersionKey than when querying by purl. What would be the equivalent purl for this key?

System: System_GO (1), Name: "github.com/antlr/antlr4/runtime/Go/antlr", Version: "v0.0.0-20220418222510-f25a4f6275ed"

I tried the following that didn't yield results :|

curl -d @- 'https://api.deps.dev/v3alpha/purlbatch' <<EOF
{
  "requests":[
    {"purl":"pkg:golang/github.com/antlr/antlr4/runtime/go/antlr@v0.0.0-20220418222510-f25a4f6275ed"}
  ]
}
EOF
{"responses":[{"request":{"purl":"pkg:golang/github.com/antlr/antlr4/runtime/go/antlr@v0.0.0-20220418222510-f25a4f6275ed"}}], "nextPageToken":""}

pxp928 commented 3 weeks ago

v3Alpha with purl fails with both: pkg:golang/github.com/antlr/antlr4/runtime/go/antlr@v0.0.0-20220418222510-f25a4f6275ed and pkg:golang/github.com/antlr/antlr4/runtime/Go/antlr@v0.0.0-20220418222510-f25a4f6275ed

v3 with GetVersion does not work with the name github.com/antlr/antlr4/runtime/go/antlr, but does work with github.com/antlr/antlr4/runtime/Go/antlr (capital G).

sarnesjo commented 3 weeks ago

Let me try to untangle the issue into the parts that relate to Go, to purl, and to deps.dev.

In Go, module names are case sensitive. github.com/antlr/antlr4/runtime/Go/antlr and github.com/antlr/antlr4/runtime/go/antlr are not equivalent. In this case, the former is the name of a module that exists, and the latter is not.

In the deps.dev API, calling the GetVersion method with github.com/antlr/antlr4/runtime/Go/antlr works (as expected) and calling it with github.com/antlr/antlr4/runtime/go/antlr returns "not found" (as expected). So far so good. Calling the PurlLookup method returns "not found" either way. That is a bug.

The cause for the bug is that the purl implementation (which we use in the deps.dev API server implementation) lowercases the name when parsing. It does that because the purl spec says it must.
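To make the failure mode concrete, here is a small sketch of the lowercasing step. It does not use the real purl library; `splitGolangPurl` is a hypothetical helper that only mimics the spec's "namespace and name must be lowercased" rule, and the `known` map stands in for the set of modules deps.dev knows about.

```go
package main

import (
	"fmt"
	"strings"
)

// splitGolangPurl is a hypothetical stand-in for a spec-conforming parser.
// It splits a pkg:golang purl into namespace, name and version, lowercasing
// the namespace and name as the purl spec for the golang type requires.
func splitGolangPurl(purl string) (string, string, string) {
	rest := strings.TrimPrefix(purl, "pkg:golang/")
	path, version, _ := strings.Cut(rest, "@")
	path = strings.ToLower(path) // this is the step that loses the capital G
	i := strings.LastIndex(path, "/")
	if i < 0 {
		return "", path, version
	}
	return path[:i], path[i+1:], version
}

func main() {
	// Stand-in for the modules that actually exist.
	known := map[string]bool{"github.com/antlr/antlr4/runtime/Go/antlr": true}

	ns, name, _ := splitGolangPurl("pkg:golang/github.com/antlr/antlr4/runtime/Go/antlr@v0.0.0-20220418222510-f25a4f6275ed")
	module := ns + "/" + name
	fmt.Println(module, known[module])
}
```

Whichever spelling the caller sends, the parsed module name ends up with a lowercase `go`, so the lookup against the case-sensitive name misses.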

So, to properly fix this issue, we need to change the purl spec and then change the implementation.

jkowalleck commented 1 week ago

see https://github.com/package-url/purl-spec/issues/308

prabhu commented 1 week ago

I would propose exploring the use of qualifiers. Example:

pkg:golang/github.com/antlr/antlr4/runtime/go/antlr?module_name=github.com%2Fantlr%2Fantlr4%2Fruntime%2FGo%2Fantlr

Here module_name is URI encoded to contain the original module name for lookup purposes. This approach is already common in some environments. Example: enterprises that use a private registry with patches use a qualifier such as repository_url to distinguish purls. Some projects like depscan use the distro_name, distro_version qualifiers to tune false positives.
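A rough sketch of how a consumer might honor such a qualifier is shown below. Note that `module_name` is only the qualifier proposed in this comment, not something defined by the purl spec, and `lookupName` is a hypothetical helper that ignores subpaths and other purl details.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// lookupName is a hypothetical helper. It returns the module name to use for
// lookup: the decoded module_name qualifier if present, otherwise the
// (lowercased) path taken from the purl itself.
func lookupName(purl string) (string, error) {
	rest := strings.TrimPrefix(purl, "pkg:golang/")
	path, query, _ := strings.Cut(rest, "?")
	path, _, _ = strings.Cut(path, "@") // drop the version, if any
	qualifiers, err := url.ParseQuery(query)
	if err != nil {
		return "", err
	}
	if m := qualifiers.Get("module_name"); m != "" {
		return m, nil // percent-decoding is done by ParseQuery
	}
	return path, nil
}

func main() {
	name, err := lookupName("pkg:golang/github.com/antlr/antlr4/runtime/go/antlr?module_name=github.com%2Fantlr%2Fantlr4%2Fruntime%2FGo%2Fantlr")
	if err != nil {
		panic(err)
	}
	fmt.Println(name)
}
```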