connectrpc / connect-go

The Go implementation of Connect: Protobuf RPC that works.
https://connectrpc.com
Apache License 2.0
2.76k stars 91 forks source link

Missing Go package metadata on connectrpc.com #687

Closed GeertJohan closed 4 months ago

GeertJohan commented 4 months ago

Describe the bug

The new import paths based on connectrpc.com cause errors when using with standard go tools. It seems that the website returns a 404 for valid import paths and is missing the <meta name="go-import" ...> information in it's body.

To Reproduce

$ go list -m -versions -json connectrpc.com/connect/cmd/protoc-gen-connect-go
go: unrecognized import path "connectrpc.com/connect/cmd/protoc-gen-connect-go": reading https://connectrpc.com/connect/cmd/protoc-gen-connect-go?go-get=1: 404 Not Found

It would be nice if the website could return the metadata and/or redirect to https://pkg.go.dev/connectrpc.com/connect/cmd/protoc-gen-connect-go (I think that's how google.golang.org handles it, by just 302 redirecting to pkg.go.dev).

Please see https://pkg.go.dev/cmd/go (ctrl+f for ?go-get=1).

Additional context

I'm using mise to install tools. It relies on go list -m -versions -json to obtain module/package information. This works great for domains like github.com and google.golang.org, but fails for connectrpc.com.

jhump commented 4 months ago

@GeertJohan, the server does in fact return the correct metadata. But it does not do so for every package in the module. It is only necessary to provide that metadata for the root of the module.

The problem you're hitting is that you are using the -m and -versions arguments, which are for modules, not packages. But the import path you indicated connectrpc.com/connect/cmd/protoc-gen-connect-go is not a module. It is a package inside the connectrpc.com/connect module.

So the following do work:

go get connectrpc.com/connect/cmd/protoc-gen-connect-go
go install connectrpc.com/connect/cmd/protoc-gen-connect-go
go list -json connectrpc.com/connect/cmd/protoc-gen-connect-go
go list connectrpc.com/connect/...

And if you want to see the versions of protoc-gen-connect-go, you have to get the version metadata for the enclosing module:

go list -m -versions -json connectrpc.com/connect
GeertJohan commented 4 months ago

Hi @jhump thanks for your fast response!

I understand that the go list is supposed to be called with a module. It is impossible to detect by just the import path string whether it is a module root or a package. The examples you provided don't error but also don't give any hints to where the module root is located. So mise iterates over the path by removing the last part until a module is detected via go list -m ... For other domains this doesn't error, go list returns with exit code 0, but does not provide a list of versions (since it was called with an import path that was not a module). This works fine for github and other domains, for example go list -m -versions -json github.com/go-task/task/v3/cmd/task does not return an error. I'm not entirely sure how that even works though, I just discovered github also serves a 404 without go-import metadata on http://github.com/go-task/task/v3/cmd/task?go-get=1, so I'm a little confused now.. Perhaps they're detecting the User-Agent and serve a different result based on that? Any ideas?

jhump commented 4 months ago

I'm not entirely sure how that even works though, I just discovered github also serves a 404 without go-import metadata ...

@GeertJohan, the go tool only queries for an HTML page for domains it does not know. And github.com is a domain it knows: https://codeengineered.com/blog/2016/go-exposed-remote-import-paths/

The examples you provided don't error but also don't give any hints to where the module root is located.

This is not true. The module is part of the package information returned by go list. So the go list -json connectrpc.com/connect/cmd/protoc-gen-connect-go command does in fact tell you the module, along with all other metadata for the package.

So, in particular, I guess you want to first do this:

go list -f '{{.Module.Path}}' connectrpc.com/connect/cmd/protoc-gen-connect-go

This will report the module, which you can then turn around and use with go list -m -versions.

GeertJohan commented 4 months ago

I think the downside using that command is that it can't be called without having an active go module and needing to already have go get the package. It's a lot cheaper to traverse the path until eventually hitting the module.

jhump commented 4 months ago

I suspect go get does exactly as you describe already: it traverses the path until it finds the module. So it probably tries the given path, and, if it cannot be resolved, it peels off a path element from the right and tries again, and repeats until it finds a valid module.

You could always try that, too: run the go list command, and, if it fails, remove the right-most path entry and try again, until it resolves or there are no more path elements. In fact, here's a script that does that. You can accomplish what you're looking for with go-list.sh connectrpc.com/connect/cmd/protoc-gen-go:

#!/bin/bash

set -euo pipefail
IFS=$'\n\t'

if [[ $# != 1 ]]; then
  echo "usage:" >&2
  echo "   $(basename $0) package-path" >&2
  exit 1
fi

name="$1"
while true; do
  if go list -m -versions -json -- "${name}" 2>/dev/null; then
    exit 0
  fi
  dir="${name%/*}"
  if [[ "$dir" == "$name" ]]; then
    # no more path elements
    break
  fi
  name="$dir"
done

# one last time with full path to show errors
go list -m -versions -json -- "${1}"
akshayjshah commented 4 months ago

I suspect go get does exactly as you describe already: it traverses the path until it finds the module. So it probably tries the given path, and, if it cannot be resolved, it peels off a path element from the right and tries again, and repeats until it finds a valid module.

I believe that's true, and that it's new behavior since the introduction of Go modules (which I guess are pretty old now).

GeertJohan commented 4 months ago

What go-list.sh does is basically what mise does right now, except that go-list.sh ignores the errors from go list calls. It works, and is a way to work around the inconsistency. That could be duplicated in mise, although in general I think errors shouldn't be ignored, especially not when they're inconsistent. It would probably be good for the Go project to be more explicit on the design/implementation-specs of the metadata for packages. The inconsistency now makes that go list -m -versions -json helm.sh/helm/v3/cmd/helm responds different than go list -m -versions -json connectrpc.com/connect/cmd/protoc-gen-go.

Also it would be useful for go list -m -json to output the module path.

I'll head over to the go repo to write an issue. Any other thoughts/ideas I should take?

It would be nice if connectrpc.com could add the metadata on the package page as well, although I also agree it's not very well specified and may be better to wait for specs before making the effort. Thanks @jhump !

akshayjshah commented 4 months ago

For the issue on the Go project, I'd suggest that go list -m use the same logic as go get and go install to truncate paths until it finds the containing module.

Unfortunately, we're not going to be able to serve module metadata on per-package pages (apart from the module root) without significantly complicating connectrpc.com. We'll consider it if it impacts a significant fraction of our user base, but so far you're the first person reporting a problem. To me, it's also reasonable for mise to fix this regardless of the Go team's response - presumably the author wants mise to work whenever go install would.

jhump commented 4 months ago

I'll head over to the go repo to write an issue. Any other thoughts/ideas I should take?

I think the issue here is that go list -m does not behave consistently with go install and go get. Ideally, go list -m would be able to resolve the given name in the same way.

It's worth pointing out that go list -m in general does not support package names, only module names. For example:

> go list -json -m golang.org/x/net/http2
go: module golang.org/x/net/http2: not a known dependency

It prints this even when golang.org/x/net is an explicitly required dep in my go.mod because it wants a module name, not a package name.

However, go list -m does happen to work with a package name, but only if the -versions flag is supplied:

> go list -json -m -versions golang.org/x/net/http2
{
    "Path": "golang.org/x/net/http2",
    "Origin": {
        "VCS": "git",
        "URL": "https://go.googlesource.com/net",
        "Subdir": "http2",
        "TagPrefix": "http2/",
        "TagSum": "t1:47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU="
    }
}

This looks like it's trying provide information about a module. In particular, does not tell you the actual module that contains the given package name. While it does have a Subdir property, that is relative to the VCS root, not to some module root.

So I think it's possible that the Go team will say that it's working as intended and that you should only use module names with go list -m.