compiler-explorer / compiler-explorer

Run compilers interactively from your web browser and interact with the assembly
https://godbolt.org/
BSD 2-Clause "Simplified" License

[REQUEST]: Add ability to use external libraries for Golang #5242

Open prateek opened 11 months ago

prateek commented 11 months ago

Is your feature request related to a problem? Please describe

My use-case: When using Golang in Compiler Explorer, I often find myself having to "simplify" snippets of external libraries so that I can plug them in as input. I can't reference imports outside Golang's standard library.

Describe the solution you'd like

Prior art

Describe alternatives you've considered

Today, if I run into issues where I want to check something "small enough" in an external library, I'll do the surgery to rip out the bits I need. If I anticipate it's going to take longer than 5-10 minutes to do so, I resort to not using Compiler Explorer for that case.

Additional context

I asked this question in the #compiler_explorer slack channel on cpplang, and was given a few notes for how external library support was handled for Rust. References:

prateek commented 11 months ago

I'd be happy to find some cycles to prototype and do the work to make this happen if I could get pointers to the right places to look.

partouf commented 11 months ago

I'd be happy to find some cycles to prototype and do the work to make this happen if I could get pointers to the right places to look.

Any bash script that has 2 separate steps for 1. to build the library, and 2. to link to the library - will do just fine, you don't have to make it work with CE, we can handle that.

prateek commented 10 months ago

Typically, you don't link or create an object file for a library as a separate step in the Go toolchain. Here's a simplified example of how one might go about using an external lib in Go (the following commands will work assuming you have a go binary available in PATH).

#!/usr/bin/env bash

set -eo pipefail

tmpdir=$(mktemp -d)
cd "$tmpdir"
cat >main.go <<EOF
// use a library github.com/google/uuid in my Go program
package main 
import "github.com/google/uuid" 
func main()  {
  println(uuid.New().String())
}
EOF

go mod init compiler-explorer/testmain
go mod tidy 
go run main.go 

# or if you want to see the assembly `go build -o main main.go && go tool objdump -S main`

The go mod tidy command tells the Go toolchain to download the dependency from the appropriate URL (it tries the local Go module cache first and, if the module isn't there, downloads it from the package URL; you can also make it respect proxies and so on).

partouf commented 10 months ago

We do not provide any internet connection, but we can go the cache route if we can change the directory from which the modules come from.

But we also need to maintain some kind of versioning, and the versions can then be different for every compilation, so it's going to be complicated.

Any links to where we can find details about the caching? I don't think proxying is a good route, as that also involves internet. If it turns out we can't customize caching, then we can maybe figure something out.

prateek commented 10 months ago

Hmm - if you wanna go the offline route. How do you imagine new libraries/upgrades will be handled?

Re: caching details - Best I can figure out from the docs, go looks for packages in $GOPATH/pkg/mod first; if it misses, it tries to download them. Hooking into this via the GOPROXY env var seems straightforward - a quick search reveals a directory-backed implementation available here.
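To make the directory-backed idea a bit more concrete, here is a minimal sketch. It relies on the go command accepting file:// URLs in GOPROXY and on the cache/download subdirectory of a module cache using the same layout as a module proxy. The helper name and the example path are made up for illustration, not anything Compiler Explorer uses today.

```shell
#!/usr/bin/env bash
# Sketch only: derive a file:// GOPROXY URL from a module cache directory.
# The cache/download subdirectory of a module cache has the same layout as
# a module proxy, so the go command can read from it through a file:// URL.

# file_proxy_url <modcache-dir>   (hypothetical helper name)
file_proxy_url() {
  local modcache="${1%/}"   # drop one trailing slash, if present
  printf 'file://%s/cache/download\n' "$modcache"
}

# Example (needs a Go toolchain and a populated cache, so left commented out):
#   GOPROXY="$(file_proxy_url "$(go env GOMODCACHE)")" go build ./...
```

With an absolute path such as /opt/ce/gomod (hypothetical), this yields file:///opt/ce/gomod/cache/download.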

Re: versioning - yeah that's a fair concern. The upstream Go playground assumes @latest for version. They had considered exposing some magic comment syntax next to imports to allow overrides, but to my knowledge they haven't done so yet.

Clement-Jean commented 1 month ago

AFAIK, go build constructs a compilation graph. The main component of this graph is the importcfg, which is a file containing paths to .a files.

So if you have the following:

package main

import "fmt"

func main() {
  fmt.Println("test")
}

what is happening under the hood is roughly the command: go tool compile -importcfg importcfg main.go, where importcfg points to _pkg_.a files, i.e. archives of the standard library packages.

Doesn't this mean we could create archives for the libraries, add them all to a global importcfg, and compile against this file?

This would mean that we have a script like this (example for protobuf):

pushd ./third_party/protobuf > /dev/null

# Collect the import path and source directory of every package in the module.
pkgs=($(go list -f '{{.ImportPath}}' ./...))
dirs=($(go list -f '{{.Dir}}' ./...))

for idx in "${!pkgs[@]}"
do
  cd "${dirs[$idx]}"

  # Emit an importcfg entry only if the archive was built successfully.
  if go build -buildmode archive -o _pkg_.a .; then
    echo "packagefile ${pkgs[$idx]}=${dirs[$idx]}/_pkg_.a"
  fi
done

popd > /dev/null

This creates a _pkg_.a file for each subpackage, and we could redirect the output to the global importcfg (./build_pkg.sh >> importcfg).
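For what it's worth, consuming such an importcfg could look roughly like the sketch below. It assumes the importcfg lists every package the program needs, which as far as I understand includes transitive dependencies and standard library packages. The function names and the DRY_RUN switch are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: compile and link one file against a pre-generated importcfg.

# maybe_run: execute a command, or only print it when DRY_RUN is set
# (hypothetical helper, handy for inspecting the commands without a toolchain).
maybe_run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# compile_and_link <importcfg> <source.go> <output-binary>
compile_and_link() {
  local importcfg="$1" src="$2" out="$3"
  local obj="${out}.o"
  # -p main names the package being compiled;
  # -importcfg tells the tool where the .a file for each import path lives.
  maybe_run go tool compile -importcfg "$importcfg" -p main -o "$obj" "$src" &&
    maybe_run go tool link -importcfg "$importcfg" -o "$out" "$obj"
}

# Example: DRY_RUN=1 compile_and_link importcfg main.go main
```

Running it with DRY_RUN=1 only prints the two commands, which makes it easy to compare them with what go build -x reports.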

stapelberg commented 4 weeks ago

Hello from the Go team 👋. I'd be happy to help route detail questions to the right people to move this discussion forward.

The suggested approach of creating an importcfg file and thereby splitting the build into two phases (build .a files, link) should work fine.

However, I wonder whether such complexity and low-level integration are really needed.

An easier solution might be to save/restore a populated module cache directory (location depends on GOPATH, ~/go/pkg/mod by default). The actions/setup-go GitHub Actions step also arranges for the module cache to persist between runs, which means many CI runs don't need to contact the internet at all.

Note that you don't need to run a custom proxy to serve the cache: if the cache is present, it will be used.

You can use go mod download to pre-populate the cache. Here's an example transcript:

# Configure a new, empty location for the cache (part of GOPATH):
~/compiler-explorer-poc/example % export GOPATH=$HOME/compiler-explorer-poc/gp

# Demonstrate that the example program (https://pkg.go.dev/google.golang.org/protobuf/proto#example-Marshal) does not build without internet access:
~/compiler-explorer-poc/example % GOPROXY=off go build
go: downloading google.golang.org/protobuf v1.34.1
hello.go:6:2: module lookup disabled by GOPROXY=off
hello.go:7:2: module lookup disabled by GOPROXY=off

# Completely independently of the example program, load Protobuf into the cache:
~/compiler-explorer-poc/example % cd ..
~/compiler-explorer-poc % go mod download google.golang.org/protobuf@latest

# Show that the example program now builds (without any downloads):
~/compiler-explorer-poc % cd example 
~/compiler-explorer-poc/example % GOPROXY=off go build
~/compiler-explorer-poc/example % 

BTW, if you're curious, you can use go build -x to see all the commands that the go tool runs.
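The pre-population step from the transcript could be wrapped for a fixed list of modules along these lines. This is only a sketch: the function names are hypothetical, and the module list in the example is just the two libraries mentioned in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: pre-populate a module cache for a fixed list of modules.

# module_spec <module-path> [version]  ->  "path@version" (defaults to latest)
module_spec() {
  printf '%s@%s\n' "$1" "${2:-latest}"
}

# prefetch_modules <gopath> <spec>...
# Runs `go mod download` once per spec with GOPATH pointing at the shared cache.
prefetch_modules() {
  local gopath="$1"; shift
  local spec
  for spec in "$@"; do
    GOPATH="$gopath" go mod download "$spec" || return 1
  done
}

# Example (needs network access and a Go toolchain, so left commented out):
#   prefetch_modules "$HOME/compiler-explorer-poc/gp" \
#     "$(module_spec google.golang.org/protobuf v1.34.1)" \
#     "$(module_spec github.com/google/uuid)"
```

Passing an explicit version instead of relying on @latest would be one way to keep the cached set reproducible.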

partouf commented 4 weeks ago

You can use go mod download to pre-populate the cache. Here's an example transcript:

# Configure a new, empty location for the cache (part of GOPATH):
~/compiler-explorer-poc/example % export GOPATH=$HOME/compiler-explorer-poc/gp

# Completely independently of the example program, load Protobuf into the cache:
~/compiler-explorer-poc/example % cd ..
~/compiler-explorer-poc % go mod download google.golang.org/protobuf@latest

This looks like a good solution.

How does this handle updates to @latest? Will a repeat of go mod download ... update the cached version?

stapelberg commented 4 weeks ago

How does this handle updates to @latest? Will a repeat of go mod download ... update the cached version?

Exactly, yes.

Only mentioning this for completeness: By default, the @latest suffix is resolved by the Go module proxy, which might be a few minutes behind. If you need a guaranteed-fresh resolution, you can set GOPROXY=direct to make the go tool contact the upstream source repository directly. I am not recommending this; I would recommend sticking to the Go proxy for reliability.

Clement-Jean commented 4 weeks ago

@partouf If you can provide a little bit of guidance, I'd be happy to implement that.

partouf commented 4 weeks ago

@partouf If you can provide a little bit of guidance, I'd be happy to implement that.

Would first start with:

Something like that

Oh and: Feel free to open a draft PR early, we can discuss further from there

partouf commented 4 weeks ago

@stapelberg Does the library code get built when you invoke the build command for the user's code, and if so, where does it write the binaries? And if that happens in the GOPATH, can we avoid that?

stapelberg commented 4 weeks ago

@stapelberg Does the library code get built when you invoke the build command for the user's code, and if so, where does it write the binaries? And if that happens in the GOPATH, can we avoid that?

When you build a program (package main), then:

When you build a library (any other package than main), then:

You can confirm this by running go build -x (or go install -x), which prints all the commands, e.g. for go build -x in protobuf/encoding/protojson:

% go build -x
[…]
/usr/lib/go-1.21/pkg/tool/linux_amd64/compile -o $WORK/b001/_pkg_.a -trimpath "$WORK/b001=>" -p google.golang.org/protobuf/encoding/protojson -lang=go1.20 -complete -buildid SHSRu3r5CQ-_gIh6QmRC/SHSRu3r5CQ-_gIh6QmRC -goversion go1.21.9 -c=4 -nolocalimports -importcfg $WORK/b001/importcfg -pack ./decode.go ./doc.go ./encode.go ./well_known_types.go
/usr/lib/go-1.21/pkg/tool/linux_amd64/buildid -w $WORK/b001/_pkg_.a # internal
cp $WORK/b001/_pkg_.a /home/stapelberg/.cache/go-build/7c/7cefafac275aa8367138fd8eccad87f0d41d96353c0a1b44834ec85109244595-d # internal
% 

GOPATH is not modified when building Go code.

partouf commented 4 weeks ago

@Clement-Jean

Ok so see above; I think the GOCACHE offers a possibility of pre-building the library for a specific go compiler as long as we set it to something like /opt/compiler-explorer/libs/gocache/<compilerid> and then execute go build from the libraries path?

Might be worth an experiment when you're implementing the download
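As a starting point for that experiment, a sketch of the per-compiler GOCACHE idea. The root path is the one suggested above; the CE_GOCACHE_ROOT variable, the function names, and the compiler id in the example are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: warm a per-compiler Go build cache for one library.

# Root taken from the path suggested in the thread; CE_GOCACHE_ROOT is a
# hypothetical override, not an existing Compiler Explorer setting.
CE_GOCACHE_ROOT="${CE_GOCACHE_ROOT:-/opt/compiler-explorer/libs/gocache}"

# gocache_dir <compiler-id>  ->  build cache directory for that compiler
gocache_dir() {
  printf '%s/%s\n' "$CE_GOCACHE_ROOT" "$1"
}

# prebuild_library <go-binary> <compiler-id> <library-dir>
# Builds every package under <library-dir> so the results land in GOCACHE.
prebuild_library() {
  local gobin="$1" id="$2" libdir="$3"
  # Subshell so the cd does not leak into the caller.
  ( cd "$libdir" && GOCACHE="$(gocache_dir "$id")" "$gobin" build ./... )
}

# Example (compiler id and paths are made up):
#   prebuild_library /opt/compiler-explorer/go/bin/go go121 /path/to/library
```

Whether a later go build of user code actually reuses these cache entries is exactly what the experiment would need to confirm.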

Clement-Jean commented 4 weeks ago

Working on it here

Clement-Jean commented 4 weeks ago

@stapelberg So I tried your example manually and it works perfectly fine. However, I tried it with go tool compile (the command compiler explorer uses) and it doesn't work. That's why I mentioned the importcfg previously.

I know that the Go team recommends using build, but I guess the use of compile is to avoid having modules? We should probably think about this before implementing it. Do we change to the build command or do we create an importcfg?

stapelberg commented 4 weeks ago

Can you tell me more than "it doesn't work" please? How can I reproduce the problem you're running into?

Clement-Jean commented 4 weeks ago

I have the following:

~/compiler-explorer-poc % tree -L 2 .
.
├── example
│   ├── go.mod
│   └── main.go
└── go
    ├── CONTRIBUTING.md
    ├── LICENSE
    ├── PATENTS
    ├── README.md
    ├── SECURITY.md
    ├── VERSION
    ├── api
    ├── bin
    ├── codereview.cfg
    ├── doc
    ├── go.env
    ├── lib
    ├── misc
    ├── pkg
    ├── src
    └── test

# Configure a new, empty location for the cache (part of GOPATH):
~/compiler-explorer-poc/example % export GOPATH=$HOME/compiler-explorer-poc/go

# https://pkg.go.dev/google.golang.org/protobuf/proto#example-Marshal
~/compiler-explorer-poc/example % go tool compile main.go
main.go:4:2: could not import fmt (file not found)
main.go:6:2: could not import google.golang.org/protobuf/proto (file not found)
main.go:7:2: could not import google.golang.org/protobuf/types/known/durationpb (file not found)

In the past, I had this could not import fmt error. To solve that I tried to rebuild the go compiler with GODEBUG=installgoroot=all ./make.bash.

stapelberg commented 4 weeks ago

Thanks for the details. Yes, the compile program needs an importcfg to know where the object files (.a) are.

I'll return the question to compiler-explorer folks: (Why?) is it required to call compile directly? Could we use go build instead?

partouf commented 4 weeks ago

Thanks for the details. Yes, the compile program needs an importcfg to know where the object files (.a) are.

I'll return the question to compiler-explorer folks: (Why?) is it required to call compile directly? Could we use go build instead?

We're only using go tool compile for really old compilers, so I think it's safe to ignore.

For the new ones we use build: https://github.com/compiler-explorer/compiler-explorer/blob/538eba393068295eec6eb18fa233a52f61275134/lib/compilers/golang.ts#L247