Open prateek opened 11 months ago
I'd be happy to find some cycles to prototype and do the work to make this happen if I could get pointers to the right places to look.
Any bash script that has 2 separate steps, (1) building the library and (2) linking to the library, will do just fine. You don't have to make it work with CE; we can handle that.
Typically, you don't link/create an object file for a library as a separate step in the Go toolchain. Here's a simplified example of how one might go about using an external lib in Go (the following commands will work assuming you have a go binary available in PATH).
#!/usr/bin/env bash
set -eo pipefail
tmpdir=$(mktemp -d)
cd "$tmpdir"
cat >main.go <<EOF
// use a library github.com/google/uuid in my Go program
package main
import "github.com/google/uuid"
func main() {
	println(uuid.New().String())
}
EOF
go mod init compiler-explorer/testmain
go mod tidy
go run main.go
# or if you want to see the assembly `go build -o main main.go && go tool objdump -S main`
The go mod tidy command tells the Go toolchain to download the dependency from the appropriate URL (it first looks in the local Go module cache and, if the module isn't there, downloads it from the package URL; you can also make it respect proxies and so on).
We do not provide any internet connection, but we can go the cache route if we can change the directory the modules come from.
But we also need to maintain some kind of versioning, which can then be different for every compilation, so it's going to be complicated.
Any links to where we can find details about the caching? I don't think proxying is a good route, as that also involves internet access; if we can't customize caching, we can maybe figure something out.
Hmm, if you want to go the offline route, how do you imagine new libraries/upgrades will be handled?
Re: caching details - Best I can figure out from the docs, go looks for packages in $GOPATH/pkg/mod first; if it misses, it tries to download them. Hooking into this via the GOPROXY env var seems straightforward - a quick search reveals a directory-backed implementation available here.
Re: versioning - yeah that's a fair concern. The upstream Go playground assumes @latest for version. They had considered exposing some magic comment syntax next to imports to allow overrides, but to my knowledge they haven't done so yet.
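The cache lookup described in this comment can be sketched in Go. This is a minimal sketch, not what the go tool itself runs: it assumes the documented defaults on a Unix-like system (GOMODCACHE if set, else the first GOPATH entry plus pkg/mod, else $HOME/go/pkg/mod) and the module cache's convention of storing uppercase letters as `!` plus the lowercase letter. The function names and the `v1.6.0` version are illustrative.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// modCacheDir returns the module cache directory: GOMODCACHE if set,
// otherwise <first GOPATH entry>/pkg/mod, with GOPATH defaulting to $HOME/go.
func modCacheDir(getenv func(string) string) string {
	if dir := getenv("GOMODCACHE"); dir != "" {
		return dir
	}
	gopath := getenv("GOPATH")
	if gopath == "" {
		gopath = filepath.Join(getenv("HOME"), "go")
	}
	first := filepath.SplitList(gopath)[0]
	return filepath.Join(first, "pkg", "mod")
}

// escape applies the module cache's case escaping: "Azure" becomes "!azure".
func escape(s string) string {
	var b strings.Builder
	for _, r := range s {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			r += 'a' - 'A'
		}
		b.WriteRune(r)
	}
	return b.String()
}

// extractedDir is where the source tree of modPath@version lives in the cache.
func extractedDir(cache, modPath, version string) string {
	return filepath.Join(cache, filepath.FromSlash(escape(modPath))+"@"+escape(version))
}

func main() {
	cache := modCacheDir(os.Getenv)
	// v1.6.0 is only an example version.
	dir := extractedDir(cache, "github.com/google/uuid", "v1.6.0")
	_, err := os.Stat(dir)
	fmt.Println(dir, "cached:", err == nil)
}
```

A pre-flight check like this only tells you whether the source tree is present; the go tool remains the authority on whether a build can proceed offline.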
AFAIK, go builds a compilation graph. The main component of this graph is the importcfg, which is a file containing paths to .a files.
So if you have the following:
package main
import "fmt"
func main() {
	fmt.Println("test")
}
what is happening under the hood is the command: go tool compile -importcfg _pkg_.a main.go, where _pkg_.a is an archive of the standard library.
Doesn't this mean we could create archives for the libraries, add them all to a global importcfg, and compile against this file?
This would mean that we have a script like this (example for protobuf):
pushd ./third_party/protobuf > /dev/null
pkgs=($(go list -f '{{.ImportPath}}' ./...))
dirs=($(go list -f '{{.Dir}}' ./...))
for idx in "${!pkgs[@]}"
do
  # Build each package into an archive next to its sources;
  # on success, print the matching importcfg line.
  if (cd "${dirs[$idx]}" && go build -buildmode=archive -o _pkg_.a .); then
    echo "packagefile ${pkgs[$idx]}=${dirs[$idx]}/_pkg_.a"
  fi
done
popd > /dev/null
that creates _pkg_.a files for each subpackage, and we could redirect its output to the global importcfg (./build_pkg.sh >> importcfg).
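For reference, the importcfg lines that the script echoes follow the `packagefile <import path>=<archive file>` format. Below is a minimal Go sketch that renders such a file from a map. The function name and the archive paths are hypothetical, and the output is sorted only to keep it reproducible.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// importcfg renders one "packagefile <import path>=<archive>" line per
// package, sorted by import path.
func importcfg(archives map[string]string) string {
	paths := make([]string, 0, len(archives))
	for p := range archives {
		paths = append(paths, p)
	}
	sort.Strings(paths)

	var b strings.Builder
	for _, p := range paths {
		fmt.Fprintf(&b, "packagefile %s=%s\n", p, archives[p])
	}
	return b.String()
}

func main() {
	// Hypothetical archive locations, for illustration only.
	fmt.Print(importcfg(map[string]string{
		"google.golang.org/protobuf/proto": "/libs/protobuf/proto/_pkg_.a",
		"fmt":                              "/libs/std/fmt.a",
	}))
}
```

A real importcfg must also list every transitive dependency, including the standard library packages, which is part of what makes this route more involved than reusing the module cache.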
Hello from the Go team! I'd be happy to help route detailed questions to the right people to move this discussion forward.
The suggested approach of creating an importcfg file and thereby splitting the build into two phases (build .a files, link) should work fine.
However, I wonder whether such complexity / low-level integration is really needed.
An easier solution might be to save/restore a populated module cache directory (the location depends on GOPATH; ~/go/pkg/mod by default). The actions/setup-go GitHub Actions step also arranges for the module cache to persist between runs, which means many CI runs don't need to contact the internet at all.
Note that you don't need to run a custom proxy to serve the cache: if the cache is present, it will be used.
You can use go mod download to pre-populate the cache. Here's an example transcript:
# Configure a new, empty location for the cache (part of GOPATH):
~/compiler-explorer-poc/example % export GOPATH=$HOME/compiler-explorer-poc/gp
# Demonstrate that the example program (https://pkg.go.dev/google.golang.org/protobuf/proto#example-Marshal) does not build without internet access:
~/compiler-explorer-poc/example % GOPROXY=off go build
go: downloading google.golang.org/protobuf v1.34.1
hello.go:6:2: module lookup disabled by GOPROXY=off
hello.go:7:2: module lookup disabled by GOPROXY=off
# Completely independently of the example program, load Protobuf into the cache:
~/compiler-explorer-poc/example % cd ..
~/compiler-explorer-poc % go mod download google.golang.org/protobuf@latest
# Show that the example program now builds (without any downloads):
~/compiler-explorer-poc % cd example
~/compiler-explorer-poc/example % GOPROXY=off go build
~/compiler-explorer-poc/example %
BTW, if you're curious, you can use go build -x to see all the commands that the go tool runs.
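The offline build step from the transcript could be driven from Go roughly as follows. This is a sketch under assumptions: the helper name and paths are hypothetical, and it only constructs the command (printing it rather than running it) so it works without a populated cache.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// offlineBuild returns a `go build` command for the program in dir that may
// only use modules already present under gopath/pkg/mod. GOPROXY=off makes
// the go tool fail instead of attempting a download.
func offlineBuild(gopath, dir string) *exec.Cmd {
	cmd := exec.Command("go", "build")
	cmd.Dir = dir
	// When a key appears twice in cmd.Env, the last value wins, so these
	// entries override whatever the parent environment had.
	cmd.Env = append(os.Environ(), "GOPATH="+gopath, "GOPROXY=off")
	return cmd
}

func main() {
	// Hypothetical paths, modeled on the transcript.
	cmd := offlineBuild("/tmp/compiler-explorer-poc/gp", "/tmp/compiler-explorer-poc/example")
	fmt.Println(cmd.Dir, cmd.Args, cmd.Env[len(cmd.Env)-2:])
	// Call cmd.CombinedOutput() to actually run the build.
}
```

If GOMODCACHE is set in the parent environment, it takes precedence over GOPATH for locating the cache, so a real integration would probably want to set or clear it explicitly.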
This looks like a good solution.
How does this handle updates to @latest? Will a repeat of go mod download ... update the cached version?
Exactly, yes.
Only mentioning this for completeness: by default, the @latest suffix is resolved by the Go module proxy, which might be a few minutes behind. If you need a guaranteed-fresh resolution, you can set GOPROXY=direct to make the go tool contact the upstream source repository directly. I am not recommending this; I would recommend sticking to the Go proxy for reliability.
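Since @latest moves over time, one way to address the versioning concern raised earlier is to record the exact version it resolved to. The sketch below assumes the JSON comes from `go list -m -json <module>@latest` (which reports `Path` and `Version` fields) and turns it into a pinned `path@version` query. The function name is hypothetical and the sample JSON is illustrative, using the version shown in the transcript.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// moduleInfo holds the two fields of `go list -m -json` output used here.
type moduleInfo struct {
	Path    string
	Version string
}

// pinnedQuery parses `go list -m -json` output and returns an exact
// "path@version" string suitable for `go mod download`.
func pinnedQuery(listJSON []byte) (string, error) {
	var m moduleInfo
	if err := json.Unmarshal(listJSON, &m); err != nil {
		return "", err
	}
	if m.Path == "" || m.Version == "" {
		return "", fmt.Errorf("missing Path or Version in %q", listJSON)
	}
	return m.Path + "@" + m.Version, nil
}

func main() {
	// Illustrative sample, not captured output.
	sample := []byte(`{"Path":"google.golang.org/protobuf","Version":"v1.34.1"}`)
	q, err := pinnedQuery(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("go mod download " + q)
}
```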
@partouf If you can provide a little bit of guidance, I'd be happy to implement that.
Would first start with:
- go top level and seeing what minimal info you need for a go library, probably name, target version and type: golib
- golib here https://github.com/compiler-explorer/infra/blob/ea2996724b485453232dd68d502f0c8819cda8d8/bin/lib/installation.py#L96
- def makebuildfor(...) https://github.com/compiler-explorer/infra/blob/ea2996724b485453232dd68d502f0c8819cda8d8/bin/lib/rust_library_builder.py#L438
- 'go', ['mod', 'download', 'google.golang.org/' + self.libname + '@' + self.target_name] but with the environment variable GOPATH set to something like /opt/compiler-explorer/libs/golibs

Something like that
Oh and: Feel free to open a draft PR early, we can discuss further from there
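To make the download step concrete, here is the same invocation sketched in Go (the infra repository itself is Python, so this only illustrates the command line and environment, not the actual builder). The helper name is hypothetical, the GOPATH value is the one suggested above, and the module version is taken from the earlier transcript.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// downloadCmd returns `go mod download <module>@<version>` with GOPATH
// pointed at a shared library root, so the module lands in <gopath>/pkg/mod.
func downloadCmd(gopath, module, version string) *exec.Cmd {
	cmd := exec.Command("go", "mod", "download", module+"@"+version)
	// The appended GOPATH overrides any value inherited from the parent.
	cmd.Env = append(os.Environ(), "GOPATH="+gopath)
	return cmd
}

func main() {
	cmd := downloadCmd("/opt/compiler-explorer/libs/golibs", "google.golang.org/protobuf", "v1.34.1")
	fmt.Println(cmd.Args, cmd.Env[len(cmd.Env)-1])
	// Call cmd.CombinedOutput() to perform the download (needs network access).
}
```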
@stapelberg Does the library code get built when you invoke the build command for the user's code, and if so, where does it write the binaries? And if that happens in the GOPATH, can we avoid that?
When you build a program (package main), then:
- go build defaults to producing executable files in the current working directory (can be changed with -o)
- go install builds and installs executable files to $GOPATH/bin, i.e. ~/go/bin by default

When you build a library (any other package than main), then:
- go build and go install will compile the package into a .a file and place that .a file in $GOCACHE, i.e. ~/.cache/go-build by default

You can confirm this by running go build -x (or go install -x), which prints all the commands, e.g. for go build -x in protobuf/encoding/protojson:
% go build -x
[…]
/usr/lib/go-1.21/pkg/tool/linux_amd64/compile -o $WORK/b001/_pkg_.a -trimpath "$WORK/b001=>" -p google.golang.org/protobuf/encoding/protojson -lang=go1.20 -complete -buildid SHSRu3r5CQ-_gIh6QmRC/SHSRu3r5CQ-_gIh6QmRC -goversion go1.21.9 -c=4 -nolocalimports -importcfg $WORK/b001/importcfg -pack ./decode.go ./doc.go ./encode.go ./well_known_types.go
/usr/lib/go-1.21/pkg/tool/linux_amd64/buildid -w $WORK/b001/_pkg_.a # internal
cp $WORK/b001/_pkg_.a /home/stapelberg/.cache/go-build/7c/7cefafac275aa8367138fd8eccad87f0d41d96353c0a1b44834ec85109244595-d # internal
%
GOPATH is not modified when building Go code.
@Clement-Jean
Ok, so see above; I think GOCACHE offers a possibility of pre-building the library for a specific Go compiler, as long as we set it to something like /opt/compiler-explorer/libs/gocache/<compilerid> and then execute go build from the library's path?
Might be worth an experiment when you're implementing the download
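The experiment could look roughly like the sketch below: one GOCACHE directory per compiler, with go build run from the library's directory using that compiler's go binary. Everything here is an assumption to be validated, including the helper name, the compiler id, the paths, and whether `go build ./...` succeeds when run inside a module cache directory.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// prebuildCmd returns a `go build ./...` command that runs in libDir using
// goBinary, with GOCACHE set to <cacheRoot>/<compilerID> so that compiled
// packages from different compilers do not share a build cache.
func prebuildCmd(cacheRoot, compilerID, goBinary, libDir string) *exec.Cmd {
	cmd := exec.Command(goBinary, "build", "./...")
	cmd.Dir = libDir
	cmd.Env = append(os.Environ(), "GOCACHE="+filepath.Join(cacheRoot, compilerID))
	return cmd
}

func main() {
	// Hypothetical compiler id and paths.
	cmd := prebuildCmd(
		"/opt/compiler-explorer/libs/gocache",
		"gl1219",
		"/opt/compiler-explorer/golang-1.21.9/go/bin/go",
		"/opt/compiler-explorer/libs/golibs/pkg/mod/google.golang.org/protobuf@v1.34.1",
	)
	fmt.Println(cmd.Dir, cmd.Args, cmd.Env[len(cmd.Env)-1])
}
```

The user's later go build would need the same GOCACHE value (and the same compiler) to benefit from the pre-built packages.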
Working on it here
@stapelberg So I tried your example manually and it works perfectly fine. However, I tried it with go tool compile (the command Compiler Explorer uses) and it doesn't work. That's why I mentioned the importcfg previously.
I know that the Go team recommends using build, but I guess the use of compile is to avoid having modules? We should probably think about this before implementing it. Do we change to the build command or do we create an importcfg?
Can you tell me more than "it doesn't work", please? How can I reproduce the problem you're running into?
I have the following:
~/compiler-explorer-poc % tree -L 2 .
.
├── example
│   ├── go.mod
│   └── main.go
└── go
    ├── CONTRIBUTING.md
    ├── LICENSE
    ├── PATENTS
    ├── README.md
    ├── SECURITY.md
    ├── VERSION
    ├── api
    ├── bin
    ├── codereview.cfg
    ├── doc
    ├── go.env
    ├── lib
    ├── misc
    ├── pkg
    ├── src
    └── test
# Configure a new, empty location for the cache (part of GOPATH):
~/compiler-explorer-poc/example % export GOPATH=$HOME/compiler-explorer-poc/go
# https://pkg.go.dev/google.golang.org/protobuf/proto#example-Marshal
~/compiler-explorer-poc/example % go tool compile main.go
main.go:4:2: could not import fmt (file not found)
main.go:6:2: could not import google.golang.org/protobuf/proto (file not found)
main.go:7:2: could not import google.golang.org/protobuf/types/known/durationpb (file not found)
In the past, I had this could not import fmt error. To solve that, I tried rebuilding the Go compiler with GODEBUG=installgoroot=all ./make.bash.
Thanks for the details. Yes, the compile program needs an importcfg to know where the object files (.a) are.
I'll return the question to compiler-explorer folks: (Why?) is it required to call compile directly? Could we use go build instead?
We're only using tool for really old compilers, so I think it's safe to ignore.
For the new ones we use build:
https://github.com/compiler-explorer/compiler-explorer/blob/538eba393068295eec6eb18fa233a52f61275134/lib/compilers/golang.ts#L247
Is your feature request related to a problem? Please describe
My use-case: When using Golang in Compiler Explorer, I often find myself having to "simplify" snippets of external libraries so that I can plug them in as input. I can't reference imports outside Golang's standard library.
Describe the solution you'd like
Prior art
Describe alternatives you've considered
Today, if I run into issues where I want to check something "small enough" in an external library, I'll do the surgery to rip out the bits I need. If I anticipate it's going to take longer than 5-10 minutes to do so, I resort to not using Compiler Explorer for that case.
Additional context
I asked this question in the #compiler_explorer slack channel on cpplang, and was given a few notes for how external library support was handled for Rust. References: