cockroachdb / cockroach

CockroachDB - the open source, cloud-native distributed SQL database.
https://www.cockroachlabs.com

bazel: nogo crash/segfault #99988

Open knz opened 1 year ago

knz commented 1 year ago

Found here: https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_BazelEssentialCi/9325532?showRootCauses=false&expandBuildChangesSection=true&expandBuildProblemsSection=true&expandBuildTestsSection=true

ERROR: /go/src/github.com/cockroachdb/cockroach/pkg/sql/paramparse/BUILD.bazel:4:11: GoCompilePkg pkg/sql/paramparse/paramparse.a failed: (Exit 1): builder failed: error executing command bazel-out/k8-opt-exec-2B5CBBC6/bin/external/go_sdk/builder compilepkg -sdk external/go_sdk -installsuffix linux_amd64 -tags bazel,gss,bazel,gss -src pkg/sql/paramparse/paramparse.go -src ... (remaining 35 arguments skipped)
Use --sandbox_debug to see verbose messages from the sandbox
compilepkg: panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x5fe13c]
goroutine 167 [running]:
go/types.methodSet.add(0xc0c920?, {0x0, 0x1, 0xc00068427c?}, {0x0, 0x0, 0xc003c04648?}, 0x90?, 0xbb?)
  GOROOT/src/go/types/methodset.go:214 +0xdc
go/types.NewMethodSet({0xc0c920?, 0xc0013b3260?})
  GOROOT/src/go/types/methodset.go:150 +0x869
golang.org/x/tools/go/types/typeutil.(*MethodSetCache).lookupNamed(0xc0003b80e0, 0xc0013b3260)
  golang.org/x/tools/go/types/typeutil/external/org_golang_x_tools/go/types/typeutil/methodsetcache.go:66 +0x99
golang.org/x/tools/go/types/typeutil.(*MethodSetCache).MethodSet(0xc0003b80e0, {0xc0c920?, 0xc0013b3260?})
  golang.org/x/tools/go/types/typeutil/external/org_golang_x_tools/go/types/typeutil/methodsetcache.go:37 +0xe5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c920?, 0xc0013b3260?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:191 +0xf5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0ca10?, 0xc00145bba8?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:260 +0x7f5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c920?, 0xc000c4ba40?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:209 +0x1f7
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0ca10?, 0xc000c59890?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:260 +0x7f5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c970?, 0xc000c17890?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:208 +0x1d0
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0ca10?, 0xc000cf24e0?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:260 +0x7f5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c970?, 0xc000c17ba0?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:208 +0x1d0
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0ca10?, 0xc000cf2b28?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:260 +0x7f5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c920?, 0xc0004ae8c0?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:208 +0x1d0
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0ca10?, 0xc000cf2b88?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:260 +0x7f5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c920?, 0xc0003147e0?}, 0x0, 0xfa7d48?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:209 +0x1f7
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c9e8?, 0xc000da2030?}, 0x1, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:255 +0x6b6
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c920?, 0xc0001ed9d0?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:248 +0x589
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c970?, 0xc0013b8cc0?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:220 +0x5d1
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0ca10?, 0xc00145b518?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:260 +0x7f5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c998?, 0xc001470000?}, 0x0, 0x50347e?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:236 +0x314
golang.org/x/tools/go/ssa.(*Program).needMethodsOf(0xc0003b80c0, {0xc0c998?, 0xc001470000?}, 0x7225cf?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:172 +0x7f
golang.org/x/tools/go/ssa.(*Package).build(0xc000e9e300)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/builder.go:2427 +0x133
sync.(*Once).doSlow(0xc0003b80c0?, 0xc0004537c0?)
  GOROOT/src/sync/once.go:74 +0xc2
sync.(*Once).Do(...)
  GOROOT/src/sync/once.go:65
golang.org/x/tools/go/ssa.(*Package).Build(...)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/builder.go:2413
golang.org/x/tools/go/analysis/passes/buildssa.run(0xc0001f9ee0)
  golang.org/x/tools/go/analysis/passes/buildssa/external/org_golang_x_tools/go/analysis/passes/buildssa/buildssa.go:73 +0x1a8
main.(*action).execOnce(0xc00032e510)
  main/external/io_bazel_rules_go/go/tools/builders/nogo_main.go:402 +0x842
sync.(*Once).doSlow(0x0?, 0x0?)
  GOROOT/src/sync/once.go:74 +0xc2
sync.(*Once).Do(...)
  GOROOT/src/sync/once.go:65
main.(*action).exec(0x0?)
  main/external/io_bazel_rules_go/go/tools/builders/nogo_main.go:346 +0x3d
main.execAll.func1(0x0?)
  main/external/io_bazel_rules_go/go/tools/builders/nogo_main.go:340 +0x54
created by main.execAll
  main/external/io_bazel_rules_go/go/tools/builders/nogo_main.go:338 +0x47
INFO: Elapsed time: 112.761s, Critical Path: 85.28s
INFO: 3510 processes: 45 internal, 3465 processwrapper-sandbox.
FAILED: Build did NOT complete successfully
FAILED: Build did NOT complete successfully
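
For readers unfamiliar with the top frames of the trace: go/types.NewMethodSet computes the set of methods callable on a type, and the SSA builder calls it (through typeutil.MethodSetCache) for every type that may need runtime method information. The following is a minimal, standard-library-only sketch of that call in the normal, non-crashing case. It does not reproduce the nogo panic, and the sample package, the type T, and the helper methodNames are hypothetical names chosen only for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// src is a hypothetical package with one value-receiver method and one
// pointer-receiver method. It has no imports, so no importer is needed.
const src = `package demo

type T struct{}

func (T) Value()    {}
func (*T) Pointer() {}
`

// methodNames type-checks source and returns the method names in the
// method set of the named type typeName, or of *typeName when pointer
// is true.
func methodNames(source, typeName string, pointer bool) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", source, 0)
	if err != nil {
		return nil, err
	}
	conf := types.Config{}
	pkg, err := conf.Check("demo", fset, []*ast.File{file}, nil)
	if err != nil {
		return nil, err
	}
	obj := pkg.Scope().Lookup(typeName)
	if obj == nil {
		return nil, fmt.Errorf("type %q not found", typeName)
	}
	var typ types.Type = obj.Type()
	if pointer {
		typ = types.NewPointer(typ)
	}
	// This is the function at the top of the stack trace above.
	mset := types.NewMethodSet(typ)
	names := make([]string, 0, mset.Len())
	for i := 0; i < mset.Len(); i++ {
		names = append(names, mset.At(i).Obj().Name())
	}
	return names, nil
}

func main() {
	for _, pointer := range []bool{false, true} {
		names, err := methodNames(src, "T", pointer)
		if err != nil {
			panic(err)
		}
		fmt.Printf("pointer=%v methods=%v\n", pointer, names)
	}
}
```

In the failing build the same function dereferences a nil pointer inside methodSet.add; the sketch only shows what the call is expected to do when the type information is well-formed.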

cc @rickystewart for triage

Jira issue: CRDB-26167

Epic CRDB-36213

rickystewart commented 1 year ago

@knz Can you push this commit ce560e93047a somewhere so I have a way to repro? Looks like I can't fetch it from your repo right now (maybe you force-pushed)?

knz commented 1 year ago

It's embedded in the crdb master branch now (same SHA).

msbutler commented 1 year ago

just hit this again https://teamcity.cockroachdb.com/viewLog.html?buildId=10314501&buildTypeId=Cockroach_BazelEssentialCi

msbutler commented 1 year ago

@stevendanna shall i assign this to you? I know you were toying with this.

stevendanna commented 1 year ago

😭 I was actually chasing a different panic.

In my case, I wasn't able to get a good reproduction easily. I wonder if bumping x/tools/go/ssa on master might be a reasonable blind step to take.

msbutler commented 1 year ago

just hit another buildssa nogo crash https://teamcity.cockroachdb.com/viewLog.html?buildId=10789412&problemId=8889

msbutler commented 1 year ago

@rickystewart currently master uses version v0.6.0 of golang.org/x/tools. It seems the latest release of this package is v0.10.0. Do you know of anything blocking this vendor upgrade?

rickystewart commented 1 year ago

@msbutler I don't know of anything blocking, but it's not up to me: this is one of the "core dependencies" of the database, so we need a wider audience for a review. @knz Are you aware of any specific problems with upgrading golang.org/x/tools to latest? Are you OK if we get a PR out there to see if this reduces nogo flakiness?

knz commented 1 year ago

latest x/tools seems reasonable but let's also see how many other things that pulls in :)

msbutler commented 1 year ago

ok lemme see what happens.

msbutler commented 1 year ago

@knz wdyt? diff on master after running go get golang.org/x/tools@latest

diff --git a/go.mod b/go.mod
index f67880c437f..a2b339c3a33 100644
--- a/go.mod
+++ b/go.mod
@@ -16,17 +16,17 @@ require (
        github.com/google/btree v1.0.1
        github.com/google/pprof v0.0.0-20210827144239-02619b876842
        github.com/google/uuid v1.3.0
-       golang.org/x/crypto v0.7.0
+       golang.org/x/crypto v0.11.0
        golang.org/x/exp v0.0.0-20220827204233-334a2380cb91
        golang.org/x/exp/typeparams v0.0.0-20221208152030-732eee02a75a // indirect
-       golang.org/x/mod v0.8.0 // indirect
-       golang.org/x/net v0.8.0
+       golang.org/x/mod v0.12.0 // indirect
+       golang.org/x/net v0.12.0
        golang.org/x/oauth2 v0.5.0
-       golang.org/x/sync v0.1.0
-       golang.org/x/sys v0.6.0
-       golang.org/x/text v0.8.0
+       golang.org/x/sync v0.3.0
+       golang.org/x/sys v0.10.0
+       golang.org/x/text v0.11.0
        golang.org/x/time v0.1.0
-       golang.org/x/tools v0.6.0
+       golang.org/x/tools v0.11.0
        google.golang.org/api v0.110.0
        google.golang.org/genproto v0.0.0-20230227214838-9b19f0bdc514
        google.golang.org/grpc v1.53.0
@@ -227,7 +227,7 @@ require (
        go.opentelemetry.io/otel/sdk v1.0.0-RC3
        go.opentelemetry.io/otel/trace v1.0.0-RC3
        golang.org/x/perf v0.0.0-20230113213139-801c7ef9e5c5
-       golang.org/x/term v0.6.0
+       golang.org/x/term v0.10.0
        gopkg.in/yaml.v2 v2.4.0
        gopkg.in/yaml.v3 v3.0.1
        honnef.co/go/tools v0.4.3
diff --git a/go.sum b/go.sum

knz commented 1 year ago

Sgtm!

msbutler commented 1 year ago

@rickystewart when I attempt to run ./dev gen bazel --mirror on top of this update, I get the following error:

ERROR: /private/var/tmp/_bazel_michaelbutler/13ba282fa2b19539d0c969c1113bb37c/external/bazel_gazelle/repo/BUILD.bazel:3:11: no such package '@org_golang_x_tools//go/vcs': BUILD file not found in directory 'go/vcs' of external repository @org_golang_x_tools. Add a BUILD file to a directory to mark it as a package. and referenced by '@bazel_gazelle//repo:repo'
ERROR: Analysis of target '//:gazelle' failed; build aborted:
INFO: Elapsed time: 0.570s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (22 packages loaded, 1456 targets configured)
    currently loading: @com_github_bazelbuild_buildtools//build
ERROR: Build failed. Not running target
ERROR: exit status 1

I have seen an internal thread which indicates that a bump of golang.org/x/tools is incompatible with our rules_go framework. Is this surprising? Happy to take this to Slack to troubleshoot. FWIW, here's the output of bazel version:

❯ bazel version
Bazelisk version: development
Build label: 6.2.1
Build target: bazel-out/darwin-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Jun 2 17:07:06 2023 (1685725626)
Build timestamp: 1685725626
Build timestamp as int: 1685725626

rickystewart commented 1 year ago

We would probably have to upgrade rules_go in tandem. I can help with this, we just need to cherry-pick our patches on top of the latest upstream.

msbutler commented 1 year ago

well, it looks like updating x/tools to v0.10.0 avoids the rules_go problem, so I'll open a PR to bump to just v0.10.0 for now to avoid this coordination dance.

rickystewart commented 1 year ago

@msbutler Try the following patch.

diff --git a/WORKSPACE b/WORKSPACE
index 5c57e914b3d..823bca48a5b 100644
--- a/WORKSPACE
+++ b/WORKSPACE
@@ -8,12 +8,12 @@ load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
 # Load go bazel tools. This gives us access to the go bazel SDK/toolchains.
 http_archive(
     name = "io_bazel_rules_go",
-    sha256 = "97701a677263ae017c19332cf94f14221773617fa3c8410743ac7ff7fc6b8b38",
-    strip_prefix = "cockroachdb-rules_go-c0680b8",
+    sha256 = "7ab77b5bd3ac04a65860b0e26f2855c977d463d8e9b5ce2458e516b110eb5eeb",
+    strip_prefix = "cockroachdb-rules_go-f1ab269",
     urls = [
-        # cockroachdb/rules_go as of c0680b8a52933071996ed4c022677f7a9b701727
-        # (upstream release-0.38 plus a few patches).
-        "https://storage.googleapis.com/public-bazel-artifacts/bazel/cockroachdb-rules_go-v0.27.0-265-gc0680b8.tar.gz",
+        # cockroachdb/rules_go as of f1ab26925b5da24119d38115a657f27a90288db5
+        # (upstream release-0.40 plus a few patches).
+        "https://storage.googleapis.com/public-bazel-artifacts/bazel/cockroachdb-rules_go-v0.27.0-341-gf1ab269.tar.gz",
     ]
 )

Michael, instead of upgrading golang.org/x/tools to latest, can you instead go to v0.7.0? I ask because this is the version that's vendored with the latest rules_go, so they should be compatible with each other and shouldn't result in these segfaults. Should be a less intrusive upgrade altogether anyway.
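
One way to keep the two versions aligned is to check the golang.org/x/tools requirement in go.mod against the version expected by rules_go. The following is a minimal sketch of such a check, not part of the CockroachDB tooling: the helper requiredVersion and the sample go.mod text are hypothetical, and the expected version v0.7.0 is taken from the comment above rather than read from rules_go itself.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sampleGoMod is a hypothetical excerpt shaped like the go.mod in this thread.
const sampleGoMod = `require (
	golang.org/x/time v0.1.0
	golang.org/x/tools v0.6.0
	google.golang.org/api v0.110.0
)
`

// wantToolsVersion is an assumption taken from the comment above: the
// x/tools version said to be vendored with the latest rules_go.
const wantToolsVersion = "v0.7.0"

// requiredVersion scans go.mod text line by line and returns the version
// required for modulePath, or "" if the module is not listed. It is a
// plain text scan, not a full go.mod parser.
func requiredVersion(goMod, modulePath string) string {
	sc := bufio.NewScanner(strings.NewReader(goMod))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Drop a leading "require" keyword on single-line requirements.
		if len(fields) > 0 && fields[0] == "require" {
			fields = fields[1:]
		}
		if len(fields) >= 2 && fields[0] == modulePath {
			return fields[1]
		}
	}
	return ""
}

func main() {
	got := requiredVersion(sampleGoMod, "golang.org/x/tools")
	if got != wantToolsVersion {
		fmt.Printf("golang.org/x/tools: go.mod has %s, want %s\n", got, wantToolsVersion)
		return
	}
	fmt.Println("golang.org/x/tools matches the expected version")
}
```

A real check would read the expected version from the rules_go dependency list instead of hard-coding it, and would use a proper go.mod parser to handle replace directives.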

msbutler commented 1 year ago

yup, i'll open a PR for a bump to v0.7.0. FWIW, ./dev gen bazel did work when i bumped to v0.10.0

knz commented 1 year ago

Looks like the dep is good now.

rickystewart commented 1 year ago

FYI: Closed #98260 as a dup of this one.

otan commented 1 year ago

ran into this on a bors run: https://github.com/cockroachdb/cockroach/pull/106671#issuecomment-1637226155

rickystewart commented 12 months ago

#107960 seems like a new instance of this; not sure what's going on.

rickystewart commented 11 months ago

#108351 as well.

rickystewart commented 11 months ago

Looks like @msbutler's original change worked for a while but didn't help in the long term. #108858 bumps some dependencies even further. @knz can you have a look?

rickystewart commented 11 months ago

The results look good for now. I'm going to backport this to release-23.1 and then we'll have more data.

rickystewart commented 11 months ago

Saw another one in CI this time! https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Ci_Tests_Bench/11490932?hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandBuildChangesSection=true&expandBuildProblemsSection=true

rickystewart commented 11 months ago

Re-running that build at the same commit does not fail in the same way, so the problem is intermittent.

lidorcarmel commented 10 months ago

just saw this in my pr on master now: https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_BazelExtendedCi/11679216?hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandBuildChangesSection=true&expandBuildProblemsSection=true

rickystewart commented 10 months ago

I think we were looking pretty good here for a while, but maybe the Go 1.20 upgrade made things worse. 🤔 Here's another failure on staging.

j82w commented 10 months ago

Another failure: https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_BazelEssentialCi/11727154?buildTab=overview&expandBuildProblemsSection=true&hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandBuildChangesSection=true

rickystewart commented 9 months ago

I have #112635 out to upgrade some stuff slightly to try to address the problem: it's a small upgrade, but we can go no further on x/tools without hitting https://github.com/bazelbuild/rules_go/issues/3640. When https://github.com/bazelbuild/rules_go/pull/3730 lands, then we can upgrade rules_go further and maybe make further progress on this.

rafiss commented 4 months ago

Another occurrence: https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_UnitTests_BazelUnitTests/14291156?hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandBuildChangesSection=true&expandBuildProblemsSection=true