cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
30.19k stars 3.82k forks source link

bazel: nogo crash/segfault #99988

Open knz opened 1 year ago

knz commented 1 year ago

Found here: https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_BazelEssentialCi/9325532?showRootCauses=false&expandBuildChangesSection=true&expandBuildProblemsSection=true&expandBuildTestsSection=true

ERROR: /go/src/github.com/cockroachdb/cockroach/pkg/sql/paramparse/BUILD.bazel:4:11: GoCompilePkg pkg/sql/paramparse/paramparse.a failed: (Exit 1): builder failed: error executing command bazel-out/k8-opt-exec-2B5CBBC6/bin/external/go_sdk/builder compilepkg -sdk external/go_sdk -installsuffix linux_amd64 -tags bazel,gss,bazel,gss -src pkg/sql/paramparse/paramparse.go -src ... (remaining 35 arguments skipped)
Use --sandbox_debug to see verbose messages from the sandbox
compilepkg: panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x5fe13c]
goroutine 167 [running]:
go/types.methodSet.add(0xc0c920?, {0x0, 0x1, 0xc00068427c?}, {0x0, 0x0, 0xc003c04648?}, 0x90?, 0xbb?)
  GOROOT/src/go/types/methodset.go:214 +0xdc
go/types.NewMethodSet({0xc0c920?, 0xc0013b3260?})
  GOROOT/src/go/types/methodset.go:150 +0x869
golang.org/x/tools/go/types/typeutil.(*MethodSetCache).lookupNamed(0xc0003b80e0, 0xc0013b3260)
  golang.org/x/tools/go/types/typeutil/external/org_golang_x_tools/go/types/typeutil/methodsetcache.go:66 +0x99
golang.org/x/tools/go/types/typeutil.(*MethodSetCache).MethodSet(0xc0003b80e0, {0xc0c920?, 0xc0013b3260?})
  golang.org/x/tools/go/types/typeutil/external/org_golang_x_tools/go/types/typeutil/methodsetcache.go:37 +0xe5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c920?, 0xc0013b3260?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:191 +0xf5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0ca10?, 0xc00145bba8?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:260 +0x7f5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c920?, 0xc000c4ba40?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:209 +0x1f7
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0ca10?, 0xc000c59890?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:260 +0x7f5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c970?, 0xc000c17890?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:208 +0x1d0
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0ca10?, 0xc000cf24e0?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:260 +0x7f5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c970?, 0xc000c17ba0?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:208 +0x1d0
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0ca10?, 0xc000cf2b28?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:260 +0x7f5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c920?, 0xc0004ae8c0?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:208 +0x1d0
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0ca10?, 0xc000cf2b88?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:260 +0x7f5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c920?, 0xc0003147e0?}, 0x0, 0xfa7d48?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:209 +0x1f7
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c9e8?, 0xc000da2030?}, 0x1, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:255 +0x6b6
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c920?, 0xc0001ed9d0?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:248 +0x589
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c970?, 0xc0013b8cc0?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:220 +0x5d1
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0ca10?, 0xc00145b518?}, 0x0, 0xfa7d40?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:260 +0x7f5
golang.org/x/tools/go/ssa.(*Program).needMethods(0xc0003b80c0, {0xc0c998?, 0xc001470000?}, 0x0, 0x50347e?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:236 +0x314
golang.org/x/tools/go/ssa.(*Program).needMethodsOf(0xc0003b80c0, {0xc0c998?, 0xc001470000?}, 0x7225cf?)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/methods.go:172 +0x7f
golang.org/x/tools/go/ssa.(*Package).build(0xc000e9e300)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/builder.go:2427 +0x133
sync.(*Once).doSlow(0xc0003b80c0?, 0xc0004537c0?)
  GOROOT/src/sync/once.go:74 +0xc2
sync.(*Once).Do(...)
  GOROOT/src/sync/once.go:65
golang.org/x/tools/go/ssa.(*Package).Build(...)
  golang.org/x/tools/go/ssa/external/org_golang_x_tools/go/ssa/builder.go:2413
golang.org/x/tools/go/analysis/passes/buildssa.run(0xc0001f9ee0)
  golang.org/x/tools/go/analysis/passes/buildssa/external/org_golang_x_tools/go/analysis/passes/buildssa/buildssa.go:73 +0x1a8
main.(*action).execOnce(0xc00032e510)
  main/external/io_bazel_rules_go/go/tools/builders/nogo_main.go:402 +0x842
sync.(*Once).doSlow(0x0?, 0x0?)
  GOROOT/src/sync/once.go:74 +0xc2
sync.(*Once).Do(...)
  GOROOT/src/sync/once.go:65
main.(*action).exec(0x0?)
  main/external/io_bazel_rules_go/go/tools/builders/nogo_main.go:346 +0x3d
main.execAll.func1(0x0?)
  main/external/io_bazel_rules_go/go/tools/builders/nogo_main.go:340 +0x54
created by main.execAll
  main/external/io_bazel_rules_go/go/tools/builders/nogo_main.go:338 +0x47
INFO: Elapsed time: 112.761s, Critical Path: 85.28s
INFO: 3510 processes: 45 internal, 3465 processwrapper-sandbox.
FAILED: Build did NOT complete successfully
FAILED: Build did NOT complete successfully

cc @rickystewart for triage

Jira issue: CRDB-26167

Epic CRDB-36213

rickystewart commented 1 year ago

@knz Can you push this commit ce560e93047a somewhere so I have a way to repro? Looks like I can't fetch it from your repo right now (maybe you force-pushed)?

knz commented 1 year ago

It's embededded in the crdb master branch now. (same SHA)

msbutler commented 1 year ago

just hit this again https://teamcity.cockroachdb.com/viewLog.html?buildId=10314501&buildTypeId=Cockroach_BazelEssentialCi

msbutler commented 1 year ago

@stevendanna shall i assign this to you? I know you were toying with this.

stevendanna commented 1 year ago

😭 I was actually chasing a different panic.

In my case, I wasn't able to get a good reproduction easily. I wonder if bumping x/tools/go/ssa on master might be reasonable blind step to take.

msbutler commented 1 year ago

just hit another buildssa nogo crash https://teamcity.cockroachdb.com/viewLog.html?buildId=10789412&problemId=8889

msbutler commented 1 year ago

@rickystewart currently master uses version v0.6.0 of golang.org/x/tools. It seems the latest release of this package is v0.10.0. Do you know of anything blocking this vendor upgrade?

rickystewart commented 1 year ago

@msbutler I don't know of anything blocking, but it's not up to me --this is one of the "core dependencies" to the database so we need a wider audience for a review. @knz Are you aware of any specific problems w upgrading golang.org/x/tools to latest/are you OK if we get a PR out there to see if this reduces nogo flakiness?

knz commented 1 year ago

latest x/tools seems reasonable but let's also see how many other things that pulls in :)

msbutler commented 1 year ago

ok lemme see what happens.

msbutler commented 1 year ago

@knz wdyt? diff on master after running go get golang.org/x/tools@latest

diff --git a/go.mod b/go.mod
index f67880c437f..a2b339c3a33 100644
--- a/go.mod
+++ b/go.mod
@@ -16,17 +16,17 @@ require (
        github.com/google/btree v1.0.1
        github.com/google/pprof v0.0.0-20210827144239-02619b876842
        github.com/google/uuid v1.3.0
-       golang.org/x/crypto v0.7.0
+       golang.org/x/crypto v0.11.0
        golang.org/x/exp v0.0.0-20220827204233-334a2380cb91
        golang.org/x/exp/typeparams v0.0.0-20221208152030-732eee02a75a // indirect
-       golang.org/x/mod v0.8.0 // indirect
-       golang.org/x/net v0.8.0
+       golang.org/x/mod v0.12.0 // indirect
+       golang.org/x/net v0.12.0
        golang.org/x/oauth2 v0.5.0
-       golang.org/x/sync v0.1.0
-       golang.org/x/sys v0.6.0
-       golang.org/x/text v0.8.0
+       golang.org/x/sync v0.3.0
+       golang.org/x/sys v0.10.0
+       golang.org/x/text v0.11.0
        golang.org/x/time v0.1.0
-       golang.org/x/tools v0.6.0
+       golang.org/x/tools v0.11.0
        google.golang.org/api v0.110.0
        google.golang.org/genproto v0.0.0-20230227214838-9b19f0bdc514
        google.golang.org/grpc v1.53.0
@@ -227,7 +227,7 @@ require (
        go.opentelemetry.io/otel/sdk v1.0.0-RC3
        go.opentelemetry.io/otel/trace v1.0.0-RC3
        golang.org/x/perf v0.0.0-20230113213139-801c7ef9e5c5
-       golang.org/x/term v0.6.0
+       golang.org/x/term v0.10.0
        gopkg.in/yaml.v2 v2.4.0
        gopkg.in/yaml.v3 v3.0.1
        honnef.co/go/tools v0.4.3
diff --git a/go.sum b/go.sum
knz commented 1 year ago

Sgtm!

msbutler commented 1 year ago

@rickystewart when I attempt to run ./dev gen bazel --mirror on top of this update, I get the following error:

ERROR: /private/var/tmp/_bazel_michaelbutler/13ba282fa2b19539d0c969c1113bb37c/external/bazel_gazelle/repo/BUILD.bazel:3:11: no such package '@org_golang_x_tools//go/vcs': BUILD file not found in directory 'go/vcs' of external repository @org_golang_x_tools. Add a BUILD file to a directory to mark it as a package. and referenced by '@bazel_gazelle//repo:repo'
ERROR: Analysis of target '//:gazelle' failed; build aborted:
INFO: Elapsed time: 0.570s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (22 packages loaded, 1456 targets configured)
    currently loading: @com_github_bazelbuild_buildtools//build
ERROR: Build failed. Not running target
ERROR: exit status 1

I have seen an internal thread which indicates that a bump of golang.org/x/tools is incompatible with our rules_go framework. Is this suprising? Happy to take this to slack to troubleshoot. Fwiw, here's the output of bazel version:

❯ bazel version
Bazelisk version: development
Build label: 6.2.1
Build target: bazel-out/darwin-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Jun 2 17:07:06 2023 (1685725626)
Build timestamp: 1685725626
Build timestamp as int: 1685725626
rickystewart commented 1 year ago

We would probably have to upgrade rules_go in tandem. I can help with this, we just need to cherry-pick our patches on top of the latest upstream.

msbutler commented 1 year ago

well, it looks like updating x/tools to v0.10.0 avoids the rules_go problem, so I'll open a PR to bump to just v0.10.0 for now to avoid this coordination dance.

rickystewart commented 1 year ago

@msbutler Try the following patch.

diff --git a/WORKSPACE b/WORKSPACE
index 5c57e914b3d..823bca48a5b 100644
--- a/WORKSPACE
+++ b/WORKSPACE
@@ -8,12 +8,12 @@ load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
 # Load go bazel tools. This gives us access to the go bazel SDK/toolchains.
 http_archive(
     name = "io_bazel_rules_go",
-    sha256 = "97701a677263ae017c19332cf94f14221773617fa3c8410743ac7ff7fc6b8b38",
-    strip_prefix = "cockroachdb-rules_go-c0680b8",
+    sha256 = "7ab77b5bd3ac04a65860b0e26f2855c977d463d8e9b5ce2458e516b110eb5eeb",
+    strip_prefix = "cockroachdb-rules_go-f1ab269",
     urls = [
-        # cockroachdb/rules_go as of c0680b8a52933071996ed4c022677f7a9b701727
-        # (upstream release-0.38 plus a few patches).
-        "https://storage.googleapis.com/public-bazel-artifacts/bazel/cockroachdb-rules_go-v0.27.0-265-gc0680b8.tar.gz",
+        # cockroachdb/rules_go as of f1ab26925b5da24119d38115a657f27a90288db5
+        # (upstream release-0.40 plus a few patches).
+        "https://storage.googleapis.com/public-bazel-artifacts/bazel/cockroachdb-rules_go-v0.27.0-341-gf1ab269.tar.gz",
     ]
 )

Michael, instead of upgrading golang.org/x/tools to latest, can you instead go to v0.7.0? I ask because this is the version that's vendored with the latest rules_go, so they should be compatible with each other and shouldn't result in these segfaults. Should be a less intrusive upgrade altogether anyway.

msbutler commented 1 year ago

yup, i'll open a PR for a bump to v0.7.0. FWIW, ./dev gen bazel did work when i bumped to v0.10.0

knz commented 1 year ago

Looks like the dep is good now.

rickystewart commented 1 year ago

FYI: Closed #98260 as a dup of this one.

otan commented 1 year ago

ran into this on a bors run: https://github.com/cockroachdb/cockroach/pull/106671#issuecomment-1637226155

rickystewart commented 1 year ago

107960 seems like a new instance of this, not sure what's going on.

rickystewart commented 1 year ago

108351 as well.

rickystewart commented 1 year ago

Looks like @msbutler's original change worked for a while but didn't help in the long term. #108858 to bump some dependencies even further. @knz can you have a look?

rickystewart commented 1 year ago

The results look good for now. I'm going to backport this to release-23.1 and then we'll have more data.

rickystewart commented 1 year ago

Saw another one in CI this time! https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Ci_Tests_Bench/11490932?hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandBuildChangesSection=true&expandBuildProblemsSection=true

rickystewart commented 1 year ago

Re-running that build at the same commit does not fail in the same way, so the problem is intermittent.

lidorcarmel commented 1 year ago

just saw this in my pr on master now: https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_BazelExtendedCi/11679216?hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandBuildChangesSection=true&expandBuildProblemsSection=true

rickystewart commented 1 year ago

I think we were looking pretty good here for a while, but maybe the Go 1.20 upgrade made things worse. 🤔 Here's another failure on staging.

j82w commented 1 year ago

Another failure: https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_BazelEssentialCi/11727154?buildTab=overview&expandBuildProblemsSection=true&hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandBuildChangesSection=true

rickystewart commented 1 year ago

I have #112635 out to upgrade some stuff slightly to try to address the problem: it's a small upgrade, but we can go no further on x/tools without hitting https://github.com/bazelbuild/rules_go/issues/3640. When https://github.com/bazelbuild/rules_go/pull/3730 lands, then we can upgrade rules_go further and maybe make further progress on this.

rafiss commented 8 months ago

Another occurrence: https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_UnitTests_BazelUnitTests/14291156?hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandBuildChangesSection=true&expandBuildProblemsSection=true