cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

pkg/sql/logictest/tests/local-mixed-23.2/local-mixed-23_2_test: TestLogic_union failed [aborted in DistSender: result is ambiguous] #127942

Open cockroach-teamcity opened 2 months ago

cockroach-teamcity commented 2 months ago

pkg/sql/logictest/tests/local-mixed-23.2/local-mixed-23_2_test.TestLogic_union failed with artifacts on master @ c7d1ceed692d36e9d7c5a71e7122408e2e3e3b8b:

=== RUN   TestLogic_union
    test_log_scope.go:170: test logs captured to: /artifacts/tmp/_tmp/5f7b4dcd658a4aaead5502f982e54f03/logTestLogic_union1559883951
    test_log_scope.go:81: use -show-logs to present logs inline
    test_server_shim.go:144: cluster virtualization disabled in global scope due to issue: #76378 (expected label: C-bug)
    logic.go:2968: 
         pq: txn already encountered an error; cannot be used anymore (previous err: aborted in DistSender: result is ambiguous: context canceled)
[05:48:47] --- done: /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/3513/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/sql/logictest/tests/local-mixed-23.2/local-mixed-23_2_test_/local-mixed-23_2_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/union with config local-mixed-23.2: 102 tests, 1 failures
[05:48:49] --- total progress: 102 statements
--- total: 102 tests, 1 failures
    logic.go:4308: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/5f7b4dcd658a4aaead5502f982e54f03/logTestLogic_union1559883951
--- FAIL: TestLogic_union (4.19s)
Help

See also: [How To Investigate a Go Test Failure \(internal\)](https://cockroachlabs.atlassian.net/l/c/HgfXfJgM)

/cc @cockroachdb/sql-foundations


Jira issue: CRDB-40681

michae2 commented 2 months ago

The failure in #128064 shows the testcase that was added by #127076, so that's a hint.

michae2 commented 2 months ago

I can hit this after about 400 runs using the following on a gceworker:

./dev testlogic base --config=local-legacy-schema-changer --files=union --ignore-cache --stress
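
A rough way to turn "about 400 runs" into a target for confirming a fix is sketched below. It assumes the flake is independent from run to run with a per-run failure rate near 1/400; that rate is only an estimate taken from the figure above, not a measured value, and `runs_for_confidence` is a hypothetical helper, not part of `./dev`.

```python
import math


def runs_for_confidence(flake_rate: float, confidence: float = 0.95) -> int:
    """Return the smallest number of consecutive passing runs n such that,
    if the per-run flake rate were still `flake_rate`, seeing n passes in a
    row would happen with probability at most 1 - confidence.

    Assumes runs are independent, which stress runs only approximate.
    """
    if not 0.0 < flake_rate < 1.0:
        raise ValueError("flake_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - flake_rate) ** n <= 1 - confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - flake_rate))


# Assumption: "about 400 runs" to hit the failure means roughly a 1/400 rate.
assumed_rate = 1 / 400
print(runs_for_confidence(assumed_rate))
```

Under these assumptions, a clean stress run needs to be roughly three times longer than the typical time-to-failure before it says much about whether a deflake attempt worked.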

rytaft commented 2 months ago

> The failure in https://github.com/cockroachdb/cockroach/issues/128064 shows the testcase that was added by https://github.com/cockroachdb/cockroach/pull/127076, so that's a hint.

The backports of https://github.com/cockroachdb/cockroach/pull/127076 are still open. Do you think merging them will fix this issue, @michae2? cc @mgartner who is assigned to review those backports and merge them when ready.

michae2 commented 2 months ago

> The backports of https://github.com/cockroachdb/cockroach/pull/127076 are still open. Do you think merging them will fix this issue, @michae2?

No, I think it was likely #127076 that caused this failure, so we should hold off on merging the backports until we fix this.

rytaft commented 2 months ago

Ohh I see, these failures are all on branch-master. I got confused because the test is 23.2 mixed version (or 24.1 in the other issue).

rytaft commented 2 months ago

In that case, we should definitely address this issue before 24.3. I'll mark this as a GA-blocker and apply P-2. We might need to look into this before @yuzefovich gets back (especially if these test failures seem to be relatively common).

rafiss commented 3 weeks ago

I made this PR to attempt to deflake the test: https://github.com/cockroachdb/cockroach/pull/131680 (since it also caused https://github.com/cockroachdb/cockroach/issues/131324 to be opened, which landed on the SQL Foundations board). I haven't been able to repro under stress since merging that.

edit: looks like it failed in https://github.com/cockroachdb/cockroach/issues/132038

mgartner commented 1 week ago

It also failed in #132780.

mgartner commented 4 days ago

It also failed in #133235.