roachtest.kv/splits/nodes=3/quiesce=true failed with artifacts on master @ 51c8aae748d338549400c047796c6c9b892527da:
| r12 0xc000086a00
| r13 0x1
| r14 0xc0021021a0
| r15 0xffffffffffffffff
| rip 0x49a101
| rflags 0x286
| cs 0x33
| fs 0x0
| gs 0x0
|
| stdout:
Wraps: (4) SSH_PROBLEM
Wraps: (5) Node 4. Command with error:
| ``````
| ./workload run kv --init --max-ops=1 --concurrency=192 --splits=300000 {pgurl:1-3}
| ``````
Wraps: (6) exit status 255
Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *cluster.WithCommandDetails (4) errors.SSH (5) *hintdetail.withDetail (6) *exec.ExitError
monitor.go:127,kv.go:729,test_runner.go:928: monitor failure: monitor task failed: t.Fatal() was called
(1) attached stack trace
-- stack trace:
| main.(*monitorImpl).WaitE
| main/pkg/cmd/roachtest/monitor.go:115
| main.(*monitorImpl).Wait
| main/pkg/cmd/roachtest/monitor.go:123
| github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerKVSplits.func1
| github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/kv.go:729
| main.(*testRunner).runTest.func2
| main/pkg/cmd/roachtest/test_runner.go:928
Wraps: (2) monitor failure
Wraps: (3) attached stack trace
-- stack trace:
| main.(*monitorImpl).wait.func2
| main/pkg/cmd/roachtest/monitor.go:171
Wraps: (4) monitor task failed
Wraps: (5) attached stack trace
-- stack trace:
| main.init
| main/pkg/cmd/roachtest/monitor.go:80
| runtime.doInit
| GOROOT/src/runtime/proc.go:6340
| runtime.main
| GOROOT/src/runtime/proc.go:233
| runtime.goexit
| GOROOT/src/runtime/asm_amd64.s:1594
Wraps: (6) t.Fatal() was called
Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError
test_runner.go:1059,test_runner.go:958: test timed out (2h0m0s)
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=4, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.kv/splits/nodes=3/quiesce=true failed with artifacts on master @ ff13325e9368c4e8dd9a4d5cf4aa2ad2f33e9ac0:
| r12 0x358
| r13 0x3
| r14 0xc000102b60
| r15 0x1
| rip 0x49a101
| rflags 0x286
| cs 0x33
| fs 0x0
| gs 0x0
|
| stdout:
Wraps: (4) SSH_PROBLEM
Wraps: (5) Node 4. Command with error:
| ``````
| ./workload run kv --init --max-ops=1 --concurrency=192 --splits=300000 {pgurl:1-3}
| ``````
Wraps: (6) exit status 255
Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *cluster.WithCommandDetails (4) errors.SSH (5) *hintdetail.withDetail (6) *exec.ExitError
monitor.go:127,kv.go:729,test_runner.go:928: monitor failure: monitor task failed: t.Fatal() was called
(1) attached stack trace
-- stack trace:
| main.(*monitorImpl).WaitE
| main/pkg/cmd/roachtest/monitor.go:115
| main.(*monitorImpl).Wait
| main/pkg/cmd/roachtest/monitor.go:123
| github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerKVSplits.func1
| github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/kv.go:729
| main.(*testRunner).runTest.func2
| main/pkg/cmd/roachtest/test_runner.go:928
Wraps: (2) monitor failure
Wraps: (3) attached stack trace
-- stack trace:
| main.(*monitorImpl).wait.func2
| main/pkg/cmd/roachtest/monitor.go:171
Wraps: (4) monitor task failed
Wraps: (5) attached stack trace
-- stack trace:
| main.init
| main/pkg/cmd/roachtest/monitor.go:80
| runtime.doInit
| GOROOT/src/runtime/proc.go:6340
| runtime.main
| GOROOT/src/runtime/proc.go:233
| runtime.goexit
| GOROOT/src/runtime/asm_amd64.s:1594
Wraps: (6) t.Fatal() was called
Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError
test_runner.go:1059,test_runner.go:958: test timed out (2h0m0s)
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=4, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
This looks similar to https://github.com/cockroachdb/cockroach/issues/88678. We should fold the initial investigation into that one.
roachtest.kv/splits/nodes=3/quiesce=true failed with artifacts on master @ ff13325e9368c4e8dd9a4d5cf4aa2ad2f33e9ac0:
| r12 0x7f714c1fdc48
| r13 0x3
| r14 0xc000602340
| r15 0x7f717881b5c0
| rip 0x49a101
| rflags 0x286
| cs 0x33
| fs 0x0
| gs 0x0
|
| stdout:
Wraps: (4) SSH_PROBLEM
Wraps: (5) Node 4. Command with error:
| ``````
| ./workload run kv --init --max-ops=1 --concurrency=192 --splits=300000 {pgurl:1-3}
| ``````
Wraps: (6) exit status 255
Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *cluster.WithCommandDetails (4) errors.SSH (5) *hintdetail.withDetail (6) *exec.ExitError
monitor.go:127,kv.go:729,test_runner.go:928: monitor failure: monitor task failed: t.Fatal() was called
(1) attached stack trace
-- stack trace:
| main.(*monitorImpl).WaitE
| main/pkg/cmd/roachtest/monitor.go:115
| main.(*monitorImpl).Wait
| main/pkg/cmd/roachtest/monitor.go:123
| github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerKVSplits.func1
| github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/kv.go:729
| main.(*testRunner).runTest.func2
| main/pkg/cmd/roachtest/test_runner.go:928
Wraps: (2) monitor failure
Wraps: (3) attached stack trace
-- stack trace:
| main.(*monitorImpl).wait.func2
| main/pkg/cmd/roachtest/monitor.go:171
Wraps: (4) monitor task failed
Wraps: (5) attached stack trace
-- stack trace:
| main.init
| main/pkg/cmd/roachtest/monitor.go:80
| runtime.doInit
| GOROOT/src/runtime/proc.go:6340
| runtime.main
| GOROOT/src/runtime/proc.go:233
| runtime.goexit
| GOROOT/src/runtime/asm_amd64.s:1594
Wraps: (6) t.Fatal() was called
Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError
test_runner.go:1059,test_runner.go:958: test timed out (2h0m0s)
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=4, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
5/5 failures on master (beb40b52b380bf4ee15349445dbc674acafa5046) when stressing this test:
GCE_PROJECT=andrei-jepsen ./pkg/cmd/roachtest/roachstress.sh -c5 'kv/splits/nodes=3/quiesce=true$$' -- --cpu-quota=1000
I got 4/5 failures on master (8107342458), but only 1/5 when reverting gRPC to 1.46. Doing another set to confirm.
Reverting gRPC to 1.46.0 in #88745 and #88749.
Second set had 3/5 failures on master, 1/5 with gRPC reverted. Failure modes are also different (master has tripped replica circuit breakers, gRPC revert didn't).
The gRPC reverts should address the proximate cause here; we should look into the other failures separately.
roachtest.kv/splits/nodes=3/quiesce=true failed with artifacts on master @ a0bfa6dafcc206301d3a21887c374db63b377075:
| r12 0x425
| r13 0x3
| r14 0xc000102b60
| r15 0x1
| rip 0x49a101
| rflags 0x286
| cs 0x33
| fs 0x0
| gs 0x0
|
| stdout:
Wraps: (4) SSH_PROBLEM
Wraps: (5) Node 4. Command with error:
| ``````
| ./workload run kv --init --max-ops=1 --concurrency=192 --splits=300000 {pgurl:1-3}
| ``````
Wraps: (6) exit status 255
Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *cluster.WithCommandDetails (4) errors.SSH (5) *hintdetail.withDetail (6) *exec.ExitError
monitor.go:127,kv.go:729,test_runner.go:928: monitor failure: monitor task failed: t.Fatal() was called
(1) attached stack trace
-- stack trace:
| main.(*monitorImpl).WaitE
| main/pkg/cmd/roachtest/monitor.go:115
| main.(*monitorImpl).Wait
| main/pkg/cmd/roachtest/monitor.go:123
| github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerKVSplits.func1
| github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/kv.go:729
| main.(*testRunner).runTest.func2
| main/pkg/cmd/roachtest/test_runner.go:928
Wraps: (2) monitor failure
Wraps: (3) attached stack trace
-- stack trace:
| main.(*monitorImpl).wait.func2
| main/pkg/cmd/roachtest/monitor.go:171
Wraps: (4) monitor task failed
Wraps: (5) attached stack trace
-- stack trace:
| main.init
| main/pkg/cmd/roachtest/monitor.go:80
| runtime.doInit
| GOROOT/src/runtime/proc.go:6340
| runtime.main
| GOROOT/src/runtime/proc.go:233
| runtime.goexit
| GOROOT/src/runtime/asm_amd64.s:1594
Wraps: (6) t.Fatal() was called
Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError
test_runner.go:1059,test_runner.go:958: test timed out (2h0m0s)
Parameters: ROACHTEST_cloud=gce
, ROACHTEST_cpu=4
, ROACHTEST_encrypted=false
, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.kv/splits/nodes=3/quiesce=true failed with artifacts on master @ 89f4ad907a1756551bd6864c3e8516eeff6b0e0a:
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=4, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
Help
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
/cc @cockroachdb/kv-triage
Jira issue: CRDB-19868