Closed cockroach-teamcity closed 1 year ago
roachtest.cdc/sink-chaos failed with artifacts on master @ 5fbcd8a8deac0205c7df38e340c1eb9692854383:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1937).Run: output in run_102050.580219051_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: COMMAND_PROBLEM: ssh verbose log retained in ssh_102051.370128293_n4_workload-run-tpcc-wa.log: exit status 1
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.cdc/sink-chaos failed with artifacts on master @ 8e24570fa366ed038c6ae65f50db5d8e22826db0:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1937).Run: output in run_101856.333108523_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: COMMAND_PROBLEM: ssh verbose log retained in ssh_101857.122870005_n4_workload-run-tpcc-wa.log: exit status 1
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_fs=ext4, ROACHTEST_localSSD=true, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.cdc/sink-chaos failed with artifacts on master @ eb158026c50d8fa856e42f928d844831ea9e6b28:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1937).Run: output in run_102342.926823441_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: COMMAND_PROBLEM: ssh verbose log retained in ssh_102343.724324591_n4_workload-run-tpcc-wa.log: exit status 1
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.cdc/sink-chaos failed with artifacts on master @ e51ffa013c81212870891001f0328912550fa75d:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1937).Run: output in run_103131.063502119_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: context canceled
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.cdc/sink-chaos failed with artifacts on master @ 2a7edbeb0737b1309064c25c641a309c2980d9ba:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1937).Run: output in run_100941.831818606_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: context canceled
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.cdc/sink-chaos failed with artifacts on master @ 31365e21dc606cdc1e4302c86192ffc5a6cf1255:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1937).Run: output in run_101924.591387929_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: context canceled
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.cdc/sink-chaos failed with artifacts on master @ 7e2df35a2f6bf7a859bb0539c8ca43c4e72ed260:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1940).Run: output in run_103323.114592951_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: context canceled
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.cdc/sink-chaos failed with artifacts on master @ c95bef097bd4c213c6b5c0c125a9a846c4479d73:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1940).Run: output in run_103906.927230883_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: COMMAND_PROBLEM: ssh verbose log retained in ssh_103907.684738529_n4_workload-run-tpcc-wa.log: exit status 1
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.cdc/sink-chaos failed with artifacts on master @ 3d054f37c7c87f53cb56fac4e5500f0d1130d09a:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1940).Run: output in run_102531.296808624_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: COMMAND_PROBLEM: ssh verbose log retained in ssh_102532.100190027_n4_workload-run-tpcc-wa.log: exit status 1
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.cdc/sink-chaos failed with artifacts on master @ e9c96e7179e19aae2f8d386f67eb950db8c3354b:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1940).Run: output in run_103203.909948525_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: COMMAND_PROBLEM: ssh verbose log retained in ssh_103204.640858670_n4_workload-run-tpcc-wa.log: exit status 1
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_fs=ext4, ROACHTEST_localSSD=true, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
@samiskin any updates on this issue?
roachtest.cdc/sink-chaos failed with artifacts on master @ 286b3e235171a39b8f9910555affcc7ce310741a:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1956).Run: output in run_102934.007520384_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: COMMAND_PROBLEM: ssh verbose log retained in ssh_102934.755935866_n4_workload-run-tpcc-wa.log: exit status 1
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_fs=ext4, ROACHTEST_localSSD=true, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
The latest 3 failures show an error while running TPCC:
| | I230222 10:40:35.105029 1786 workload/pgx_helpers.go:79 [T1] 4 pgx logger [error]: Query logParams=map[args:[25 1 2113] err:ERROR: rpc error: code = Unavailable desc = error reading from server: read tcp 10.142.1.2:59786->10.142.1.4:26257: use of closed network connection (SQLSTATE XXUUU) pid:2383385 sql:
| | I230222 10:40:35.105029 1786 workload/pgx_helpers.go:79 [T1] 4 + SELECT sum(ol_amount) FROM order_line
| | I230222 10:40:35.105029 1786 workload/pgx_helpers.go:79 [T1] 4 + WHERE ol_w_id = $1 AND ol_d_id = $2 AND ol_o_id = $3]
| | Error: error in delivery: ERROR: rpc error: code = Unavailable desc = error reading from server: read tcp 10.142.1.2:59786->10.142.1.4:26257: use of closed network connection (SQLSTATE XXUUU)
This is from failure_1.log.
Perhaps the node crashed? This started happening ~3 weeks ago and has kept happening consistently since. I don't think it's a one-off issue, and we have this marked as a release blocker.
Finally found it. Node 3 panicked: https://teamcity.cockroachdb.com/repository/download/Cockroach_Nightlies_RoachtestNightlyGceBazel/8785686:id/cdc/sink-chaos/run_1/artifacts.zip!/logs/3.unredacted/cockroach-stderr.log
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x3b6adc9]
goroutine 242783 [running]:
panic({0x5002fc0, 0x9ce4030})
GOROOT/src/runtime/panic.go:987 +0x3ba fp=0xc00d13be20 sp=0xc00d13bd60 pc=0x49dd5a
runtime.panicmem(...)
GOROOT/src/runtime/panic.go:260
runtime.sigpanic()
GOROOT/src/runtime/signal_unix.go:835 +0x2f6 fp=0xc00d13be70 sp=0xc00d13be20 pc=0x4b4c16
github.com/Shopify/sarama.(*partitionProducer).newHighWatermark(0xc009b62de0, 0x1)
github.com/Shopify/sarama/external/com_github_shopify_sarama/async_producer.go:620 +0x1a9 fp=0xc00d13bed0 sp=0xc00d13be70 pc=0x3b6adc9
github.com/Shopify/sarama.(*partitionProducer).dispatch(0xc009b62de0)
github.com/Shopify/sarama/external/com_github_shopify_sarama/async_producer.go:564 +0x537 fp=0xc00d13bf90 sp=0xc00d13bed0 pc=0x3b6a937
github.com/Shopify/sarama.(*partitionProducer).dispatch-fm()
<autogenerated>:1 +0x26 fp=0xc00d13bfa8 sp=0xc00d13bf90 pc=0x3bbca26
github.com/Shopify/sarama.withRecover(0x0?)
github.com/Shopify/sarama/external/com_github_shopify_sarama/utils.go:43 +0x3e fp=0xc00d13bfc8 sp=0xc00d13bfa8 pc=0x3bb6f9e
github.com/Shopify/sarama.(*asyncProducer).newPartitionProducer.func1()
github.com/Shopify/sarama/external/com_github_shopify_sarama/async_producer.go:515 +0x26 fp=0xc00d13bfe0 sp=0xc00d13bfc8 pc=0x3b6a346
runtime.goexit()
GOROOT/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc00d13bfe8 sp=0xc00d13bfe0 pc=0x4d2a41
created by github.com/Shopify/sarama.(*asyncProducer).newPartitionProducer
github.com/Shopify/sarama/external/com_github_shopify_sarama/async_producer.go:515 +0x1ea
roachtest.cdc/sink-chaos failed with artifacts on master @ e028ce5b14505dfd17ef8b13001c0ab8ac811e3c:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1956).Run: output in run_101206.687098033_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: COMMAND_PROBLEM: ssh verbose log retained in ssh_101207.492156179_n4_workload-run-tpcc-wa.log: exit status 1
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.cdc/sink-chaos failed with artifacts on master @ 0d3393b0623a5c258b25725f64f3689e2f54667b:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1956).Run: output in run_100636.525023948_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: COMMAND_PROBLEM: ssh verbose log retained in ssh_100637.266464476_n4_workload-run-tpcc-wa.log: exit status 1
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_fs=ext4, ROACHTEST_localSSD=true, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.cdc/sink-chaos failed with artifacts on master @ 39c06b5a438c01c93ffbfeeefe702d3f9b620eaf:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1956).Run: output in run_100937.610495214_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: COMMAND_PROBLEM: ssh verbose log retained in ssh_100938.380837809_n4_workload-run-tpcc-wa.log: exit status 1
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_fs=ext4, ROACHTEST_localSSD=true, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.cdc/sink-chaos failed with artifacts on master @ 13c58f621519794e775b7cfc4d8b557bc99eeca0:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(monitor.go:127).Wait: monitor failure: monitor command failure: unexpected node event: 3: dead (exit status 134)
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_fs=ext4, ROACHTEST_localSSD=true, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.cdc/sink-chaos failed with artifacts on master @ b0e5507f74c07e13cfda8cda8b9079b457a9f37d:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1956).Run: output in run_101305.020474857_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: COMMAND_PROBLEM: ssh verbose log retained in ssh_101305.764062036_n4_workload-run-tpcc-wa.log: exit status 1
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.cdc/sink-chaos failed with artifacts on master @ 21786aa112e6b822858f281c1cc59608987c5c0a:
test artifacts and logs in: /artifacts/cdc/sink-chaos/run_1
(cluster.go:1956).Run: output in run_101708.818500757_n4_workload-run-tpcc-wa: ./workload run tpcc --warehouses=100 --duration=30m {pgurl:1-3} returned: COMMAND_PROBLEM: ssh verbose log retained in ssh_101709.557595290_n4_workload-run-tpcc-wa.log: exit status 1
(cdc.go:283).Close: error shutting down prometheus/grafana: context canceled
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_fs=ext4, ROACHTEST_localSSD=true, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
roachtest.cdc/sink-chaos failed with artifacts on master @ 22244a780dcfaca48162dde8e0f90b5ba9b6bb9c:
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=16, ROACHTEST_encrypted=false, ROACHTEST_fs=ext4, ROACHTEST_localSSD=true, ROACHTEST_ssd=0
Help
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
/cc @cockroachdb/cdc
Jira issue: CRDB-24114
Epic CRDB-11732