cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

roachtest: cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka failed #132286

Closed by cockroach-teamcity 4 weeks ago

cockroach-teamcity commented 1 month ago

roachtest.cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka failed with artifacts on master @ fd4b1464dbd6e385c6e51af26fe294fd2023a259:

(cluster.go:2478).Run: full command output in run_073619.182501247_n6_cockroach-workload-i.log: COMMAND_PROBLEM: exit status 1
(cluster.go:2478).Run: context canceled
test artifacts and logs in: /artifacts/cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka/run_1

Parameters:

See: roachtest README

See: How To Investigate (internal)

Grafana is not yet available for azure clusters

/cc @cockroachdb/cdc

This test on roachdash | Improve this report!

Jira issue: CRDB-42922

cockroach-teamcity commented 1 month ago

roachtest.cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka failed with artifacts on master @ 645eb8c99796b3b88f5631aa0fc92a011010ce64:

(cluster.go:2449).Run: full command output in run_070138.546997098_n6_cockroach-workload-i.log: COMMAND_PROBLEM: exit status 1
(cluster.go:2449).Run: context canceled
test artifacts and logs in: /artifacts/cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka/run_1

Grafana is not yet available for azure clusters

cockroach-teamcity commented 1 month ago

roachtest.cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka failed with artifacts on master @ 645eb8c99796b3b88f5631aa0fc92a011010ce64:

(test_runner.go:1308).runTest: test timed out (1h0m0s)
test artifacts and logs in: /artifacts/cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka/run_1

See: Grafana

wenyihu6 commented 1 month ago

Similar symptoms to https://github.com/cockroachdb/cockroach/issues/132413: splits didn't finish, and the logs show slow span latches, slow replica RPCs, and slow heartbeats.

Error: executing ALTER TABLE kv SPLIT AT VALUES (-3371016535584718524): pq: replica unavailable: (n2,s2):4 unable to serve request to r13016:/Table/106/1/-337{1201001180799658-0832069988637390} [(n2,s2):4, (n4,s4):2, (n3,s3):3, next=5, gen=74, sticky=9223372036.854775807,2147483647]: closed timestamp: 1728547142.649811772,0 (2024-10-10 07:59:02); raft status: {"id":"4","term":20,"vote":"4","commit":48,"lead":"2","raftState":"StateFollower","applied":48,"progress":{},"leadtransferee":"0"}: have been waiting 62.00s for slow proposal HeartbeatTxn [/Local/Range/Table/106/1/-3371201001180799658/RangeDescriptor], [txn: 8071debd]

W241010 07:59:05.774753 2296238 kv/kvclient/kvcoord/dist_sender.go:2746 ⋮ [T1,Vsystem,n1,s1] 40512  slow replica RPC: have been waiting 16.38s (2 attempts) for RPC GC [/Table/106/1/‹-7487366312135223734›,/Table/106/1/‹-7487366312135223734›/‹NULL›), GC [/Table/106/1/‹-7487366312135223734›,/Table/106/1/‹-7487366312135223734›/‹NULL›) to replica (n3,s3):4VOTER_INCOMING; resp: ‹(err: <nil>), *kvpb.GCResponse, *kvpb.GCResponse›
W241010 07:59:05.774793 2297319 kv/kvclient/kvcoord/dist_sender.go:2746 ⋮ [T1,Vsystem,n1,txn-hb=d744e10c] 40513  slow replica RPC: have been waiting 15.34s (0 attempts) for RPC HeartbeatTxn [/Local/Range/Table/106/1/‹-7487366312135223734›/‹RangeDescriptor›], [txn: d744e10c] to replica (n3,s3):4VOTER_INCOMING; resp: ‹(err: <nil>), *kvpb.HeartbeatTxnResponse›
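
To make the "slow replica RPC" warnings above easier to tally across a large log, here is a minimal Python sketch that pulls the wait time, attempt count, RPC method, and target replica out of each line. The regular expression is inferred only from the two warnings quoted in this comment, so the exact message layout is an assumption, and `parse_slow_rpc` is a hypothetical helper, not part of any CockroachDB tooling.

```python
import re

# Pattern inferred from the two warnings quoted above (an assumption, not a
# documented log format): wait time, attempts, RPC method, target replica.
SLOW_RPC = re.compile(
    r"slow replica RPC: have been waiting (?P<secs>\d+(?:\.\d+)?)s "
    r"\((?P<attempts>\d+) attempts?\) for RPC (?P<method>\w+) .* "
    r"to replica \((?P<node>n\d+),(?P<store>s\d+)\):(?P<replica>\w+);"
)

# The second warning from this comment, copied verbatim.
SAMPLE = (
    "W241010 07:59:05.774793 2297319 kv/kvclient/kvcoord/dist_sender.go:2746 ⋮ "
    "[T1,Vsystem,n1,txn-hb=d744e10c] 40513  slow replica RPC: have been waiting "
    "15.34s (0 attempts) for RPC HeartbeatTxn "
    "[/Local/Range/Table/106/1/‹-7487366312135223734›/‹RangeDescriptor›], "
    "[txn: d744e10c] to replica (n3,s3):4VOTER_INCOMING; "
    "resp: ‹(err: <nil>), *kvpb.HeartbeatTxnResponse›"
)


def parse_slow_rpc(line):
    """Return the fields of a slow-replica-RPC warning, or None if no match."""
    m = SLOW_RPC.search(line)
    if m is None:
        return None
    return {
        "wait_seconds": float(m.group("secs")),
        "attempts": int(m.group("attempts")),
        "method": m.group("method"),
        "node": m.group("node"),
        "store": m.group("store"),
        "replica": m.group("replica"),
    }


if __name__ == "__main__":
    print(parse_slow_rpc(SAMPLE))
```

Running the helper over every line of a node's log and grouping by `node` or `method` would show whether the stalls concentrate on one replica, such as (n3,s3) in the two warnings here.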

cockroach-teamcity commented 1 month ago

roachtest.cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka failed with artifacts on master @ 30dbb173d0f083b35cf9eb8093832a5dd764c5af:

(test_runner.go:1308).runTest: test timed out (1h0m0s)
test artifacts and logs in: /artifacts/cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka/cpu_arch=arm64/run_1

Grafana is not yet available for azure clusters

cockroach-teamcity commented 1 month ago

roachtest.cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka failed with artifacts on master @ 30dbb173d0f083b35cf9eb8093832a5dd764c5af:

(test_runner.go:1308).runTest: test timed out (1h0m0s)
test artifacts and logs in: /artifacts/cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka/run_1

cockroach-teamcity commented 1 month ago

roachtest.cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka failed with artifacts on master @ 30dbb173d0f083b35cf9eb8093832a5dd764c5af:

(test_runner.go:1308).runTest: test timed out (1h0m0s)
test artifacts and logs in: /artifacts/cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka/run_1

Grafana is not yet available for azure clusters

cockroach-teamcity commented 1 month ago

roachtest.cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka failed with artifacts on master @ 30dbb173d0f083b35cf9eb8093832a5dd764c5af:

(test_runner.go:1308).runTest: test timed out (1h0m0s)
test artifacts and logs in: /artifacts/cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka/run_1

cockroach-teamcity commented 1 month ago

roachtest.cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka failed with artifacts on master @ a0f39e7ac9574756063bc90bba6bc532b45c33d4:

(cluster.go:2449).Run: full command output in run_074426.253521854_n6_cockroach-workload-i.log: COMMAND_PROBLEM: exit status 1
(cluster.go:2449).Run: context canceled
test artifacts and logs in: /artifacts/cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka/run_1

Grafana is not yet available for azure clusters

cockroach-teamcity commented 1 month ago

roachtest.cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka failed with artifacts on master @ a0f39e7ac9574756063bc90bba6bc532b45c33d4:

(cluster.go:2449).Run: full command output in run_091616.619538189_n6_cockroach-workload-i.log: COMMAND_PROBLEM: exit status 1
(cluster.go:2449).Run: context canceled
test artifacts and logs in: /artifacts/cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka/run_1

andrewbaptist commented 1 month ago

This is caused by the recent pebble bump: 124e6c86c10. Reassigning to storage.

cockroach-teamcity commented 1 month ago

roachtest.cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka failed with artifacts on master @ 49ca24cedb042579e9645c206640d59975805d12:

(cluster.go:2449).Run: full command output in run_080852.338622848_n6_cockroach-workload-i.log: COMMAND_PROBLEM: exit status 1
(cluster.go:2449).Run: context canceled
test artifacts and logs in: /artifacts/cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka/cpu_arch=arm64/run_1

Grafana is not yet available for azure clusters

cockroach-teamcity commented 4 weeks ago

roachtest.cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka failed with artifacts on master @ 5be5b0b52ff79b98689b2282a8b25cf9eb50ec40:

(cluster.go:2449).Run: full command output in run_082850.958484666_n6_cockroach-workload-i.log: COMMAND_PROBLEM: exit status 1
(cluster.go:2449).Run: context canceled
test artifacts and logs in: /artifacts/cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka/run_1

Grafana is not yet available for azure clusters

cockroach-teamcity commented 4 weeks ago

roachtest.cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka failed with artifacts on master @ 5be5b0b52ff79b98689b2282a8b25cf9eb50ec40:

(test_runner.go:1308).runTest: test timed out (1h0m0s)
test artifacts and logs in: /artifacts/cdc/workload/kv100/nodes=5/cpu=16/ranges=100k/server=scheduler/protocol=mux/format=json/sink=kafka/run_1
