roachtest.disagg-rebalance/aws/n4cpu4 failed with artifacts on master @ 4d045594e8c65b56c82fcf2a1f14ee30cecfef3d:
(monitor.go:153).Wait: monitor failure: full command output in run_053509.244780444_n1_cockroach-workload-f.log: COMMAND_PROBLEM: exit status 1
test artifacts and logs in: /artifacts/disagg-rebalance/aws/n4cpu4/run_1
Parameters: ROACHTEST_arch=amd64, ROACHTEST_cloud=aws, ROACHTEST_cpu=4, ROACHTEST_encrypted=false, ROACHTEST_fs=ext4, ROACHTEST_localSSD=true, ROACHTEST_metamorphicBuild=true, ROACHTEST_ssd=0
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7) _Grafana is not yet available for aws clusters_
The test is failing because of an error during the import:
E231104 05:37:30.213024 22028 kv/kvserver/replica_consistency.go:778 ⋮ [T1,Vsystem,n3,s3,r85/3:‹/Table/113/1/1{3/333…-4/653…}›] 211 checksum computation failed: pebble: shared foreign sstable has a lower table format than expected
A lot of snapshot ingestion is happening. Unrelated to the failure, we also see a lot of compactions being cancelled, presumably because of IngestAndExcise. Do we quantify how many bytes we read or wrote in compactions that were cancelled, and include them in Metrics? (I didn't see anything on a cursory look at metrics.go.)
I231104 05:36:16.729082 15945 kv/kvserver/replica_raftstorage.go:579 ⋮ [T1,Vsystem,n3,s3,r48/3:‹/Table/4{5-6}›] 175 applied snapshot a01c6fd8 from (n1,s1):1 at applied index 20 (total=448ms data=693 B excise=true ingestion=6@257ms)
I231104 05:36:17.934059 16166 kv/kvserver/replica_raftstorage.go:579 ⋮ [T1,Vsystem,n3,s3,r11/3:‹/Table/{7-8}›] 176 applied snapshot ed3dc0ae from (n1,s1):1 at applied index 35 (total=600ms data=715 B shared=1 sharedSize=12 KiB excise=true ingestion=6@471ms)
E231104 05:36:17.987292 16203 3@pebble/event.go:696 ⋮ [n3,s3,pebble] 177 background error: pebble: compaction cancelled by a concurrent operation, will retry compaction
E231104 05:36:19.685225 16482 3@pebble/event.go:696 ⋮ [n3,s3,pebble] 178 background error: pebble: compaction cancelled by a concurrent operation, will retry compaction
I231104 05:36:19.740020 16337 kv/kvserver/replica_raftstorage.go:579 ⋮ [T1,Vsystem,n3,s3,r3/3:‹/System/{NodeLive…-tsd}›] 179 applied snapshot 475b7d93 from (n1,s1):1 at applied index 76 (total=628ms data=365 KiB excise=true ingestion=6@531ms)
I231104 05:36:21.013004 16620 kv/kvserver/replica_raftstorage.go:579 ⋮ [T1,Vsystem,n3,s3,r23/3:‹/Table/2{0-1}›] 180 applied snapshot 5722aac3 from (n1,s1):1 at applied index 577 (total=657ms data=37 KiB shared=2 sharedSize=33 KiB excise=true ingestion=6@539ms)
E231104 05:36:21.047639 16684 3@pebble/event.go:696 ⋮ [n3,s3,pebble] 181 background error: pebble: compaction cancelled by a concurrent operation, will retry compaction
I231104 05:36:22.255304 16778 kv/kvserver/replica_raftstorage.go:579 ⋮ [T1,Vsystem,n3,s3,r29/3:‹/Table/2{6-7}›] 182 applied snapshot 0c4fe160 from (n1,s1):1 at applied index 36 (total=650ms data=1.3 KiB excise=true ingestion=6@538ms)
I231104 05:36:22.886299 16946 kv/kvserver/replica_raftstorage.go:579 ⋮ [T1,Vsystem,n3,s3,r61/3:‹/Table/{59-60}›] 183 applied snapshot 19e6637e from (n1,s1):1 at applied index 20 (total=586ms data=693 B excise=true ingestion=6@308ms)
I231104 05:36:23.878976 17146 kv/kvserver/replica_raftstorage.go:579 ⋮ [T1,Vsystem,n3,s3,r55/3:‹/Table/5{3-4}›] 184 applied snapshot ca4918dd from (n1,s1):1 at applied index 1565 (total=536ms data=222 KiB shared=2 sharedSize=80 KiB excise=true ingestion=6@422ms)
I231104 05:36:25.539869 17408 kv/kvserver/replica_raftstorage.go:579 ⋮ [T1,Vsystem,n3,s3,r51/3:‹/Table/{48-50}›] 185 applied snapshot 15b07722 from (n1,s1):1 at applied index 20 (total=411ms data=705 B shared=1 sharedSize=14 KiB excise=true ingestion=6@279ms)
I231104 05:36:27.097738 17635 kv/kvserver/replica_raftstorage.go:579 ⋮ [T1,Vsystem,n3,s3,r6/3:‹/Table/{0-3}›] 186 applied snapshot d747d4f2 from (n1,s1):1 at applied index 20 (total=442ms data=693 B excise=true ingestion=6@364ms)
E231104 05:36:27.299577 17701 3@pebble/event.go:696 ⋮ [n3,s3,pebble] 187 background error: pebble: compaction cancelled by a concurrent operation, will retry compaction
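To put a number on "a lot", an excerpt like the one above can be tallied with a few lines of Go. This is only a rough sketch: `TallyLog` and the `Tally` struct are hypothetical names, the matching is keyed on the exact message text seen in these log lines, and it counts cancellation events only. It does not answer the question about bytes read or written by cancelled compactions.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// totalRe extracts the "total=<n>ms" apply time from an "applied snapshot" line.
var totalRe = regexp.MustCompile(`\(total=(\d+)ms`)

// Tally summarizes one log excerpt (hypothetical helper type).
type Tally struct {
	Snapshots            int // "applied snapshot" lines
	SharedSnapshots      int // snapshots that report a shared=<n> field
	CancelledCompactions int // "compaction cancelled" background errors
	TotalApplyMillis     int // sum of total=<n>ms over all snapshots
}

// TallyLog scans the log text line by line and counts the events of interest.
func TallyLog(log string) Tally {
	var t Tally
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case strings.Contains(line, "compaction cancelled by a concurrent operation"):
			t.CancelledCompactions++
		case strings.Contains(line, "applied snapshot"):
			t.Snapshots++
			if strings.Contains(line, " shared=") {
				t.SharedSnapshots++
			}
			if m := totalRe.FindStringSubmatch(line); m != nil {
				if ms, err := strconv.Atoi(m[1]); err == nil {
					t.TotalApplyMillis += ms
				}
			}
		}
	}
	return t
}

func main() {
	// Two lines copied from the excerpt above.
	excerpt := `I231104 05:36:17.934059 16166 kv/kvserver/replica_raftstorage.go:579 ⋮ [T1,Vsystem,n3,s3,r11/3:‹/Table/{7-8}›] 176 applied snapshot ed3dc0ae from (n1,s1):1 at applied index 35 (total=600ms data=715 B shared=1 sharedSize=12 KiB excise=true ingestion=6@471ms)
E231104 05:36:17.987292 16203 3@pebble/event.go:696 ⋮ [n3,s3,pebble] 177 background error: pebble: compaction cancelled by a concurrent operation, will retry compaction`
	fmt.Printf("%+v\n", TallyLog(excerpt))
}
```

Feeding it the full n3 log instead of the excerpt would show how often compaction cancellations follow an excising snapshot.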
The failure is caused by the code in https://github.com/cockroachdb/cockroach/blob/4a53d0b7b014d58cb4498ffd8e50c031997a4020/pkg/storage/sst_writer.go#L75-L77 coupled with this metamorphic setting:
initialized metamorphic constant "storage.value_blocks.enabled" with value false
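A minimal sketch of how a disabled setting could lead to the checksum error, assuming the writer only picks a newer sstable format when `storage.value_blocks.enabled` is true and the reader rejects shared foreign tables below some minimum format. The thread implies this interaction but does not spell it out. Every identifier here (`TableFormat`, `FormatOld`, `FormatNew`, `WriterFormat`, `CheckForeign`) is hypothetical and is not the CockroachDB or Pebble API. Only the error string is taken from the log.

```go
package main

import (
	"errors"
	"fmt"
)

// TableFormat is a hypothetical stand-in for an sstable format version;
// larger values mean newer formats.
type TableFormat int

const (
	FormatOld TableFormat = iota + 1 // hypothetical: chosen when value blocks are off
	FormatNew                        // hypothetical: chosen when value blocks are on
)

// errLowFormat reuses the message from the failing checksum computation.
var errLowFormat = errors.New("pebble: shared foreign sstable has a lower table format than expected")

// WriterFormat models (as an assumption) a writer that upgrades the table
// format only when the value-blocks setting is enabled.
func WriterFormat(valueBlocksEnabled bool) TableFormat {
	if valueBlocksEnabled {
		return FormatNew
	}
	return FormatOld
}

// CheckForeign models (as an assumption) a reader that refuses shared
// foreign sstables written below a minimum format.
func CheckForeign(written, minimum TableFormat) error {
	if written < minimum {
		return errLowFormat
	}
	return nil
}

func main() {
	for _, enabled := range []bool{true, false} {
		err := CheckForeign(WriterFormat(enabled), FormatNew)
		fmt.Printf("storage.value_blocks.enabled=%t -> err=%v\n", enabled, err)
	}
}
```

Under these assumptions, only the metamorphic run with the setting forced to false hits the error, which would fit the failure showing up on a metamorphic build.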
roachtest.disagg-rebalance/aws/n4cpu4 failed with artifacts on master @ 694861a16c8d72a52ac059ef82cf2763ca4406b0:
Parameters: ROACHTEST_arch=amd64, ROACHTEST_cloud=aws, ROACHTEST_cpu=4, ROACHTEST_encrypted=false, ROACHTEST_fs=ext4, ROACHTEST_localSSD=true, ROACHTEST_metamorphicBuild=true, ROACHTEST_ssd=0
Help
See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md) See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7) _Grafana is not yet available for aws clusters_
/cc @cockroachdb/storage
Jira issue: CRDB-33126