Installation details

Kernel Version: 5.15.0-1020-aws
Scylla version (or git commit hash): 2022.1.3-20220922.539a55e35 with build-id d1fb2faafd95058a04aad30b675ff7d2b930278d
Relocatable Package: http://downloads.scylladb.com/unstable/scylla-enterprise/enterprise-2022.1/relocatable/2022-09-22T13:36:03Z/scylla-enterprise-x86_64-package.tar.gz
Cluster size: 6 nodes (i3.2xlarge)
Scylla Nodes used in this run: No resources left at the end of the run
OS / Image: ami-0f6aebcffc8f7aa66 (aws: eu-west-1)
Test: scylla-cloud-longevity-cdc-100gb-4h
Test id: 31c49a40-cd31-4214-b469-54011c1aaaf0
Test name: siren-tests/longevity-tests/scylla-cloud-longevity-cdc-100gb-4h
Test config file(s):

Issue description
By 2022-11-08 16:01:55.317 we had started all the c-s threads.
Then we ran a series of nodetool cfstats and nodetool flush commands (no nemeses had been started at that point).
cfstats for the audit keyspace (+ flush) succeeded:
< t:2022-11-08 16:02:06,231 f:cluster.py l:4099 c:sdcm.cluster p:DEBUG > <cluster_cloud.ScyllaCloudCluster object at 0x7fc8d6fb6ef0>: Get cfstats on the node longevity-cdc-100gb-4h-master-db-node-31c49a40-1 for audit keyspace
< t:2022-11-08 16:02:06,231 f:events_processes.py l:147 c:sdcm.sct_events.events_processes p:DEBUG > Get process `MainDevice' from EventsProcessesRegistry[lod_dir=/home/ubuntu/sct-results/20221108-150617-417068,id=0x7fc8d70747c0,default=True]
< t:2022-11-08 16:02:06,232 f:remote_base.py l:520 c:RemoteLibSSH2CmdRunner p:DEBUG > Running command "/usr/bin/nodetool -u scylla -pw '***' flush "...
< t:2022-11-08 16:02:06,233 f:file_logger.py l:101 c:sdcm.sct_events.file_logger p:INFO > 2022-11-08 16:02:06.231: (NodetoolEvent Severity.NORMAL) period_type=begin event_id=262ff7fc-358c-40f3-9258-fc25c0d788db: nodetool_command=flush node=longevity-cdc-100gb-4h-master-db-node-31c49a40-1
< t:2022-11-08 16:02:06,233 f:grafana.py l:80 c:sdcm.sct_events.grafana p:DEBUG > GrafanaEventAggregator start a new time window (90 sec)
< t:2022-11-08 16:02:08,275 f:base.py l:142 c:RemoteLibSSH2CmdRunner p:DEBUG > Command "/usr/bin/nodetool -u scylla -pw '***' flush " finished with status 0
< t:2022-11-08 16:02:08,276 f:cluster.py l:2567 c:sdcm.cluster_baremetal p:DEBUG > Node longevity-cdc-100gb-4h-master-db-node-31c49a40-1 [None | 172.23.40.169] (seed: True): Command '/usr/bin/nodetool -u scylla -pw '***' flush ' duration -> 2.0436472590008634 s
< t:2022-11-08 16:02:12,556 f:base.py l:142 c:RemoteLibSSH2CmdRunner p:DEBUG > Command "/usr/bin/nodetool -u scylla -pw '***' cfstats audit" finished with status 0
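For reference, these are the commands the test issued, as they appear in the log above (password redacted); they can be replayed manually on a node with the real credentials:
$ /usr/bin/nodetool -u scylla -pw '***' flush
$ /usr/bin/nodetool -u scylla -pw '***' cfstats audit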
Then we ran cfstats for the _cdctest keyspace (+ flush).
The flush succeeded, but cfstats eventually failed:
Approximately at this time we got "Reactor stalled" messages on several nodes:
Also, a lot of the following error messages were found in the node logs:
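To locate these messages on an affected node, assuming Scylla logs to the standard scylla-server systemd unit, the journal can be searched around the failure window, for example:
$ journalctl -u scylla-server --since '2022-11-08 16:00:00' --until '2022-11-08 16:30:00' | grep -iE 'reactor stall|ERROR'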
Since the nodetool command failed, we triggered coredump generation for scylla on node-1:
Coredump: node_1_scylla_coredump.txt
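To inspect such a dump directly on the node, assuming it was captured by systemd-coredump, commands along these lines can be used:
$ coredumpctl list /usr/bin/scylla
$ coredumpctl info /usr/bin/scylla
$ coredumpctl dump /usr/bin/scylla --output=scylla.core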
$ hydra investigate show-monitor 31c49a40-cd31-4214-b469-54011c1aaaf0
$ hydra investigate show-logs 31c49a40-cd31-4214-b469-54011c1aaaf0
Logs:
Jenkins job URL