scylladb / scylla-cluster-tests

Tests for Scylla Clusters
GNU Affero General Public License v3.0

Add testcase to measure scale-in time while having 67% storage utilization #9275

Open Lakshmipathi opened 4 days ago

Lakshmipathi commented 4 days ago

Reached 67% of disk usage

[2024-11-13T11:58:33.074Z] < t:2024-11-13 11:58:32,598 f:full_storage_utilization_test.py l:170  c:FullStorageUtilizationTest p:INFO  > Current max disk usage after writing to keyspace22: 67% (582 GB / 584.9100000000001 GB)

Waited for about 30 minutes, then started a throttled write

[2024-11-13T12:30:11.759Z] < t:2024-11-13 12:30:11,073 f:stress_thread.py l:325  c:sdcm.stress_thread   p:INFO  > cassandra-stress write no-warmup duration=30m -rate threads=32 "throttle=1400/s" -mode cql3 native -pop seq=1..5000000 -col "size=FIXED(10240) n=FIXED(1)" -schema keyspace=keyspace1 "replication(strategy=NetworkTopologyStrategy,replication_factor=3)" -node 10.4.0.238,10.4.2.191,10.4.1.250,10.4.3.105 -errors skip-unsupported-columns
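
For a rough sense of the load this command generates, here is a minimal sketch of the arithmetic using only the flags visible in the log line above (`throttle=1400/s`, `size=FIXED(10240)`, `duration=30m`). It assumes the throttle is actually sustained for the whole run and counts only the column payload; replication, compression, and per-row overhead are ignored. The function name is made up for illustration.

```python
# Values copied from the cassandra-stress command line in the log above.
THROTTLE_OPS_PER_SEC = 1400   # throttle=1400/s
VALUE_SIZE_BYTES = 10240      # size=FIXED(10240), n=FIXED(1)
DURATION_SEC = 30 * 60        # duration=30m


def throttled_write_estimate(ops_per_sec, value_size_bytes, duration_sec):
    """Return (total_ops, bytes_per_sec, total_bytes) for a throttled write.

    Hypothetical helper: payload-only estimate, assuming the throttle
    rate is sustained for the full duration.
    """
    total_ops = ops_per_sec * duration_sec
    bytes_per_sec = ops_per_sec * value_size_bytes
    total_bytes = total_ops * value_size_bytes
    return total_ops, bytes_per_sec, total_bytes


total_ops, bytes_per_sec, total_bytes = throttled_write_estimate(
    THROTTLE_OPS_PER_SEC, VALUE_SIZE_BYTES, DURATION_SEC
)
print(f"writes: {total_ops}")
print(f"payload rate: {bytes_per_sec / 1e6:.1f} MB/s")
print(f"payload total: {total_bytes / 1e9:.1f} GB")
```

Under these assumptions the throttled write issues at most 2,520,000 writes, about 14.3 MB/s of payload and about 25.8 GB in total, before the replication factor of 3 is applied.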

Started removing a node from the cluster

[2024-11-13T12:34:06.808Z] < t:2024-11-13 12:34:05,588 f:full_storage_utilization_test.py l:70   c:FullStorageUtilizationTest p:INFO  > Started removing a node
[2024-11-13T12:34:06.808Z] < t:2024-11-13 12:34:05,589 f:full_storage_utilization_test.py l:182  c:FullStorageUtilizationTest p:INFO  > Removing a second node from the cluster
[2024-11-13T12:34:06.808Z] < t:2024-11-13 12:34:05,590 f:full_storage_utilization_test.py l:184  c:FullStorageUtilizationTest p:INFO  > Node to be removed: storage-utilization-master-db-node-8564666d-2

Waited for tablets to be balanced

[2024-11-13T12:48:55.108Z] < t:2024-11-13 12:48:44,629 f:common.py       l:40   c:sdcm.utils.tablets.common p:INFO  > Waiting for tablets to be balanced
[2024-11-13T12:49:02.016Z] < t:2024-11-13 12:49:01,191 f:common.py       l:45   c:sdcm.utils.tablets.common p:INFO  > Tablets are balanced

Total time taken to remove a node from a 4-node cluster at 67% disk usage (about 895.6 seconds, roughly 15 minutes):

[2024-11-13T12:49:02.016Z] < t:2024-11-13 12:49:01,192 f:full_storage_utilization_test.py l:74   c:FullStorageUtilizationTest p:INFO  > Removing a node finished with time: 895.6022570133209
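
As a cross-check, the reported duration can be reproduced from the `t:` timestamps in the log lines above. This is only a sketch: the helper name is hypothetical, and it assumes the `t:` field always uses the `YYYY-MM-DD HH:MM:SS,mmm` layout seen here.

```python
from datetime import datetime

# Layout of the "t:" field in the log lines above (comma before milliseconds).
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start, end, fmt=LOG_TIME_FORMAT):
    """Hypothetical helper: seconds between two log timestamps."""
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


# "Started removing a node" -> "Removing a node finished"
removal_seconds = elapsed_seconds("2024-11-13 12:34:05,588", "2024-11-13 12:49:01,192")

# "Waiting for tablets to be balanced" -> "Tablets are balanced"
balance_wait_seconds = elapsed_seconds("2024-11-13 12:48:44,629", "2024-11-13 12:49:01,191")

print(f"node removal: {removal_seconds:.3f} s ({removal_seconds / 60:.1f} min)")
print(f"tablet balance wait: {balance_wait_seconds:.3f} s")
```

The timestamp difference for the removal agrees with the logged 895.60 s to within a few milliseconds, and the final wait for tablet balance accounts for about 16.6 s of it.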

https://argus.scylladb.com/tests/scylla-cluster-tests/8564666d-eca7-43d1-b56f-1ddf893f9c9a
https://jenkins.scylladb.com/job/scylla-staging/job/LakshmipathiGanapathi/job/byo-longevity-test/218/console

Final disk usage on the remaining 3 nodes: 87%, 87%, and 88%
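
For comparison, a naive estimate of the per-node utilization after scale-in can be sketched as below. It assumes equally sized disks, data spread evenly across nodes, and the same total data volume before and after the removal; none of that is verified here, and the function name is made up for illustration.

```python
def expected_utilization_after_scale_in(utilization_pct, nodes_before, nodes_after):
    """Hypothetical helper: per-node disk utilization after removing nodes.

    Assumes identical disks, evenly distributed data, and a constant
    total data volume across the scale-in.
    """
    return utilization_pct * nodes_before / nodes_after


estimate = expected_utilization_after_scale_in(67, nodes_before=4, nodes_after=3)
print(f"expected utilization on 3 nodes: {estimate:.1f}%")
```

This gives roughly 89.3%, close to the observed 87% to 88%. The sketch does not account for compaction, the concurrent throttled write, or uneven tablet placement.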
