scylladb / scylla-cluster-tests

Tests for Scylla Clusters
GNU Affero General Public License v3.0

Add testcase for scaling-out while having 90% storage utilization #9156

Open pehala opened 3 weeks ago

pehala commented 3 weeks ago
Lakshmipathi commented 2 weeks ago

3-node cluster (instance type: i4i.large), scale-out at 90% storage utilization.

Reached 91% disk usage, then waited 30 minutes with no reads or writes.

< t:2024-11-03 07:10:58,323 f:full_storage_utilization_test.py l:93   c:FullStorageUtilizationTest p:INFO  > Current max disk usage after writing to keyspace10: 91% (396 GB / 392.40000000000003 GB)
< t:2024-11-03 07:10:59,353 f:full_storage_utilization_test.py l:58   c:FullStorageUtilizationTest p:INFO  > Wait for 1800 seconds
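
For context, a minimal sketch of what this phase does; the helper names here are hypothetical, not the actual SCT API:

```python
# Sketch of the fill-then-idle phase (hypothetical helper names, not the
# actual SCT helpers): poll until max utilization crosses the target,
# then idle for 30 minutes with no traffic.
import time

TARGET_UTILIZATION = 90  # percent
IDLE_SECONDS = 1800      # the "Wait for 1800 seconds" in the log above


def max_disk_usage_percent(nodes):
    """Return the highest data-disk utilization across the cluster."""
    return max(node.disk_usage_percent() for node in nodes)  # hypothetical accessor


def wait_at_target(nodes):
    while max_disk_usage_percent(nodes) < TARGET_UTILIZATION:
        time.sleep(60)  # keep polling while background writes fill the disks
    time.sleep(IDLE_SECONDS)  # idle at 90%+ utilization, no reads or writes
```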

After the 30 min idle period, started a throttled write workload:

< t:2024-11-03 07:42:10,941 f:file_logger.py  l:101  c:sdcm.sct_events.file_logger p:INFO  > stress_cmd=cassandra-stress write duration=30m -rate threads=10 "throttle=1400/s" -mode cql3 native -pop seq=1..5000000 -col "size=FIXED(10240) n=FIXED(1)" -schema "replication(strategy=NetworkTopologyStrategy,replication_factor=3)"
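
Rough arithmetic for how much data this throttled load adds (my estimate, derived only from the command line above):

```python
# Back-of-the-envelope for the throttled stress load (estimate only).
rows_per_sec = 1400            # "throttle=1400/s"
row_bytes = 10 * 1024          # -col "size=FIXED(10240)"
duration_sec = 30 * 60         # duration=30m
replication_factor = 3         # NetworkTopologyStrategy, replication_factor=3

raw_gb = rows_per_sec * row_bytes * duration_sec / 1e9
total_gb = raw_gb * replication_factor
print(f"~{raw_gb:.1f} GB raw, ~{total_gb:.1f} GB including replication")
# ~25.8 GB raw, ~77.4 GB across the cluster including replication
```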

Scaled out by adding a new node at 90% utilization:

< t:2024-11-03 07:44:05,075 f:full_storage_utilization_test.py l:41   c:FullStorageUtilizationTest p:INFO  > Adding a new node

After 30 minutes, the scaled-out (3 -> 4) cluster has disk usage at 75%, 74%, 75% and 70%.
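
For what it's worth, those numbers are close to what perfectly balanced tablets would predict. A rough check, assuming the ~392.4 GB per-node capacity from the log above and the ~77 GB of replicated throttled writes estimated earlier (my numbers, not test output):

```python
# Expected per-node utilization after a 3 -> 4 scale-out, assuming perfect
# tablet balance (rough estimate based on figures quoted in this thread).
capacity_gb = 392.4                      # usable capacity per node, from the log
data_before = 3 * 0.91 * capacity_gb     # three nodes at 91%
stress_gb = 77.4                         # throttled writes during scale-out

per_node = (data_before + stress_gb) / 4
print(f"~{per_node / capacity_gb:.0%} per node")  # ~73%, vs. observed 70-75%
```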

[Image: tablet migration over time]

[Image: max/avg disk utilization]

99th percentile write and read latency by cluster (max, at 90% disk utilization):

| operation | latency |
| --------- | ------- |
| writes    | 3.07 ms |
| reads     | 1.79 ms |

https://argus.scylladb.com/tests/scylla-cluster-tests/c5de2f39-770c-4cf3-8d8c-66fef9d91d87

swasik commented 6 days ago

> After 30 minutes, the scaled-out (3 -> 4) cluster has disk usage at 75%, 74%, 75% and 70%.

But I see that the chart presents average disk usage, which should drop quickly as soon as we add raw disk capacity, even if the new space is not used yet. Could you also add a chart of the maximal disk usage across all nodes?
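
To illustrate the avg/max distinction with the per-node figures quoted above (sketch only):

```python
# Average vs. maximum disk utilization across nodes. Adding an empty node
# drags the average down immediately, while the max only falls once tablets
# actually migrate off the old nodes.
usage = {"node1": 75, "node2": 74, "node3": 75, "node4": 70}  # % after scale-out

avg = sum(usage.values()) / len(usage)
worst_node, worst = max(usage.items(), key=lambda kv: kv[1])
print(f"avg={avg:.1f}%  max={worst}% ({worst_node})")
```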

swasik commented 2 days ago

The interesting fact is that after migration we have the same number of tablets everywhere, but on the new node the disk utilization is ca. 5% lower. Maybe something has not been cleaned up yet. Could we wait a bit longer to see whether the utilization evens out in the end?

Lakshmipathi commented 1 day ago

Started a new job with a 1 hr wait just before the test ends. Will check and report whether the 5% lower disk usage persists.

Lakshmipathi commented 1 day ago

@swasik After scale-out, waited 40 minutes and verified there is 0% load on all nodes. Final disk usage: 66%, 69%, 71% and 73%. So on average, the newly added node has ~5% less disk usage than the other three nodes.

pehala commented 1 day ago

Could it be due to tablet imbalance?

swasik commented 1 day ago

> Could it be due to tablet imbalance?

I thought so too, but we have exactly the same number of tablets on each node and, presumably, a uniform distribution of the keyspace.