scylladb / scylla-cluster-tests

Tests for Scylla Clusters
GNU Affero General Public License v3.0

Add testcase for scaling-out while having 90% storage utilization #9156

Open pehala opened 3 weeks ago

pehala commented 3 weeks ago
Lakshmipathi commented 2 weeks ago

3-node cluster (instance type: i4i.large), scale-out at 90% storage utilization.

Reached 91% disk usage, then waited 30 minutes with no reads or writes.

< t:2024-11-03 07:10:58,323 f:full_storage_utilization_test.py l:93   c:FullStorageUtilizationTest p:INFO  > Current max disk usage after writing to keyspace10: 91% (396 GB / 392.40000000000003 GB)
< t:2024-11-03 07:10:59,353 f:full_storage_utilization_test.py l:58   c:FullStorageUtilizationTest p:INFO  > Wait for 1800 seconds
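
For context, a minimal sketch of what this phase does; the helper names here are hypothetical, not the actual SCT API:

```python
# Sketch of the fill-then-idle phase (hypothetical helper names, not the
# actual SCT helpers): poll until max utilization crosses the target,
# then idle for 30 minutes with no traffic.
import time

TARGET_UTILIZATION = 90  # percent
IDLE_SECONDS = 1800      # the "Wait for 1800 seconds" in the log above


def max_disk_usage_percent(nodes):
    """Return the highest data-disk utilization across the cluster."""
    return max(node.disk_usage_percent() for node in nodes)  # hypothetical accessor


def wait_at_target(nodes):
    while max_disk_usage_percent(nodes) < TARGET_UTILIZATION:
        time.sleep(60)  # keep polling while background writes fill the disks
    time.sleep(IDLE_SECONDS)  # idle at 90%+ utilization, no reads or writes
```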

After the 30 min idle period, started a throttled write workload:

< t:2024-11-03 07:42:10,941 f:file_logger.py  l:101  c:sdcm.sct_events.file_logger p:INFO  > stress_cmd=cassandra-stress write duration=30m -rate threads=10 "throttle=1400/s" -mode cql3 native -pop seq=1..5000000 -col "size=FIXED(10240) n=FIXED(1)" -schema "replication(strategy=NetworkTopologyStrategy,replication_factor=3)"
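
Rough arithmetic for how much data this throttled load adds (my estimate, derived only from the command line above):

```python
# Back-of-the-envelope for the throttled stress load (estimate only).
rows_per_sec = 1400            # "throttle=1400/s"
row_bytes = 10 * 1024          # -col "size=FIXED(10240)"
duration_sec = 30 * 60         # duration=30m
replication_factor = 3         # NetworkTopologyStrategy, replication_factor=3

raw_gb = rows_per_sec * row_bytes * duration_sec / 1e9
total_gb = raw_gb * replication_factor
print(f"~{raw_gb:.1f} GB raw, ~{total_gb:.1f} GB including replication")
# ~25.8 GB raw, ~77.4 GB across the cluster including replication
```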

Scaled out by adding a new node at 90% utilization:

< t:2024-11-03 07:44:05,075 f:full_storage_utilization_test.py l:41   c:FullStorageUtilizationTest p:INFO  > Adding a new node

After 30 minutes, the scaled-out (3 -> 4) cluster has disk usage at 75%, 74%, 75% and 70%.
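
For what it's worth, those numbers are close to what perfectly balanced tablets would predict. A rough check, assuming the ~392.4 GB per-node capacity from the log above and the ~77 GB of replicated throttled writes estimated earlier (my numbers, not test output):

```python
# Expected per-node utilization after a 3 -> 4 scale-out, assuming perfect
# tablet balance (rough estimate based on figures quoted in this thread).
capacity_gb = 392.4                      # usable capacity per node, from the log
data_before = 3 * 0.91 * capacity_gb     # three nodes at 91%
stress_gb = 77.4                         # throttled writes during scale-out

per_node = (data_before + stress_gb) / 4
print(f"~{per_node / capacity_gb:.0%} per node")  # ~73%, vs. observed 70-75%
```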

[Image: tablet migration over time]

[Image: max/avg disk utilization]

99th percentile write and read latency by cluster (max, at 90% disk utilization):

| operation | latency |
| --------- | ------- |
| writes    | 3.07 ms |
| reads     | 1.79 ms |

https://argus.scylladb.com/tests/scylla-cluster-tests/c5de2f39-770c-4cf3-8d8c-66fef9d91d87

swasik commented 6 days ago

> After 30 minutes, the scaled-out (3 -> 4) cluster has disk usage at 75%, 74%, 75% and 70%.

But I see that the chart presents average disk usage, which should drop quickly as soon as we add raw disk capacity, even if the new space is not used yet. Could you also add a chart of the maximal disk usage across all nodes?
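
To illustrate the avg/max distinction with the per-node figures quoted above (sketch only):

```python
# Average vs. maximum disk utilization across nodes. Adding an empty node
# drags the average down immediately, while the max only falls once tablets
# actually migrate off the old nodes.
usage = {"node1": 75, "node2": 74, "node3": 75, "node4": 70}  # % after scale-out

avg = sum(usage.values()) / len(usage)
worst_node, worst = max(usage.items(), key=lambda kv: kv[1])
print(f"avg={avg:.1f}%  max={worst}% ({worst_node})")
```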

swasik commented 2 days ago

The interesting fact is that after migration we have the same number of tablets everywhere, but on the new node the disk utilization is ca. 5% lower. Maybe something has not been cleaned up yet. Could we wait a bit longer to see whether the utilization evens out in the end?

Lakshmipathi commented 1 day ago

Started a new job with a 1 hr wait just before the test ends. Will check and report whether the 5% lower disk usage persists.

Lakshmipathi commented 1 day ago

@swasik After scale-out, waited 40 minutes and verified there is 0% load on all nodes. Final disk usage: 66%, 69%, 71% and 73%. So on average, the newly added node has ~5% less disk usage than the other three nodes.

pehala commented 1 day ago

Could it be due to tablet imbalance?

swasik commented 1 day ago

> Could it be due to tablet imbalance?

I thought so too, but we have exactly the same number of tablets on each node and, presumably, a uniform distribution of the keyspace.