voedger / kb

Knowledge base

Sizing Up Your ScyllaDB Cluster #15


maxim-ge commented 10 months ago

https://www.scylladb.com/2019/06/20/sizing-up-your-scylla-cluster/

How Big is Your Dataset, and How Fast is it Growing?

Your company has 5TB of raw data and would like to use a replication factor of 3:

5TB (raw data) × 3 (RF) = 15TB

Make space for compaction

When using the most common compaction strategy, known as Size Tiered Compaction Strategy (STCS), you need to set aside the same amount of storage for compaction as for the data, effectively doubling the amount of storage needed.

5TB (raw data) × 3 (for RF) × 2 (to support STCS) = 30TB
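As a quick sanity check, here is the same math as a small Go sketch (the variable names are ours; the numbers are from the example above):

```go
package main

import "fmt"

func main() {
	// Numbers from the example above: 5TB of raw data, RF=3,
	// and 2x headroom for STCS compaction.
	rawDataTB := 5.0
	replicationFactor := 3.0
	stcsHeadroom := 2.0

	totalTB := rawDataTB * replicationFactor * stcsHeadroom
	fmt.Printf("Total storage needed: %.0fTB\n", totalTB) // prints 30TB
}
```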

Throughput: Data Access Patterns

For this example we'll use the following workload and access pattern for a hypothetical application: 100,000 OPS with an average payload of 1KB.

Counting cores

Generally speaking, for payloads under 1 kilobyte, ScyllaDB can process ~12,500 operations per second for each physical core on each replica in the cluster.

To translate the above example to number of cores:

100,000 OPS (1KB payload) × 3 (replicas) = 300,000 OPS

300,000 OPS ÷ 12,500 (operations/core) = 24 physical cores (or 48 hyperthreads)
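The same throughput-to-cores arithmetic as a Go sketch, using the workload figures above:

```go
package main

import "fmt"

func main() {
	// Example workload: 100,000 client OPS at ~1KB payloads, RF=3.
	// ScyllaDB handles roughly 12,500 OPS per physical core at that payload size.
	clientOPS := 100_000.0
	replicationFactor := 3.0
	opsPerCore := 12_500.0

	clusterOPS := clientOPS * replicationFactor // 300,000 OPS across all replicas
	physicalCores := clusterOPS / opsPerCore    // 24 physical cores
	fmt.Printf("%.0f cluster OPS -> %.0f physical cores\n", clusterOPS, physicalCores)
}
```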

To translate the physical cores above to vCPUs:

In many cloud deployments, nodes are provided on a vCPU basis. A vCPU is typically a single hyperthread from a dual-hyperthread x86 physical core. ScyllaDB shards per hyperthread (vCPU), so translating the 24 physical cores above to vCPUs results in 48 vCPUs.

Use the following formula (for Intel Xeon Scalable Processors):

vCPUs = physical cores × 2
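In code, the translation is just a doubling (assuming 2 hyperthreads per physical core, as above):

```go
package main

import "fmt"

func main() {
	// Intel Xeon Scalable: 2 hyperthreads per physical core,
	// and cloud vCPUs map 1:1 to hyperthreads.
	physicalCores := 24
	vCPUs := physicalCores * 2
	fmt.Printf("%d physical cores -> %d vCPUs\n", physicalCores, vCPUs) // 48
}
```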

The importance of memory

We recommend a Disk:RAM ratio of 30:1 as optimal, and 80:1 as the maximum. In other words, for 20TB of storage in your node, we would recommend at least 256GB of RAM, and preferably closer to 666GB.
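A small sketch of the ratio math for the 20TB-per-node example (numbers from the paragraph above):

```go
package main

import "fmt"

func main() {
	// Disk:RAM guidance: 30:1 optimal, 80:1 maximum.
	diskGB := 20_000.0 // 20TB of storage per node

	minRAMGB := diskGB / 80.0       // ~250GB; round up to 256GB in practice
	preferredRAMGB := diskGB / 30.0 // ~666GB
	fmt.Printf("minimum: %.0fGB, preferred: %.0fGB\n", minRAMGB, preferredRAMGB)
}
```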

Translating numbers to real-world platforms


Let’s deploy ScyllaDB on Amazon Web Services

We will be using 3 nodes of i3.4xlarge for the ScyllaDB deployment. Each i3.4xlarge has 16 vCPUs, 122GB of RAM, and 3.8TB of NVMe-based data storage. That gives us a total of 48 vCPUs and 11.4TB of disk storage.

The above deployment meets the criteria of throughput and latency. However, it does not take into account maintenance windows, possible node failures, and fat-finger errors. To be on the safe side, and to be able to maintain the workload with ease in case of any mayhem, we'd recommend at least 4 nodes of i3.4xlarge. (Oops)
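To compare the two options, a short sketch that tallies cluster totals for 3 vs. 4 nodes of i3.4xlarge, using the per-node specs quoted above:

```go
package main

import "fmt"

func main() {
	// Per-node specs for i3.4xlarge, as quoted above.
	const (
		vCPUsPerNode     = 16
		ramGBPerNode     = 122
		storageTBPerNode = 3.8
	)

	for _, nodes := range []int{3, 4} {
		fmt.Printf("%d nodes: %d vCPUs, %dGB RAM, %.1fTB storage\n",
			nodes,
			nodes*vCPUsPerNode,
			nodes*ramGBPerNode,
			float64(nodes)*storageTBPerNode)
	}
	// 3 nodes: 48 vCPUs, 366GB RAM, 11.4TB storage
	// 4 nodes: 64 vCPUs, 488GB RAM, 15.2TB storage
}
```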