scylladb / scylla-bench

43 stars 36 forks source link

scylla-bench

scylla-bench is a benchmarking tool for Scylla written in Go. It aims at minimising the client overhead and provide a wide range of test scenarios.

Install

The recommended way to install scylla-bench is to download the repository and then install it from source:

git clone https://github.com/scylladb/scylla-bench
cd scylla-bench/
go install .

It is not recommended to download and install the tool directly using go get or go install. If you do that, a scylla-bench binary will be built without using ScyllaDB's fork of the gocql driver, and the shard-awareness won't work.

# If you use those commands, shard-awareness won't work!
# go get github.com/scylladb/scylla-bench
# go install github.com/scylladb/scylla-bench

This is due to the go tool not honoring replace directives in the go.mod file: https://github.com/golang/go/issues/30354

Usage

Schema adjustments

The default scylla-bench schema for regular columns looks like this:

CREATE TABLE IF NOT EXISTS scylla_bench.test (
    pk bigint,
    ck bigint,
    v blob,
    PRIMARY KEY(pk, ck)
) WITH compression = { }

scylla-bench allows configuring the number of partitions, number of rows in a partition and the size of a single row. This is done using flags -partition-count, -clustering-row-count and -clustering-row-size respectively.

Modes

scylla-bench can operate in several modes (flag -mode) which basically determine what kind of requests are sent to the server. Some of the modes allow additional, further configuration.

Write mode (-mode write)

The behaviour in this mode differs depending on the configured number of rows per requests. If -rows-per-request is set to 1 (default) scylla-bench sends simple INSERT requests like this:

INSERT INTO scylla_bench.test (pk, ck, v) VALUES (?, ?, ?)

Otherwise, writes are sent in unlogged batches each containing at most rows-per-request insertions. All writes in a single batch refer to the same partition. The consequence of this is that in some configuration the number of rows written in a single requests can be actually smaller than the set value (e.g. -clustering-row-count 2 -rows-per-request 4).

Counter update mode (-mode counter_update)

Counter updates are written to a separate column family:

CREATE TABLE IF NOT EXISTS scylla_bench.test_counters (
    pk bigint,
    ck bigint,
    c1 counter,
    c2 counter,
    c3 counter,
    c4 counter,
    c5 counter,
    PRIMARY KEY(pk, ck)
) WITH compression = { }

Each request updates all five counters in a row and only one row per request is supported:

UPDATE scylla_bench.test_counters SET c1 = c1 + 1, c2 = c2 + 1, c3 = c3 + 1, c4 = c4 + 1, c5 = c5 + 1 WHERE pk = ? AND ck = ?

Read mode (-mode read)

Read mode is essentially split into four sub-modes and offers most configurability. The default requests resemble single partition paging queries, there is a lower bound of clustering keys and a limit which is can be adjusted using flag rows-per-request:

SELECT * FROM scylla_bench.test WHERE pk = ? AND ck >= ? LIMIT ?

It is possible to send request without a lower bound if flag -no-lower-bound is set:

SELECT * FROM %s.%s WHERE pk = ? LIMIT ?

Limit can be replaced by upper bound (flag -provide-upper-bound). In this case scylla-bench will choose the upper bound so that the expected number of rows equals the one specified by rows-per-request.

SELECT * FROM %s.%s WHERE pk = ? AND ck >= ? AND ck < ?

Finally, scylla-bench can send a request with an IN restriction (flag -in-restriction). Again, the number of requested clustering keys will equal rows-per-request.

SELECT * from %s.%s WHERE pk = ? AND ck IN (?, ...)

Counter read mode (-mode counter_read)

Counter read mode works in exactly the same as regular read mode (with the same configuration flags available) except that it reads data from the counter table scylla_bench.test_counters.

Scan mode (-mode scan)

Scan the entire table. This mode does not allow the workload to be configured (it has its own workload called scan). The scan mode allows for the token-space to be split into a user configurable sub-ranges and for querying these sub-ranges concurrently. The algorithm used is that descibed by Avi's efficient range scans blog post. The amount of sub-ranges that the token-space will be split into can be set by the -range-count flag. The recommended number to set this to is:

-range-count = (nodes in cluster) ✕ (cores in node) ✕ 300

The number of sub-ranges to be read concurrency can be set by the -concurrency flag as usual. The recommended concurrency is:

-concurrency = range-count/100

For more details on these numbers see the above mentioned blog post.

Essentially the following query is executed:

SELECT * FROM scylla_bench.test WHERE token(pk) >= ? AND token(pk) <= ?

The number of iterations to run can be specified with the -iterations flag. The default is 1.

Workloads

The second very important part of scylla-bench configuration is the workload. While mode chooses what kind of requests are to be sent to the cluster the workload decides which partitions and rows should be the target of these requests.

Sequential workload (-workload sequential)

This workload sequentially visits all partitions and rows in them. If the concurrency is larger than one then the whole population is split evenly between goroutines. Sequential workload allows specifying the offset of the first partition in the population (flag -partition-offset) to enable sequential population of the database by multiple clients. For example, if we have three simultaneously running scylla-bench processes:

  1. scylla-bench -workload sequential -mode write -partition-count 5
  2. scylla-bench -workload sequential -mode write -partition-count 5 -partition-offset 5
  3. scylla-bench -workload sequential -mode write -partition-count 5 -partition-offset 10

The first loader will write partitions [0, 5), the second [5, 10) and the third [10, 15).

The sequential workload is useful for initial population of the database (in write mode) or warming up the cache for in-memory tests (in read mode). The number of iterations to run can be specified with the -iterations flag. The default is 1.

Uniform workload (-workload unifrom)

Uniform workload chooses the partition key and clustering key randomly with a uniform distribution.

scylla-bench requires that the maximum duration of the test is specified when running with uniform workload (e.g. -duration 1h).

Time series workload (-workload timeseries)

Time series workload is the most complex one and behaves differently depending whether scylla-bench is run in write or read mode.

Write mode

In write mode time series workload divides the set of partitions between all goroutines (it is required that -partition-count >= -concurrency). Then each goroutine prepends to its partitions (partitions are chosen in a round-robind manner) new rows. Newer rows have smaller clustering keys than the older ones.

Once the partition reches clustering-row-count rows the goroutine will switch to a new partiton key. This means that the total partition count will be larger than -partition-count, since in time series workload that flag specifies only the number of partitions to which data is concurrently written.

The rate at which rows depends on -max-rate flag which must be specified in this workload. Since -max-rate sets the total maximum request rate of the whole client the rate at which rows will be appended to a single partition may be lower. The acutal per-partition is printed in scylla-bench configuration as Write rate (it is concurrency / partition-count). scylla-bench also prints "Start timestamp" which is necessary if there is a time series read load running.

Read mode

Time series workload in read mode is supposed to be run simultanously with time series writes. It requires specifying the start timestamp -start-timestamp and per-partition write rate -write-rate both of which are printed by scylla-bench running in write mode.

The time series workload in read mode chooses partition and clustering keys randomly from the range that has been written up to this point (using start timestamp and write rate). The distribution can be either uniform (flag -distribution uniform) or half-normal with the latest rows being most likely (-distribution hnormal).

Note that if the effective write rate is lower than the specified one the reader may attempt to read rows that are not yet present in the database. However, because -max-rate doesn't just limit the rate but tries to make the average op/s equal the specified values it will be able to recover from small periodic dips in write throughput.

Other notable options

Random value distributions

scylla-bench supports random values for certain command line arguments. The list of these arguments is:

There are three distributions supported:

Example: -clustering-row-size=uniform:100..1000

All command line arguments that accept a random distribution, also accept a single number, in which case a Fixed distribution will be used. This ensures backward compatibility.

Examples

  1. Sequential write to populate the database: scylla-bench -workload sequential -mode write -nodes 127.0.0.1
  2. Read test: scylla-bench -workload uniform -mode read -concurrency 128 -duration 15m -nodes some_node
  3. Read latency test: scylla-bench -workload uniform -mode read -duration 15m -concurrency 32 -max-rate 32000 -nodes 192.168.8.4
  4. Counter write test: scylla-bench -workload uniform -mode counter_update -duration 30m -concurrency 128
  5. Full table scan test: scylla-bench -mode scan -timeout 5m -concurrency 1
  6. Write to populate database with non-zero data: scylla-bench -workload sequential -mode write -nodes 127.0.0.1 -clustering-row-size 16 -validate-data
  7. Read with data verification: scylla-bench -workload uniform -mode write -nodes 127.0.0.1 -clustering-row-size 16 -validate-data -duration 10m