scylladb / scylla-bench

43 stars 36 forks source link

scylla-bench uses to much memory for HDR latency histogram processing #84

Closed vponomaryov closed 2 years ago

vponomaryov commented 2 years ago

Following command works great using scylla-bench-v0.1.3:

scylla-bench -workload=timeseries -mode=write -replication-factor=3 \
    -partition-count=400 -clustering-row-count=10000000 -clustering-row-size=200 \
    -concurrency=100 -rows-per-request=100 -start-timestamp=1643207081874298013 \
    -connection-count 100 -max-rate 50000 --timeout 120s -duration=1m -error-at-row-limit 1000 -nodes 172.20.12.159

Example:

[root@ip-10-0-0-120 scylla-bench-0.1.3]# /$HOME/go/bin/scylla-bench -workload=timeseries -mode=write -replication-factor=3 -partition-count=400 -clustering-row-count=10000000 -clustering-row-size=200 -concurrency=100 -rows-per-request=100 -start-timestamp=1643207081874298013 -connection-count 100 -max-rate 50000 --timeout 120s -duration=1m -error-at-row-limit 1000 -nodes 172.20.12.159    
Configuration
Mode:                    write
Workload:                timeseries
Timeout:                 2m0s
Consistency level:       quorum
Partition count:         400
Clustering rows:         10000000
Clustering row size:     Fixed(200)
Rows per request:        100
Page size:               1000
Concurrency:             100
Connections:             100
Maximum rate:            50000 op/s
Client compression:      true
Start timestamp:         1643207081874298013
Write rate:              125

 time   ops/s  rows/s errors max    99.9th 99th   95th   90th   median mean   
   1s   50000   50000      0 6.3ms  4.7ms  2.3ms  1.7ms  1ms    754µs  828µs  
   2s   50059   50059      0 13ms   6.3ms  2.9ms  1.7ms  1.1ms  754µs  854µs  
   3s   50109   50109      0 11ms   5.3ms  2.4ms  1.7ms  1ms    754µs  835µs  
   4s   50100   50100      0 8.4ms  4.9ms  2.1ms  1.6ms  1ms    754µs  823µs  
   5s   50059   50059      0 9.4ms  5.7ms  2.4ms  1.8ms  1.1ms  786µs  859µs  
   6s   50090   50090      0 9.3ms  6.3ms  2.3ms  1.8ms  1ms    754µs  845µs  
   7s   50086   50086      0 7.8ms  6.4ms  2.3ms  1.8ms  1ms    754µs  848µs  
   8s   50038   50038      0 5.7ms  4ms    2.1ms  1.6ms  1ms    754µs  821µs  
   9s   50082   50082      0 9.3ms  5.8ms  2.7ms  1.8ms  1.1ms  754µs  855µs  
  10s   50027   50027      0 13ms   6.1ms  2.3ms  1.8ms  1.2ms  786µs  874µs  
  11s   50095   50095      0 8.5ms  5.5ms  2.3ms  1.9ms  1.3ms  786µs  872µs  
  12s   50023   50023      0 6.6ms  4.2ms  2.2ms  1.8ms  1.3ms  786µs  859µs  ^C

But with v0.1.4+ it gets OOMKilled in couple of seconds on VM with ~5Gb RAM:

[root@ip-10-0-0-120 scylla-bench-0.1.4]# /$HOME/go/bin/scylla-bench -workload=timeseries -mode=write -replication-factor=3 -partition-count=400 -clustering-row-count=10000000 -clustering-row-size=200 -concurrency=100 -rows-per-request=100 -start-timestamp=1643207081874298013 -connection-count 100 -max-rate 50000 --timeout 120s -duration=1m -error-at-row-limit 1000 -nodes 172.20.12.159 -hdr-latency-units us
Configuration
Mode:                    write
Workload:                timeseries
Timeout:                 2m0s
Consistency level:       quorum
Partition count:         400
Clustering rows:         10000000
Clustering row size:     Fixed(200)
Rows per request:        100
Page size:               1000
Concurrency:             100
Connections:             100
Maximum rate:            50000 op/s
Client compression:      true
Start timestamp:         1643207081874298013
Write rate:              125

 time   ops/s  rows/s errors max    99.9th 99th   95th   90th   median mean   
 1.1s   50000   50000      0 42ms   31ms   3.4ms  1.6ms  1.1ms  750µs  939µs  
 2.2s   51043   51043      0 40ms   13ms   3.6ms  1.5ms  1.1ms  750µs  876µs  
 4.1s   49262   49262      0 370ms  90ms   6.8ms  2.1ms  1.8ms  776µs  1.3ms  
 4.9s   57742   57742      0 339ms  96ms   24ms   2.2ms  1.5ms  896µs  1.6ms  
 5.4s   50665   50665      0 28ms   10ms   3.1ms  1.5ms  1.1ms  750µs  874µs  Killed
[root@ip-10-0-0-120 scylla-bench-0.1.4]# dmesg -T | egrep -i 'killed process'
...
[Wed Jan 26 16:58:22 2022] Memory cgroup out of memory: Killed process 10219 (scylla-bench) total-vm:16714348kB, anon-rss:5761060kB, file-rss:852kB, shmem-rss:0kB, UID:0 pgtables:13616kB oom_score_adj:-997

Note that here we have "microseconds" interval. If we set nano then we get this:

[root@ip-10-0-0-120 scylla-bench-0.1.4]# /$HOME/go/bin/scylla-bench -workload=timeseries -mode=write -replication-factor=3 -partition-count=400 -clustering-row-count=10000000 -clustering-row-size=200 -concurrency=100 -rows-per-request=100 -start-timestamp=1643207081874298013 -connection-count 100 -max-rate 50000 --timeout 120s -duration=1m -error-at-row-limit 1000 -nodes 172.20.12.159 -hdr-latency-units ns
Configuration
Mode:                    write
Workload:                timeseries
Timeout:                 2m0s
Consistency level:       quorum
Partition count:         400
Clustering rows:         10000000
Clustering row size:     Fixed(200)
Rows per request:        100
Page size:               1000
Concurrency:             100
Connections:             100
Maximum rate:            50000 op/s
Client compression:      true
Start timestamp:         1643207081874298013
Write rate:              125

 time   ops/s  rows/s errors max    99.9th 99th   95th   90th   median mean   Killed
[root@ip-10-0-0-120 scylla-bench-0.1.4]# dmesg -T | egrep -i 'killed process'
...
[Wed Jan 26 16:59:57 2022] Memory cgroup out of memory: Killed process 11291 (scylla-bench) total-vm:15690096kB, anon-rss:5762988kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:11536kB oom_score_adj:-997

And if we set ms then it starts working:

[root@ip-10-0-0-120 scylla-bench-0.1.4]# /$HOME/go/bin/scylla-bench -workload=timeseries -mode=write -replication-factor=3 -partition-count=400 -clustering-row-count=10000000 -clustering-row-size=200 -concurrency=100 -rows-per-request=100 -start-timestamp=1643207081874298013 -connection-count 100 -max-rate 50000 --timeout 120s -duration=1m -error-at-row-limit 1000 -nodes 172.20.12.159 -hdr-latency-units ms
Configuration
Mode:                    write
Workload:                timeseries
Timeout:                 2m0s
Consistency level:       quorum
Partition count:         400
Clustering rows:         10000000
Clustering row size:     Fixed(200)
Rows per request:        100
Page size:               1000
Concurrency:             100
Connections:             100
Maximum rate:            50000 op/s
Client compression:      true
Start timestamp:         1643207081874298013
Write rate:              125

 time   ops/s  rows/s errors max    99.9th 99th   95th   90th   median mean   
   1s   50000   50000      0 26ms   10ms   2ms    1ms    1ms    0s     144µs  
 2.1s   50146   50146      0 91ms   31ms   3ms    1ms    1ms    0s     246µs  
 3.1s   52051   52051      0 81ms   27ms   4ms    1ms    1ms    0s     349µs  
 4.1s   49184   49184      0 28ms   12ms   3ms    1ms    1ms    0s     237µs  
 5.1s   51053   51053      0 28ms   24ms   3ms    1ms    1ms    0s     235µs  
 6.1s   50060   50060      0 6ms    4ms    2ms    1ms    1ms    0s     163µs  
 7.1s   50009   50009      0 9ms    5ms    2ms    1ms    1ms    0s     154µs  
 8.1s   49863   49863      0 12ms   7ms    3ms    2ms    1ms    0s     249µs  
 9.1s   50311   50311      0 27ms   24ms   3ms    2ms    1ms    0s     289µs  
10.1s   50089   50089      0 10ms   6ms    2ms    1ms    1ms    0s     219µs  
11.1s   50048   50048      0 13ms   5ms    2ms    1ms    1ms    0s     170µs  
12.1s   48885   48885      0 29ms   13ms   4ms    1ms    1ms    0s     273µs  
13.1s   51379   51379      0 25ms   18ms   3ms    1ms    1ms    0s     258µs  
14.1s   50052   50052      0 7ms    4ms    2ms    1ms    1ms    0s     164µs  
15.1s   50062   50062      0 5ms    4ms    2ms    1ms    1ms    0s     161µs  
16.1s   50054   50054      0 8ms    6ms    2ms    1ms    1ms    0s     198µs  
17.1s   50060   50060      0 12ms   4ms    2ms    1ms    1ms    0s     182µs  
18.1s   50058   50058      0 9ms    6ms    2ms    1ms    1ms    0s     191µs  
19.1s   48738   48738      0 45ms   19ms   4ms    1ms    1ms    0s     253µs  
20.1s   51703   51703      0 46ms   20ms   4ms    1ms    1ms    0s     280µs  
21.1s   49908   49908      0 13ms   6ms    2ms    1ms    1ms    0s     180µs  
22.1s   50209   50209      0 13ms   8ms    3ms    1ms    1ms    0s     236µs  
23.1s   50040   50040      0 21ms   6ms    2ms    1ms    1ms    0s     200µs  
24.1s   50099   50099      0 9ms    5ms    2ms    1ms    1ms    0s     178µs  
25.1s   50040   50040      0 6ms    4ms    2ms    1ms    1ms    0s     169µs  
26.1s   50081   50081      0 7ms    4ms    2ms    1ms    1ms    0s     168µs  
27.1s   50057   50057      0 8ms    5ms    2ms    1ms    1ms    0s     167µs  
28.1s   50071   50071      0 7ms    5ms    3ms    1ms    1ms    0s     187µs  
29.1s   50057   50057      0 9ms    5ms    2ms    1ms    1ms    0s     182µs  
30.1s   50082   50082      0 13ms   5ms    2ms    1ms    1ms    0s     176µs  
31.1s   50048   50048      0 9ms    5ms    2ms    1ms    1ms    0s     179µs  
32.1s   50086   50086      0 11ms   5ms    3ms    1ms    1ms    0s     209µs  
33.1s   50053   50053      0 17ms   5ms    2ms    1ms    1ms    0s     196µs  
34.1s   50107   50107      0 13ms   6ms    2ms    1ms    1ms    0s     174µs  
35.2s   50026   50026      0 8ms    4ms    2ms    1ms    1ms    0s     190µs  
36.2s   50101   50101      0 10ms   6ms    2ms    1ms    1ms    0s     194µs  
37.2s   50087   50087      0 14ms   4ms    2ms    1ms    1ms    0s     172µs  
38.2s   50075   50075      0 9ms    6ms    2ms    1ms    1ms    0s     192µs  
39.2s   50048   50048      0 10ms   5ms    2ms    1ms    1ms    0s     182µs  
40.2s   50092   50092      0 8ms    4ms    2ms    1ms    1ms    0s     182µs  
41.2s   50056   50056      0 6ms    4ms    2ms    1ms    1ms    0s     178µs  
42.2s   50074   50074      0 6ms    5ms    2ms    1ms    1ms    0s     188µs  
43.2s   50064   50064      0 7ms    4ms    2ms    1ms    1ms    0s     159µs  
44.2s   50068   50068      0 7ms    5ms    2ms    1ms    1ms    0s     198µs  
45.2s   50072   50072      0 9ms    5ms    2ms    1ms    1ms    0s     183µs  
46.2s   50070   50070      0 7ms    5ms    2ms    1ms    1ms    0s     143µs  
47.2s   50041   50041      0 9ms    5ms    2ms    1ms    1ms    0s     160µs  
48.2s   50080   50080      0 10ms   5ms    2ms    1ms    1ms    0s     209µs  
49.2s   50065   50065      0 11ms   5ms    2ms    1ms    1ms    0s     168µs  
50.2s   50072   50072      0 8ms    6ms    2ms    1ms    1ms    0s     182µs  
51.2s   50059   50059      0 7ms    5ms    2ms    1ms    1ms    0s     168µs  
52.2s   50052   50052      0 8ms    5ms    2ms    1ms    1ms    0s     167µs  
53.2s   50050   50050      0 7ms    4ms    2ms    1ms    1ms    0s     198µs  
54.2s   49891   49891      0 13ms   7ms    2ms    1ms    1ms    0s     203µs  
55.2s   50258   50258      0 11ms   8ms    2ms    1ms    1ms    0s     220µs  
56.2s   49804   49804      0 20ms   7ms    3ms    1ms    1ms    0s     224µs  
57.2s   50289   50289      0 14ms   7ms    2ms    1ms    1ms    0s     216µs  
58.2s   50075   50075      0 12ms   5ms    2ms    1ms    1ms    0s     184µs  
59.2s   50072   50072      0 12ms   5ms    2ms    1ms    1ms    0s     174µs  
Results
Time (avg):      59.975812349s
Total ops:       2998700
Total rows:      2998604
Operations/s:    49998.489099784914
Rows/s:          49996.8884544678
raw latency :
  max:           91ns 
  99.9th:        7ms 
  99th:          2ms 
  95th:          1ms 
  90th:          1ms 
  median:        0s 
  mean:          198.092µs
c-o fixed latency :
  max:           126ns 
  99.9th:        81ms 
  99th:          17ms 
  95th:          2ms 
  90th:          1ms 
  median:        1ms 
  mean:          1.351329ms

So, something was missing from performance point of view in the PR with it's implementation: https://github.com/scylladb/scylla-bench/pull/75

vponomaryov commented 2 years ago

New version v0.1.7 improved the situation but still fails:

[root@ip-10-0-0-197 /]# cd scylla-bench-0.1.3
[root@ip-10-0-0-197 scylla-bench-0.1.3]# GO111MODULE=on go install .
go: downloading github.com/HdrHistogram/hdrhistogram-go v1.1.0
[root@ip-10-0-0-197 scylla-bench-0.1.3]# /$HOME/go/bin/scylla-bench -workload=timeseries -mode=write -replication-factor=3 -partition-count=400 -clustering-row-count=10000000 -clustering-row-size=200 -concurrency=100 -rows-per-request=100 -start-timestamp=1643207081874298013 -connection-count 100 -max-rate 50000 --timeout 120s -duration=10s -error-at-row-limit 1000 -nodes 172.20.175.50 
Configuration
Mode:                    write
Workload:                timeseries
Timeout:                 2m0s
Consistency level:       quorum
Partition count:         400
Clustering rows:         10000000
Clustering row size:     Fixed(200)
Rows per request:        100
Page size:               1000
Concurrency:             100
Connections:             100
Maximum rate:            50000 op/s
Client compression:      true
Start timestamp:         1643207081874298013
Write rate:              125

 time   ops/s  rows/s errors max    99.9th 99th   95th   90th   median mean   
   1s   49991   49991      0 8.2ms  5.4ms  2.2ms  1.1ms  950µs  754µs  797µs  
   2s   50091   50091      0 7.2ms  4.5ms  2.2ms  1.2ms  950µs  754µs  798µs  
   3s   50083   50083      0 8.6ms  6.2ms  2.6ms  1.4ms  950µs  754µs  819µs  
   4s   50090   50090      0 17ms   5.7ms  2.3ms  1.7ms  1ms    754µs  831µs  
   5s   50090   50090      0 7ms    5.4ms  2.5ms  1.7ms  1ms    754µs  837µs  
   6s   50060   50060      0 8.1ms  4ms    2.1ms  1.1ms  950µs  754µs  792µs  
   7s   50080   50080      0 9.3ms  7.4ms  2.2ms  1.6ms  1ms    754µs  821µs  
   8s   50049   50049      0 9ms    4.4ms  2.2ms  1.6ms  983µs  754µs  814µs  
   9s   50083   50083      0 11ms   4.9ms  2.3ms  1.6ms  983µs  754µs  816µs  
Results
Time (avg):      9.99690805s
Total ops:       499797
Total rows:      499797
Operations/s:    49995.15846389373
Rows/s:          49995.15846389373
raw latency :
  max:           17.399807ms 
  99.9th:        5.373951ms 
  99th:          2.228223ms 
  95th:          1.540095ms 
  90th:          983.039µs 
  median:        753.663µs 
  mean:          814.322µs
c-o fixed latency :
  max:           17.399807ms 
  99.9th:        6.979583ms 
  99th:          3.440639ms 
  95th:          2.064383ms 
  90th:          1.802239ms 
  median:        1.310719ms 
  mean:          1.345991ms
[root@ip-10-0-0-197 scylla-bench-0.1.3]# cd ..
[root@ip-10-0-0-197 /]# cd scylla-bench-0.1.4
[root@ip-10-0-0-197 scylla-bench-0.1.4]# GO111MODULE=on go install .
[root@ip-10-0-0-197 scylla-bench-0.1.4]# /$HOME/go/bin/scylla-bench -workload=timeseries -mode=write -replication-factor=3 -partition-count=400 -clustering-row-count=10000000 -clustering-row-size=200 -concurrency=100 -rows-per-request=100 -start-timestamp=1643207081874298013 -connection-count 100 -max-rate 50000 --timeout 120s -duration=10s -error-at-row-limit 1000 -nodes 172.20.175.50
Configuration
Mode:                    write
Workload:                timeseries
Timeout:                 2m0s
Consistency level:       quorum
Partition count:         400
Clustering rows:         10000000
Clustering row size:     Fixed(200)
Rows per request:        100
Page size:               1000
Concurrency:             100
Connections:             100
Maximum rate:            50000 op/s
Client compression:      true
Start timestamp:         1643207081874298013
Write rate:              125

 time   ops/s  rows/s errors max    99.9th 99th   95th   90th   median mean   Killed
[root@ip-10-0-0-197 scylla-bench-0.1.4]# cd ..
[root@ip-10-0-0-197 /]# cd scylla-bench-0.1.7
[root@ip-10-0-0-197 scylla-bench-0.1.7]# GO111MODULE=on go install .
go: downloading github.com/scylladb/gocql v1.5.0
[root@ip-10-0-0-197 scylla-bench-0.1.7]# /$HOME/go/bin/scylla-bench -workload=timeseries -mode=write -replication-factor=3 -partition-count=400 -clustering-row-count=10000000 -clustering-row-size=200 -concurrency=100 -rows-per-request=100 -start-timestamp=1643207081874298013 -connection-count 100 -max-rate 50000 --timeout 120s -duration=10s -error-at-row-limit 1000 -nodes 172.20.175.50
Configuration
Mode:                    write
Workload:                timeseries
Timeout:                 2m0s
Consistency level:       quorum
Partition count:         400
Clustering rows:         10000000
Clustering row size:     Fixed(200)
Rows per request:        100
Page size:               1000
Concurrency:             100
Connections:             100
Maximum rate:            50000 op/s
Client compression:      true
Start timestamp:         1643207081874298013
Write rate:              125
Hdr memory consumption:  3355470400 bytes

 time   ops/s  rows/s errors max    99.9th 99th   95th   90th   median mean   
   1s   50000   50000      0 24ms   9ms    2.1ms  1.1ms  1ms    754µs  806µs  
   2s   50343   50343      0 23ms   11ms   3.2ms  1.7ms  1.1ms  754µs  866µs  
   3s   50053   50053      0 9.3ms  6.8ms  2.5ms  1.2ms  1ms    754µs  804µs  
 4.3s   49847   49847      0 214ms  69ms   4ms    1.7ms  1.1ms  754µs  1ms    
 5.3s   51183   51183      0 207ms  80ms   21ms   1.8ms  1.2ms  786µs  1.4ms  
 6.3s   54829   54829      0 93ms   44ms   16ms   1.6ms  1.1ms  786µs  1.2ms  
 7.3s   48432   48432      0 55ms   19ms   4.7ms  1.6ms  1.1ms  754µs  924µs  Killed
dkropachev commented 2 years ago

Added option to control discreteness of hdr histogram - https://github.com/scylladb/scylla-bench/pull/87

dkropachev commented 2 years ago

Should be fixed at https://github.com/scylladb/scylla-bench/pull/87 and https://github.com/scylladb/scylla-bench/pull/85