cockroachdb / pebble

RocksDB/LevelDB inspired key-value database in Go
BSD 3-Clause "New" or "Revised" License
5k stars 460 forks source link

db: investigate unexplained ycsb/F/values=1024 perf improvement #2578

Closed jbowens closed 1 year ago

jbowens commented 1 year ago
Screenshot 2023-05-30 at 5 28 48 PM

We don't currently understand where this improvement came from. We should investigate it, ensure that we're not doing anything silly, and see if there were any regressions fixed that should then be backported.

jbowens commented 1 year ago

I downloaded the artifacts from an arbitrary date before the 16th and an arbitrary date after the 16th. Diffing the heap profiles:

go tool pprof -diff_base ../artifacts\ \(8\)/5.profiles_F/heap.prof -diff_base ../artifacts\ \(8\)/4.profiles_F/heap.prof -diff_base ../artifacts\ \(8\)/3.profiles_F/heap.prof -diff_base ../artifacts\ \(8\)/2.profiles_F/heap.prof -diff_base ../artifacts\ \(8\)/1.profiles_F/heap.prof --normalize ../artifacts\ \(7\)/4.profiles_F/heap.prof ../artifacts\ \(7\)/3.profiles_F/heap.prof ../artifacts\ \(7\)/2.profiles_F/heap.prof ../artifacts\ \(7\)/1.profiles_F/heap.prof
File: pebble
Build ID: 671855946c18cea2728f4e13a9784294e95304ed
Type: inuse_space
Time: May 11, 2023 at 10:22am (EDT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) alloc_objects
(pprof) top20
Showing nodes accounting for 18045094, 25.47% of 70839746 total
Showing top 20 nodes out of 577
      flat  flat%   sum%        cum   cum%
  44212530 62.41% 62.41%   44212530 62.41%  github.com/cockroachdb/pebble/internal/manifest.(*L0Sublevels).addFileToSublevels
  -3660821  5.17% 57.24%   -3660821  5.17%  github.com/cockroachdb/pebble/sstable.(*Properties).saveUvarint (inline)
  -2458646  3.47% 53.77%   -2458646  3.47%  github.com/cockroachdb/pebble/sstable.(*rawBlockIter).readEntry
  -1873472  2.64% 51.13%   -1873472  2.64%  github.com/HdrHistogram/hdrhistogram-go.New
  -1593315  2.25% 48.88%   -1593315  2.25%  os.newFile
  -1548084  2.19% 46.69%   -4180215  5.90%  github.com/cockroachdb/pebble/sstable.(*Properties).load
  -1501729  2.12% 44.57%   -1501729  2.12%  strings.(*Builder).grow (inline)
  -1478203  2.09% 42.49%   -1478203  2.09%  github.com/cockroachdb/pebble/sstable.(*Properties).saveString (inline)
  -1324987  1.87% 40.62%   -1324987  1.87%  runtime/pprof.allFrames
  -1242971  1.75% 38.86%   -1242971  1.75%  syscall.ByteSliceFromString
  -1170309  1.65% 37.21%   -1193454  1.68%  fmt.Sprintf
  -1152856  1.63% 35.58%   -4961799  7.00%  github.com/cockroachdb/pebble/internal/base.MakeFilepath
  -1112943  1.57% 34.01%   -2307214  3.26%  github.com/cockroachdb/pebble/internal/base.MakeFilename
  -1007324  1.42% 32.59%   -7265232 10.26%  github.com/cockroachdb/pebble/sstable.(*Properties).save
   -982504  1.39% 31.20%   -1282321  1.81%  runtime/pprof.WithLabels
   -918369  1.30% 29.91%    -918369  1.30%  github.com/cockroachdb/pebble/vfs.wrapOSFileImpl
   -869147  1.23% 28.68%   -1441908  2.04%  github.com/cockroachdb/pebble/sstable.(*rawBlockWriter).add
   -828034  1.17% 27.51%   -9661022 13.64%  github.com/cockroachdb/pebble.(*tableCacheShard).findNode
   -812471  1.15% 26.36%  -30127363 42.53%  github.com/cockroachdb/pebble.(*DB).runCompaction
   -631251  0.89% 25.47%    -631251  0.89%  github.com/cockroachdb/pebble/internal/manifest.overlaps

Not sure what to make of that.

jbowens commented 1 year ago

At the end of an old run:

__level_____count____size___score______in__ingest(sz_cnt)____move(sz_cnt)___write(sz_cnt)____read___r-amp___w-amp
    WAL         1    51 M       -   148 G       -       -       -       -   149 G       -       -       -     1.0
      0      1684   1.8 G    1.97   149 G     0 B       0     0 B       0   146 G   112 K     0 B      33     1.0
      1         0     0 B    0.00     0 B     0 B       0     0 B       0     0 B       0     0 B       0     0.0
      2       277   1.0 G    2.17    85 G     0 B       0     0 B       0   158 G    40 K   158 G       1     1.8
      3       695   3.3 G    2.17   143 G     0 B       0   3.6 G   1.1 K   293 G    68 K   293 G       1     2.1
      4      1177    10 G    2.17   138 G     0 B       0   5.6 G   2.0 K   346 G    54 K   346 G       1     2.5
      5      1974    33 G    2.17   130 G     0 B       0   3.3 G   1.1 K   349 G    29 K   349 G       1     2.7
      6      2395   106 G       -   102 G     0 B       0     0 B       0   274 G   9.4 K   275 G       1     2.7
  total      8202   156 G       -   149 G     0 B       0    12 G   4.2 K   1.7 T   313 K   1.4 T      38    11.5
  flush      2415                             0 B       0       0  (ingest = tables-ingested, move = ingested-as-flushable)
compact     71636   151 G   236 M       3                          (size == estimated-debt, score = in-progress-bytes, in = num-in-progress)
  ctype     67388       0       0    4248       0       0       0  (default, delete, elision, move, read, rewrite, multi-level)
 memtbl         1    64 M
zmemtbl         0     0 B
   ztbl         0     0 B
 bcache     3.0 K    89 M    0.6%  (score == hit-rate)
 tcache         6   4.3 K   50.2%  (score == hit-rate)
  snaps         0       -       0  (score == earliest seq num)
 titers         4
 filter         -       -    0.0%  (score == utility)

At the end of a recent run:

__level_____count____size___score______in__ingest(sz_cnt)____move(sz_cnt)___write(sz_cnt)____read___r-amp___w-amp
    WAL         3   149 M       -   183 G       -       -       -       -   185 G       -       -       -     1.0
      0     42763    46 G    4.58   185 G     0 B       0     0 B       0   182 G   164 K   894 M     682     1.0
      1         0     0 B    0.00     0 B     0 B       0     0 B       0     0 B       0     0 B       0     0.0
      2      1801   9.2 G    4.58    78 G     0 B       0    36 K       1   522 G   109 K   522 G       1     6.7
      3      2603    15 G    4.58   123 G     0 B       0   5.3 G   1.5 K   239 G    52 K   239 G       1     1.9
      4      2366    23 G    4.58   106 G     0 B       0   8.1 G   2.2 K   239 G    38 K   239 G       1     2.3
      5      2156    37 G    4.37    88 G     0 B       0   3.8 G   1.1 K   197 G    18 K   198 G       1     2.2
      6      2042    60 G       -    57 G     0 B       0   2.0 K       1   135 G   5.2 K   136 G       1     2.4
  total     53731   190 G       -   185 G     0 B       0    17 G   4.9 K   1.7 T   385 K   1.3 T     687     9.2
  flush      2892                             0 B       0       0  (ingest = tables-ingested, move = ingested-as-flushable)
compact     48354   494 G   496 M       3                          (size == estimated-debt, score = in-progress-bytes, in = num-in-progress)
  ctype     43480       0       0    4874       0       0       0  (default, delete, elision, move, read, rewrite, multi-level)
 memtbl         3   192 M
zmemtbl         0     0 B
   ztbl         0     0 B
 bcache      17 K   497 M    0.7%  (score == hit-rate)
 tcache       284   209 K   50.2%  (score == hit-rate)
  snaps         0       -       0  (score == earliest seq num)
 titers        19
 filter         -       -    0.0%  (score == utility)
 ingest         0

Looks like we're "cheating" allowing much more accumulation of compaction debt. That explains https://github.com/cockroachdb/pebble/issues/2578#issuecomment-1572439183. Still don't understand why

jbowens commented 1 year ago

This is the result of some sort of infrastructure change (slower disks maybe?). I re-ran the benchmark for a SHA from the beginning of May ( 5271a3c047 ):

1.ycsb_F.log:Benchmarkycsb/F/values=1024 186478597  34533.1 ops/sec  1413026152369 read  1814606526123 write  0.00 r-amp  8.91 w-amp
2.ycsb_F.log:Benchmarkycsb/F/values=1024 184113062  34095.0 ops/sec  1427230330097 read  1826109239561 write  0.00 r-amp  8.97 w-amp
3.ycsb_F.log:Benchmarkycsb/F/values=1024 180986452  33516.0 ops/sec  1433865066344 read  1823183619471 write  0.00 r-amp  9.24 w-amp
4.ycsb_F.log:Benchmarkycsb/F/values=1024 187536888  34729.0 ops/sec  1416099895729 read  1821872470357 write  0.00 r-amp  8.80 w-amp
5.ycsb_F.log:Benchmarkycsb/F/values=1024 180320385  33392.7 ops/sec  1440856033309 read  1829040634218 write  0.00 r-amp  9.29 w-amp

I'll add a benchmark annotation and close this out.

jbowens commented 1 year ago

Running the older SHA with older versions of roachprod/roachtest produced the old results, so it does appear that something's changed in CockroachDB tooling. I'll create an issue in the Cockroach repository.