Level / abstract-level

Abstract class for a lexicographically sorted key-value database.

Benchmark against level #4

Closed vweevers closed 2 years ago

vweevers commented 3 years ago

Compare:

A quick benchmark of reads and writes is enough; it's just to check that performance is equal or better.
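
The shape of such a quick benchmark can be as simple as the sketch below. This is a hypothetical illustration rather than the script actually used here; it assumes the promise-based put() and get() that both the old (levelup-based) and new (abstract-level) databases expose.

```js
// Time N sequential writes and reads; run against both modules and compare.
// Hypothetical sketch, not the actual benchmark code from this issue.
const N = 100000

async function bench (db) {
  console.time('write')
  for (let i = 0; i < N; i++) {
    await db.put('key' + String(i).padStart(8, '0'), 'value' + i)
  }
  console.timeEnd('write')

  console.time('read')
  for (let i = 0; i < N; i++) {
    await db.get('key' + String(i).padStart(8, '0'))
  }
  console.timeEnd('read')
}
```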

vweevers commented 2 years ago

Puts on level-mem + subleveldown versus memory-level + sublevels, with json encoding. Win.

[benchmark graph: put 1642345597518]
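
Roughly, the two setups compared here look like this. It's a sketch based on the public APIs of these modules; the actual benchmark code isn't shown in this thread.

```js
// Old stack: level-mem wrapped by subleveldown
const levelmem = require('level-mem')
const subleveldown = require('subleveldown')
const oldDb = subleveldown(levelmem(), 'example', { valueEncoding: 'json' })

// New stack: memory-level with built-in sublevels
const { MemoryLevel } = require('memory-level')
const newDb = new MemoryLevel().sublevel('example', { valueEncoding: 'json' })

// The benchmark then times the same put() workload against oldDb and newDb
```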

Puts on level-mem versus memory-level versus memory-level using strings internally. Double win.

[benchmark graph: put 1642346203027]
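
The "using strings internally" variant presumably refers to memory-level's storeEncoding option, which controls whether data is kept internally as binary or as strings. A hedged sketch:

```js
const { MemoryLevel } = require('memory-level')

// Default: data is stored internally as binary
const binaryBacked = new MemoryLevel({ valueEncoding: 'json' })

// Storing data internally as strings instead (the "using strings internally"
// variant above), assuming memory-level's storeEncoding option
const stringBacked = new MemoryLevel({ storeEncoding: 'utf8', valueEncoding: 'json' })
```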

vweevers commented 2 years ago

iterator.next() on level-mem versus memory-level, using json and utf8 valueEncodings. No difference (because the main cost is setImmediate).

[benchmark graph: iterate 1642346920934]

vweevers commented 2 years ago

iterator.next() on level-mem versus iterator.nextv(1000) on memory-level. Not a fair benchmark, but the new nextv() API is an obvious win.

[benchmark graph: iterate 1642347677129]
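
In code, the two iteration styles compared here look roughly like this. The sketch uses the promise form of the abstract-level iterator API throughout; level-mem's iterator is callback-based, but the shape of the loop is the same.

```js
// Old style: one entry per next() call
const it1 = db.iterator()
let entry
while ((entry = await it1.next()) !== undefined) {
  const [key, value] = entry // process one entry
}
await it1.close()

// New style: up to 1000 entries per nextv() call
const it2 = db.iterator()
let entries
while ((entries = await it2.nextv(1000)).length > 0) {
  for (const [key, value] of entries) {
    // process one entry
  }
}
await it2.close()
```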

vweevers commented 2 years ago

iterator.next() on level versus iterator.next() on classic-level. Slower. I reckon that's because I changed the structure of the cache (in short: [entry, entry, ..] instead of [key, value, key, value, ..]) which should make nextv() faster. That'll be difficult to compare fairly.

[benchmark graph: iterate 1642373843003]
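
For illustration, the change in cache layout mentioned above is just the difference between these two shapes (simplified):

```js
// Previous shape (per the comment above): keys and values interleaved
// in one flat array
const flatCache = ['key1', 'value1', 'key2', 'value2']

// New shape: one [key, value] entry per element, which maps directly
// onto what nextv() returns
const entryCache = [['key1', 'value1'], ['key2', 'value2']]
```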

vweevers commented 2 years ago

Batch puts on level-mem versus memory-level. Win.

[benchmark graph: batch-put 1643394415492]
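
The batch workload is the usual batch API, roughly:

```js
// Array form: one batch() call containing many put operations
const ops = []
for (let i = 0; i < 1000; i++) {
  ops.push({ type: 'put', key: 'key' + i, value: 'value' + i })
}
await db.batch(ops)

// Chained form, supported by both level-mem and memory-level
await db.batch().put('a', '1').put('b', '2').write()
```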

vweevers commented 2 years ago

Gets on level-mem versus memory-level. Win.

[benchmark graph: get 1643397969100]

However, memory-level is slower when using a binary valueEncoding. That warrants a closer look.

vweevers commented 2 years ago

> However, memory-level is slower when using a binary valueEncoding. That warrants a closer look.

It's not due to binary. Happens on any encoding when this code path is triggered:

https://github.com/Level/abstract-level/blob/d711af39de3126ee984d57df396a6d084ebfb748/abstract-level.js#L299-L301

V8 has a performance issue with the spread operator when properties are not present. The following "fixes" it:

```js
// Pre-assigning the properties means they already exist by the time the
// object is spread, which avoids the V8 slow path for absent properties
options.keyEncoding = keyFormat
options.valueEncoding = valueFormat
options = { ...options, keyEncoding: keyFormat, valueEncoding: valueFormat }
```

As does using Object.assign() instead of spread:

[benchmark graph: get 1643406383068]
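
The Object.assign() variant of that code path would look roughly like this (a sketch against the snippet linked above; the surrounding code may differ slightly):

```js
// Instead of: options = { ...options, keyEncoding: keyFormat, valueEncoding: valueFormat }
options = Object.assign({}, options, {
  keyEncoding: keyFormat,
  valueEncoding: valueFormat
})
```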

I could switch to Object.assign(), but I still generally prefer the spread operator for being idiomatic (not being vulnerable to prototype pollution could be another argument, but I don't see how that would matter here).

vweevers commented 2 years ago

The same get() performance regression exists on classic-level. Using Object.assign() would fix it.

[benchmark graph: get 1643410282728]

vweevers commented 2 years ago

Quick-and-dirty benchmark of streams, comparing nextv() to next(). Ref https://github.com/Level/community/issues/70 and https://github.com/Level/read-stream/pull/2.

Unrelated to abstract-level, but it's a win.

```
classic-level | using nextv() | took 1775 ms, 563380 ops/sec
classic-level | using nextv() | took 1577 ms, 634115 ops/sec
classic-level | using nextv() | took 1549 ms, 645578 ops/sec
classic-level | using nextv() | took 1480 ms, 675676 ops/sec
classic-level | using nextv() | took 1572 ms, 636132 ops/sec
                                 avg 1591 ms

level         | using next()  | took 1766 ms, 566251 ops/sec
level         | using next()  | took 1776 ms, 563063 ops/sec
level         | using next()  | took 1737 ms, 575705 ops/sec
level         | using next()  | took 1711 ms, 584454 ops/sec
level         | using next()  | took 1729 ms, 578369 ops/sec
                                 avg 1744 ms
```
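
A minimal sketch of what the nextv()-based stream does (the real implementation lives in the Level/read-stream repository linked above; names and details here are illustrative only):

```js
const { Readable } = require('stream')

// Illustrative object-mode stream that fills its buffer with nextv() batches
class EntryStream extends Readable {
  constructor (db, options) {
    super({ objectMode: true, highWaterMark: 1000 })
    this._iterator = db.iterator(options)
  }

  _read (size) {
    this._iterator.nextv(size).then(entries => {
      if (entries.length === 0) {
        this.push(null) // end of the range
      } else {
        for (const [key, value] of entries) {
          this.push({ key, value })
        }
      }
    }, err => {
      this.destroy(err)
    })
  }

  _destroy (err, callback) {
    this._iterator.close().then(() => callback(err), callback)
  }
}
```
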
vweevers commented 2 years ago

Did a better benchmark of streams. This one takes some explaining. In the graph legend below, "byte-hwm" is the highWaterMark on the C++ side, measured in bytes, and "stream-hwm" is the highWaterMark of object-mode streams, measured in number of entries.

That's about half of the explainer needed... In hindsight I wish I hadn't done the abstract-level and nextv() work in parallel. So please allow me to skip straight to conclusions (and later document how a user should tweak their options):

TLDR: we're good. Most importantly, the performance characteristics of streams and iterators did not change, in the sense that an app using smaller or larger values (I used 100 bytes) would not be hurt by upgrading to abstract-level or classic-level. That's because leveldown internally already had two highWaterMark mechanisms; classic-level merely "hoists" one of them up to streams. So if an app has extremely large values, we will not prefetch more items than before, and if an app has small values, we will not prefetch fewer than before. If an app is not using streams, iterators still prefetch (as you can see later when I finally push all the code).

[benchmark graph: stream 1643555843619]
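
In terms of user-facing options, the two knobs correspond roughly to the following. The option names assumed here are highWaterMarkBytes on the classic-level iterator and highWaterMark on the object-mode stream from level-read-stream; see the PR linked in the next comment for the documentation that was eventually written.

```js
// "byte-hwm": how many bytes the C++ side fetches per internal batch
const iterator = db.iterator({ highWaterMarkBytes: 16 * 1024 })

// "stream-hwm": how many entries the object-mode stream buffers
const { EntryStream } = require('level-read-stream')
const stream = new EntryStream(db, { highWaterMark: 1000 })
```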

vweevers commented 2 years ago

> and later document how a user should tweak their options

Done in https://github.com/Level/classic-level/pull/1