Closed vweevers closed 2 years ago
Initial batch benchmarks (on memory-level) look good. If you're not using hooks or events, the hooks branch of abstract-level is faster than main. If you are using events, db.batch() with a write event listener is 3-4% slower than db.batch() with a batch event listener (on either branch). That's fair, since the write event has more data.
db.put() performance is good too (after 75c75e2). The hooks branch is faster than the main branch if no events are used. As expected, it becomes slower when you use events or prehooks. In the table below, events.put=1 means the benchmark had one listener for the put event. Similarly, hooks.prewrite=1 means one prewrite hook function, and hooks.prewrite=100 means it did:
for (let i = 0; i < 100; i++) {
db.hooks.prewrite.add(function () {})
}
$ level-bench plot put
benchmark put on memory-level@1.0.0 win32 x64
node@16.9.1 n=1M concurrency=4 valueSize=100B keys=random values=random
1 memory-level#hooks 36588 ops/s ±8.51% fastest
2 memory-level#main 35837 ops/s ±8.12% +1.70%
3 memory-level#hooks hooks.prewrite=1 35286 ops/s ±6.51% +1.75%
4 memory-level#main events.put=1 35508 ops/s ±7.47% +2.01%
5 memory-level#hooks events.write=1 34692 ops/s ±6.83% +3.69%
6 memory-level#hooks hooks.prewrite=100 34302 ops/s ±8.28% +6.05%
In classic-level, adding a prewrite hook function has a bigger effect. That's not a blocker for this PR, but we may want to look into optimizing batches at some point.
$ level-bench plot put
benchmark put on classic-level@1.2.0 win32 x64
node@16.9.1 n=1M concurrency=4 valueSize=100B keys=random values=random
1 classic-level#main 30548 ops/s ±7.59% fastest
2 classic-level#hooks 30424 ops/s ±7.32% +0.16%
3 classic-level#hooks hooks.prewrite=1 28070 ops/s ±7.55% +8.08%
There's one remaining issue to fix (or not). If you do:
const data = db.sublevel('data')
const users = data.sublevel('users')
data.on('write', function (ops) {
const wrongKey = ops[0].key
})
await data.batch().del('alice', { sublevel: users }).write()
Then the wrongKey emitted by the data sublevel is !data!!users!alice rather than !users!alice. This is a result of how sublevels work in general and I don't yet have a solution.
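To make the two keys above concrete, here is a minimal sketch of how sublevel prefixes compose. It is a toy model, not the abstract-level implementation: absoluteKey and relativeKey are hypothetical helpers, and the only thing taken from the discussion is the !name! prefix format.

```javascript
// Toy model of sublevel key prefixes; not the abstract-level implementation.
// A path is the list of sublevel names from the root database down.
function absoluteKey (path, key) {
  return path.map(name => `!${name}!`).join('') + key
}

// Express a root-level key relative to an ancestor sublevel
// by stripping that ancestor's own prefix (hypothetical helper).
function relativeKey (ancestorPath, rootKey) {
  const prefix = absoluteKey(ancestorPath, '')

  if (!rootKey.startsWith(prefix)) {
    throw new Error('Key does not belong to this sublevel')
  }

  return rootKey.slice(prefix.length)
}

// Key as stored in the root database
const rootKey = absoluteKey(['data', 'users'], 'alice')

// Key as a listener on the data sublevel would expect to see it
const keyForData = relativeKey(['data'], rootKey)
```

In this model the problem described above amounts to the data sublevel emitting rootKey where keyForData was expected.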
I have a solution and a PoC implementation, but it'll hurt performance for nested sublevels. Given users = db.sublevel('data').sublevel('users'), instead of users forwarding its operations directly to db, it'll forward to the data sublevel, which in turn forwards to db. I.e. users.batch([]) calls data.batch([]) which calls db.batch([]).
I have to benchmark that and see what tweaks can be made, but even if performance is significantly worse (and I think it will be) it might be worth it, because it benefits both events and hooks: users.batch([]) would trigger the prewrite hook of users, then of data, then of db. Same for the write event. So, no matter what kind of database you have (sublevel or not, nested or not) it works the same, which should benefit modularity.
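The forwarding chain described above can be sketched with a toy class. This is an illustration only: ToyDb, prewriteHooks and writeListeners are hypothetical names, not the abstract-level API, and emitting the write listeners after forwarding is an assumption of this sketch (the discussion only states the order for prewrite hooks).

```javascript
// Toy model of "forward to the parent" sublevels; not the abstract-level implementation.
class ToyDb {
  constructor (name, parent = null) {
    this.name = name
    this.parent = parent
    this.prewriteHooks = []
    this.writeListeners = []
  }

  sublevel (name) {
    return new ToyDb(name, this)
  }

  batch (ops) {
    // Run this level's own prewrite hook functions first
    for (const op of ops) {
      for (const hook of this.prewriteHooks) hook(op)
    }

    // Then forward to the parent, with keys prefixed so they are relative to it
    if (this.parent !== null) {
      const prefixed = ops.map(op => ({ ...op, key: `!${this.name}!${op.key}` }))
      this.parent.batch(prefixed)
    }

    // Each level notifies its listeners with keys relative to itself
    // (assumption: listeners fire once the parent has handled the write)
    for (const listener of this.writeListeners) listener(ops)
  }
}

const db = new ToyDb('db')
const data = db.sublevel('data')
const users = data.sublevel('users')

const hookOrder = []
const seenKeys = {}

for (const level of [db, data, users]) {
  level.prewriteHooks.push(() => hookOrder.push(level.name))
  level.writeListeners.push(ops => { seenKeys[level.name] = ops[0].key })
}

users.batch([{ type: 'del', key: 'alice' }])
```

After the call, hookOrder lists users, data, db in that order, and each level has seen the key relative to itself, which is the uniform behavior the comment is arguing for.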
It would make this PR semver-major, for two reasons:
- [...] db.sublevel(['data', 'users']) to give users the ability to negate it.
- [...] sublevel option that isn't a descendant. So given a = db.sublevel('a') and b = db.sublevel('b') you can no longer do b.batch().del('1', { sublevel: a }).

In which case, I might just remove the batch, put and del events rather than deprecating them.
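The descendant restriction mentioned above could be checked along these lines. This is a sketch under assumptions: sublevels are represented as arrays of names from the root, isDescendant is a hypothetical helper, and treating a database as not being its own descendant is a choice made here, not something the PR states.

```javascript
// Hypothetical check; paths are arrays of sublevel names starting at the root database.
function isDescendant (candidatePath, ancestorPath) {
  // A descendant is strictly deeper and shares the ancestor's full path as a prefix
  return candidatePath.length > ancestorPath.length &&
    ancestorPath.every((name, i) => candidatePath[i] === name)
}

const dataPath = ['data']
const usersPath = ['data', 'users']
const aPath = ['a']
const bPath = ['b']

// data.batch().del('alice', { sublevel: users }) stays valid
const nestedAllowed = isDescendant(usersPath, dataPath)

// b.batch().del('1', { sublevel: a }) would no longer be valid
const siblingAllowed = isDescendant(aPath, bPath)
```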
@juliangruber @ralphtheninja any objections? The batch, put and del events are 10 years old, so I don't want to take removing them lightly.
> I have to benchmark that
Results for db.put() on memory-level, comparing no sublevel, 1 sublevel (!foo!), 2 sublevels (!foo!!bar!) and more:
At a depth of 2 sublevels, the difference between main and hooks is negligible. But it gets progressively worse the deeper you go. That's partially explained by having to copy longer prefixes, but main has a more consistent performance between sublevel depths.
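As a rough illustration of why depth matters when each level forwards to its parent, the toy model below counts how often a key is copied. It is not a benchmark and not the abstract-level implementation; forwardNested and createFlat are hypothetical names, and the flat variant assumes that an array form such as db.sublevel(['data', 'users']) lets the full prefix be computed once.

```javascript
// Toy cost model; counts string concatenations per write, nothing more.
const prefixOf = (name) => `!${name}!`

// Nested forwarding: every level prepends its own prefix,
// so the key is copied once per level of depth.
function forwardNested (names, key) {
  let copies = 0

  for (const name of [...names].reverse()) {
    key = prefixOf(name) + key
    copies++
  }

  return { key, copies }
}

// Flat: the combined prefix is built once up front,
// so each write needs a single concatenation regardless of depth.
function createFlat (names) {
  const prefix = names.map(prefixOf).join('')
  return (key) => ({ key: prefix + key, copies: 1 })
}

const nested = forwardNested(['foo', 'bar'], 'a')
const flat = createFlat(['foo', 'bar'])('a')
```

Both variants produce the same final key; only the number of intermediate copies differs, and in the nested case it grows with depth.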
With support of db.sublevel(['foo', 'bar']) (marked by flat below) we can recover:
I've created a v2 branch as the new base for this PR. That allows me to move ahead with items of https://github.com/Level/abstract-level/issues/47.
Sorry @vweevers, I don't have time to review this ATM :|
OK! Thanks for letting me know. FWIW I'll probably mark the hooks API as experimental (before v2 goes out the door) so there will be room for changes.
Adds postopen, prewrite and newsub hooks that allow userland "hook functions" to customize behavior of the database. See README for details. A quick example:
More generally, this is a move towards "renewed modularity". Our ecosystem is old and many modules no longer work because they had no choice but to monkeypatch database methods, whose signatures have changed since then.
So in addition to hooks, this:
- [...] write event that is emitted on db.batch(), db.put() and db.del() and has richer data: userland options, encoded data, keyEncoding and valueEncoding. The batch, put and del events are now deprecated and will be removed in a future version. Related to Level/level#222.
- [...] db.batch(ops, options) to ops, allowing for code like db.batch(ops, { ttl: 123 }) to apply a default userland ttl option to all ops.

No breaking changes, yet. Using hooks means opting-in to new behaviors (like the new write event) and disables some old behaviors (like the deprecated events). Later on we can make those the default behavior, regardless of whether hooks are used.
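The db.batch(ops, { ttl: 123 }) behavior described above can be pictured as a merge of batch-level options into each operation. The sketch below is an assumption about the semantics, not the actual implementation: applyBatchOptions is a hypothetical helper, and letting a per-operation value win over the batch-level default is a guess.

```javascript
// Hypothetical helper; merges batch-level options into each operation.
// Assumption: a property set on the operation itself takes precedence.
function applyBatchOptions (ops, options = {}) {
  return ops.map(op => ({ ...options, ...op }))
}

const merged = applyBatchOptions([
  { type: 'put', key: 'a', value: '1' },
  { type: 'put', key: 'b', value: '2', ttl: 5 }
], { ttl: 123 })
```

Here the first operation picks up the default ttl, while the second keeps its own.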
TODO:
- [...] batch argument of prewrite hook function
- [...] memory-level
- [...] classic-level
Closes https://github.com/Level/community/issues/44.