Closed adamreichold closed 2 months ago
I'm not seeing a speed-up in https://github.com/cberner/redb/blob/master/benches/lmdb_benchmark.rs
Is there a benchmark where you were able to measure a significant speed-up from this?
> I'm not seeing a speed-up in https://github.com/cberner/redb/blob/master/benches/lmdb_benchmark.rs
>
> Is there a benchmark where you were able to measure a significant speed-up from this?
This was not driven by benchmarks for me. I just stumbled over this while looking something else up in the code.
I would say that there is rarely a reason to use `Arc<Vec<T>>` instead of `Arc<[T]>`, as the `Arc` layer already implies shared access. For redb, these pages are also never resized, so even with temporary mutable access, the unused capacity of the `Vec` is not necessary. So for me, this was strictly a simplification.
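For illustration (this is not code from the PR, just a minimal sketch of the difference being discussed): `Arc<Vec<T>>` adds a second heap indirection and keeps the `Vec`'s header and spare capacity alive, while `Arc<[T]>` stores the elements in the same allocation as the reference counts, at the cost of a fat (pointer + length) `Arc`.

```rust
use std::sync::Arc;

fn main() {
    // Arc<Vec<u8>>: the Arc points to a heap block holding the Vec header,
    // which in turn points to the element buffer (two indirections).
    let nested: Arc<Vec<u8>> = Arc::new(vec![1, 2, 3]);

    // Arc<[u8]>: one allocation holding the refcounts plus the elements;
    // no spare capacity is retained after the conversion.
    let slice: Arc<[u8]> = Arc::from(vec![1, 2, 3]);

    assert_eq!(&nested[..], &slice[..]);

    // The Arc<[u8]> handle itself is a fat pointer (data pointer + length),
    // twice the size of the thin Arc<Vec<u8>> handle.
    assert_eq!(
        std::mem::size_of::<Arc<[u8]>>(),
        2 * std::mem::size_of::<Arc<Vec<u8>>>()
    );
    println!("ok");
}
```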
Of course, the situation is a bit more complex here, as the `WritablePage` depends on mutable access through the `Arc`, so I would agree that it is not a trivial change.
So if there are benchmark changes, I would expect them to be positive, but this was not my motivation. I would sort any performance changes under "distributed fat" as I would expect increased TLB/cache pressure from the additional indirection and possible heap fragmentation.
Got it, I see. The first commit looks good then, but I don't want to add the `unsafe` commit unless there's a really clear performance benefit. I'd like to remove the last of the unsafe code from redb, but need to get file lock support into `std` to replace the `libc` usage.
Want to remove the second commit, and I'll merge the first?
> Want to remove the second commit, and I'll merge the first?
I did put that into a separate commit to make it easy to drop. I added it at all because `Arc::get_mut` is then called for each mutable page access, and I feared that this would actually impact performance. Did you already run the benchmark without the last commit? If so, I'll just drop it. Otherwise, I'll do that, and if it has an impact I'll try to figure out something to avoid it without unsafe.
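To make the concern concrete, here is a hedged sketch of the pattern in question (the `Page` type here is a hypothetical stand-in, not redb's actual code): with `Arc<[u8]>`, every mutable access goes through `Arc::get_mut`, which atomically checks the strong and weak counts before handing out a `&mut` — that check is the per-access overhead being discussed.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a writable page buffer.
struct Page {
    mem: Arc<[u8]>,
}

impl Page {
    fn write_byte(&mut self, idx: usize, value: u8) {
        // Arc::get_mut succeeds only while this Arc is the sole owner;
        // it inspects both reference counts on every call, which is the
        // overhead incurred per mutable page access.
        let mem = Arc::get_mut(&mut self.mem).expect("page must be uniquely owned");
        mem[idx] = value;
    }
}

fn main() {
    let mut page = Page { mem: Arc::from(vec![0u8; 4]) };
    page.write_byte(2, 0xff);
    assert_eq!(&page.mem[..], &[0, 0, 0xff, 0]);
    println!("ok");
}
```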
> but need to get file lock support into std to replace the libc usage.
Not sure if you want to take on another dependency, but rustix does have safe file locking: https://docs.rs/rustix/latest/rustix/fs/fn.flock.html (`rustix` is really widely used and similar to `libc` in scope. It is also already a transitive dev-dependency via `tempfile`. So while not obvious, I think it might make sense to replace it.)
Ok, I tried, but while it does work, there is still Windows, which is out of scope for `rustix`, and `F_BARRIERFSYNC` on macOS, which it does not yet support.
I dropped the last commit after running the `lmdb_benchmark` on my notebook. The results are somewhat noisy, so I think you'll want to run this again yourself to make sure. I do see small but consistent improvements for random reads; without the second commit I occasionally see small regressions for writes, but not consistently.
As I said, it is quite noisy and I am not sure what to make of it. (My hardware, especially the disk, is decidedly consumer-grade.)
Ah, I found a way to at least avoid the option dance without unsafe code at the cost of an otherwise unnecessary reference increment-decrement-pair. Pushed that as a separate commit.
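For illustration, here is a guess at the shape of that change (these types are hypothetical, not the PR's actual code): the "option dance" wraps a field in `Option` so it can be moved out from behind `&mut self` via `take()`, at the cost of `unwrap()`s everywhere else; the alternative keeps the plain field and clones the `Arc`, which costs exactly one reference-count increment/decrement pair.

```rust
use std::sync::Arc;

// Variant 1: the "option dance". The Option exists only so the Arc can be
// moved out from behind a mutable reference, and every other access pays
// an unwrap.
struct WithOption {
    mem: Option<Arc<[u8]>>,
}

impl WithOption {
    fn take_mem(&mut self) -> Arc<[u8]> {
        self.mem.take().expect("already taken")
    }
}

// Variant 2: keep the plain field and clone the Arc instead. The clone
// increments the reference count and the caller's eventual drop decrements
// it again -- the "otherwise unnecessary increment-decrement pair" -- but
// the field stays valid and neither Option nor unsafe is needed.
struct Plain {
    mem: Arc<[u8]>,
}

impl Plain {
    fn get_mem(&self) -> Arc<[u8]> {
        self.mem.clone()
    }
}

fn main() {
    let mut w = WithOption { mem: Some(Arc::from(vec![0u8; 1])) };
    let taken = w.take_mem();
    assert!(w.mem.is_none());
    assert_eq!(Arc::strong_count(&taken), 1);

    let p = Plain { mem: Arc::from(vec![1u8, 2, 3]) };
    let m = p.get_mem();
    // Two owners while both handles are alive.
    assert_eq!(Arc::strong_count(&m), 2);
    println!("ok");
}
```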
Ok, I also reran the benchmark after changing it to run from a ramdisk, to take my ~~shitty~~ reasonably-priced disk out of the picture and tease out the CPU overhead of managing the cache.
With that I am now reasonably confident that this does not introduce undue overhead and yields small but consistent improvements:
| benchmark | PR | master |
|---|---|---|
bulk load | 2673ms | 2791ms |
individual writes | 3ms | 3ms |
batch writes | 1291ms | 1299ms |
len() | 0ms | 0ms |
random reads | 1137ms | 1335ms |
random reads | 1026ms | 1203ms |
random range reads | 2864ms | 3080ms |
random range reads | 2861ms | 3069ms |
random reads (4 threads) | 275ms | 321ms |
random reads (8 threads) | 171ms | 193ms |
random reads (16 threads) | 148ms | 157ms |
random reads (32 threads) | 131ms | 137ms |
removals | 2000ms | 2069ms |
Merged. Thanks!
The second commit removes any per-access overhead to `WritablePage`, but at the cost of some unsafe code relying on the uniqueness that was asserted once by calling `Arc::try_unwrap` before this change.