mirage / index

A platform-agnostic multi-level index
MIT License
33 stars 20 forks source link

index: avoid unsafe buffer sharing in `Log_file` #358

Closed craigfe closed 3 years ago

craigfe commented 3 years ago

https://github.com/mirage/index/pull/355 introduced a small optimisation to re-use a "local" scratch buffer when decoding values from the log file.

This is actually unsafe: during the merge, the asynchronous merge thread and the main writer thread can attempt concurrent reads from the log file, causing contention over the scratch buffer. This can be observed by inserting Thread.yield calls inside the Value.decode implementation and then stress-testing the interface (e.g. by running the replay benchmarks).

This commit ensures that each call to a Log_file function gets its own scratch buffer, ensuring safe concurrent access without introducing potential contention issues.

_Note: I initially tried to just wrap t.scratch_buf in a mutex (https://github.com/mirage/index/pull/358/commits/0a7b05c85339943a02a9fc804976fdfa878999a5), but this was a ~5% performance regression in the replay benchmarks due to contention over the lock. The alternative of just allocating on entry to Log_file has a relatively small cost._

samoht commented 3 years ago

What's the mutex impact on the performance? Wouldn't it be better to have a scratch buffer by thread?

craigfe commented 3 years ago

@samoht: I took some measurements of that :-) Yes, that mutex has a ~5% performance impact on very write-intensive workloads like the replay benchmarks when the log size is large (but I couldn't see any impact for other workloads).

I've changed the implementation to allocate new buffers on entry to Log_file. It might be better to do things per-thread indeed, but we'd need an LRU to avoid leaking memory there. It's not sufficient to just keep separate threads for the writer + merge thread, because readers in different threads can contend for the buffer as well.