liftbridge-io / liftbridge

Lightweight, fault-tolerant message streams.
https://liftbridge.io
Apache License 2.0
2.57k stars 107 forks

Memory map for segment #178

Closed by tekjar 4 years ago

tekjar commented 4 years ago

Hi. I see that segments use normal buffered I/O, whereas the index uses mmap. Is there any disadvantage to using mmap for segments as well?

tylertreat commented 4 years ago

The reason we mmap indexes is that mmap is particularly good for randomly accessing small pieces of a large file, but it's less good for reading large amounts of data. That's because an mmap access page faults whenever the page being looked up is not already in memory.

In contrast, normal read() and write() syscalls are very efficient for linear I/O, and they typically don't have the same page fault issues as mmap. They also work better with multiple readers/writers, which is the case for log segments in Liftbridge. Here's a more in-depth explanation.

tekjar commented 4 years ago

Ahh thanks!! That's helpful