Closed: vmx closed this pull request 10 years ago
I think it's still a fair comparison if RocksDB uses Snappy compression. The benchmarks for every storage engine should be tuned to whatever is fastest without being unfair. By "unfair" I mean things like having one storage engine do fsyncs while the other does not.
Hi Volker,
For the comparison to be fair, not only RocksDB but all the other DB modules should also be modified to enable Snappy compression by default (all the other modules, including ForestDB, also support compression).
By the way, compression is disabled by default so that the effect of document size can be assessed precisely. Compression changes the actual on-disk document size, and that size depends on the contents of the documents. The current benchmark produces each document body as a simple repetition of the same character, so the compressed document will be very small whatever the original size is. By contrast, if a document body consists of uniform random bytes, the compressed size can hardly be smaller than the original size.
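To illustrate the point about document contents, here is a minimal sketch comparing the two kinds of body. It assumes Python's standard `zlib` as a stand-in for Snappy (Snappy is not in the standard library and its exact ratios differ), and the helper name `compressed_ratio` is made up for this example.

```python
import os
import zlib


def compressed_ratio(body: bytes) -> float:
    """Return compressed size divided by original size (lower means more compressible)."""
    return len(zlib.compress(body)) / len(body)


for size in (512, 4096, 65536):
    repeated = b"x" * size          # one character repeated, as described for the benchmark
    random_body = os.urandom(size)  # uniform random bytes
    print(size,
          round(compressed_ratio(repeated), 4),
          round(compressed_ratio(random_body), 4))
```

With the repeated body the ratio shrinks as the document grows, while the random body stays at roughly 1.0, so the on-disk size no longer tracks the configured document size once compression is on.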
Producing variable-size document bodies while keeping the compression ratio constant would be tricky, which makes it very hard to see how performance varies with the document size (or the total working set size).
If you want to use compression, I recommend adding a new option to the benchmark config INI file rather than making it the default.
Thanks.
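As a rough sketch of what such an opt-in setting could look like, the snippet below reads a boolean from INI text and falls back to "off" when the option is absent. The section name `db_config` and the option name `compression` are hypothetical and not taken from the benchmark's actual config file, and Python's `configparser` is used only for illustration.

```python
import configparser

# Hypothetical INI fragment; the section and option names are assumptions.
SAMPLE_INI = """
[db_config]
compression = false
"""


def compression_enabled(ini_text: str) -> bool:
    """Return the compression setting, defaulting to False when it is not set."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.getboolean("db_config", "compression", fallback=False)


print(compression_enabled(SAMPLE_INI))
```

Defaulting to `False` keeps existing benchmark runs unchanged while letting a run opt in explicitly.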
Hi Jung-Sang,
Good points. I'll adapt the other wrappers as well and make it an option.
Though I think using compression is what you would do in real-world cases. So I would rather make compression the default and change how the benchmarking works, for example by using document bodies that are more random (harder to compress). Would that make sense?
Cheers, Volker
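One way to get bodies that are "more random" without being fully incompressible is to mix random bytes with a repeated character in a fixed proportion. The following is only a sketch under stated assumptions: `make_body` and `random_fraction` are hypothetical names, and `zlib` again stands in for Snappy, so the ratios are indicative rather than what the benchmark would see.

```python
import os
import zlib


def make_body(size: int, random_fraction: float) -> bytes:
    """Build a body of `size` bytes: a random prefix followed by a repeated character."""
    n_random = int(size * random_fraction)
    return os.urandom(n_random) + b"x" * (size - n_random)


for size in (1024, 16384, 262144):
    body = make_body(size, 0.5)
    # The compressed/original ratio should stay near the random fraction.
    print(size, round(len(zlib.compress(body)) / size, 3))
```

The ratio is only approximately constant across sizes, since fixed per-stream overhead weighs more on small documents.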
Volker,
I don't think compression should be enabled by default in the benchmark. It should be up to the application and its use case. For example, some of Couchbase's major customers asked us to disable Snappy compression because they saw a huge CPU overhead but not much compression gain.
However, we should definitely also run the benchmark with compression enabled to provide more performance metrics.
Chiyoung
Agreed. I will close this pull request and create a new one that makes it a setting (when I find the time).
Using Snappy improves the read and write performance on my machine. Reads are about 15% faster, writes about 50% faster.