BBVA / qed

The scalable, auditable and high-performance tamper-evident log project
https://qed.readthedocs.io/
Apache License 2.0

Remove index table #96

Closed aalda closed 5 years ago

aalda commented 5 years ago

We are using the index table to map event hashes to history tree versions, but that responsibility should belong exclusively to the hyper tree, given that it now stores the raw version in the shortcut leaves.

This way, we could eliminate the need for a separate table to support fast mappings. With this change, every membership operation must first query the hyper tree before generating the audit path from the history tree, and thus incurs a latency penalty. However, given that the hyper tree is the only one that holds a lock for queries, in theory it shouldn't reduce balloon's throughput.

This change helps to reduce space and write amplification in storage.

aalda commented 5 years ago

These are the results obtained in the microbenchmarks before and after removing the index table. The benchmarks execute 1,000,000 operations in every run. The results are shown in nanoseconds per operation.

| | Add (ns/op) | Sequential Query (ns/op) | Parallel Query - 10 (ns/op) | Parallel Query - 100 (ns/op) |
|---|---|---|---|---|
| Without index | 97773 | 213293 | 139587 | 96872 |
| With index | 101912 | 110294 | 92270 | 106911 |

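We can observe that sequential query latency almost doubles when removing the index table (213293 vs. 110294 ns/op). This result was expected, given that we are now querying both trees (hyper and history) sequentially. However, as we increase parallelism, the gap closes: with 10 parallel queries the version without the index is still slower (139587 vs. 92270 ns/op), but with 100 parallel queries it is faster (96872 vs. 106911 ns/op). Removing the index also slightly improves Add latency (97773 vs. 101912 ns/op).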
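To make the comparison easier to read, the ratios between the two rows can be computed with a short program. This is only a convenience sketch over the numbers reported in the table; the `result` type and `ratio` helper are illustrative names, not part of QED.

```go
package main

import "fmt"

// result holds one row of the benchmark table, in ns/op.
type result struct {
	add, sequential, parallel10, parallel100 float64
}

// ratio returns how many times slower (values above 1) or faster
// (values below 1) the run without the index is, relative to the run with it.
func ratio(without, with float64) float64 {
	return without / with
}

func main() {
	without := result{add: 97773, sequential: 213293, parallel10: 139587, parallel100: 96872}
	with := result{add: 101912, sequential: 110294, parallel10: 92270, parallel100: 106911}

	fmt.Printf("Add:                  %.2fx\n", ratio(without.add, with.add))
	fmt.Printf("Sequential Query:     %.2fx\n", ratio(without.sequential, with.sequential))
	fmt.Printf("Parallel Query - 10:  %.2fx\n", ratio(without.parallel10, with.parallel10))
	fmt.Printf("Parallel Query - 100: %.2fx\n", ratio(without.parallel100, with.parallel100))
}
```

A ratio above 1 means the run without the index table is slower; a ratio below 1 means it is faster.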