Closed ZhangJiaQiao closed 1 year ago
We have had a few people in the past ask for documentation as well. We are definitely aware that it needs work. An ARCHITECTURE.md/whitepaper is definitely on the list of things to eventually get to. Unfortunately it can be hard to prioritize documentation over feature development and bug fixing. Do you have specific questions that maybe I can get answers to more quickly?
Thanks for your reply. Here are some of my questions:
@nabeelmmd might be a good person to provide an answer about number 3.
RE 2: The docs (https://hse-project.github.io) contain detailed instructions
RE 3: Having a configurable sync interval both increases performance and aligns with a common config option on many DBs. You can use the sync() API to ensure any particular update is durable. All updates are atomic regardless of the sync interval; atomicity is a separate guarantee from durability.
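To illustrate the distinction in RE 3, here is a minimal sketch of a store where an update is visible as soon as it is applied but only becomes durable at the next sync. This is not the HSE API: `ToyStore`, `sync_interval_s`, `visible`, and `durable` are hypothetical names, and the `durable` dict merely stands in for stable storage.

```python
import time


class ToyStore:
    """Hypothetical in-memory store for illustration; not the HSE API."""

    def __init__(self, sync_interval_s):
        self.sync_interval_s = sync_interval_s
        self.visible = {}   # updates are applied here and readable at once
        self.durable = {}   # stand-in for data that has reached stable storage
        self._pending = []  # applied updates that are not yet durable
        self._last_sync = time.monotonic()

    def put(self, key, value):
        self.visible[key] = value
        self._pending.append((key, value))
        # Background-style sync: only triggered once the interval has elapsed.
        if time.monotonic() - self._last_sync >= self.sync_interval_s:
            self.sync()

    def sync(self):
        """Explicitly make every pending update durable."""
        for key, value in self._pending:
            self.durable[key] = value
        self._pending.clear()
        self._last_sync = time.monotonic()


store = ToyStore(sync_interval_s=3600)  # long interval: no automatic sync here
store.put("a", 1)
visible_before = dict(store.visible)
durable_before = dict(store.durable)
store.sync()
durable_after = dict(store.durable)
```

A longer interval batches more updates per sync, which is where the performance gain would come from in this toy model; a caller who needs a particular update to survive a crash calls `sync()` right after writing it.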
Thanks for your answers. Another question: HSE shows a large improvement in operation latency and throughput compared to WT and RocksDB. What techniques does it use to achieve this? Are there any special optimizations for NVMe/SATA SSD storage in HSE?
Many factors contribute to the performance gains. However the primary factors are 1) reduced write/read amplification resulting from our unique variant on LSM trees and associated compaction algorithms, and 2) a focus on highly concurrent data structures, including those based on RCU where applicable.
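For readers unfamiliar with the term, write amplification is the ratio of bytes written to storage to bytes written by the application. The sketch below measures it for a generic tiered-merge LSM model, assuming no key overlap between runs. It is not HSE's compaction algorithm; `simulate_tiered`, `flush_bytes`, and `fanout` are hypothetical names chosen for this illustration.

```python
def simulate_tiered(num_flushes, flush_bytes, fanout):
    """Return write amplification for a toy tiered LSM.

    Each memtable flush writes one run to level 0. Whenever a level
    holds `fanout` runs, they are merged into one run on the next
    level, which rewrites all of their bytes once.
    """
    levels = []   # levels[i] is the list of run sizes on level i
    written = 0   # total bytes written to storage

    for _ in range(num_flushes):
        written += flush_bytes  # the flush itself
        carry, level = flush_bytes, 0
        while True:
            if level == len(levels):
                levels.append([])
            levels[level].append(carry)
            if len(levels[level]) < fanout:
                break
            # Level is full: merge its runs and push the result down.
            carry = sum(levels[level])
            levels[level] = []
            written += carry
            level += 1

    user_bytes = num_flushes * flush_bytes
    return written / user_bytes


wa_fanout_2 = simulate_tiered(num_flushes=4, flush_bytes=1, fanout=2)
wa_fanout_4 = simulate_tiered(num_flushes=4, flush_bytes=1, fanout=4)
```

In this model, the same four flushes cost fewer rewrites with a larger fanout, which is the kind of trade-off a compaction design tunes.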
Got it. Thanks for your reply. I want to learn more about HSE's implementation from its code. I have other questions about the implementation and have opened another issue in the Q&A Discussions.
Please provide some docs explaining the internal architecture of the HSE engine. I cannot find any relevant resources on the official site or in the GitHub repo. I want to learn about the index architecture and key-value data management of HSE.