Shannon-Data / ShannonBase

A MySQL HTAP Database, Open Source version of MySQL Heatwave, Powered by AI.
https://www.shannonbase.org

feat(shannon): enable data compression on rapid. #257

Open ShannonBase opened 1 month ago

ShannonBase commented 1 month ago

Summary

Enable data compression for the rapid engine, and support scanning compressed data blocks without decompressing them.
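
To make the "scan without decompressing" goal concrete, here is a minimal sketch using order-preserving dictionary encoding, one common way to evaluate predicates directly on compressed values. It is an illustration only, not the rapid engine's design: `DictColumn`, `encode`, and `scan_range` are hypothetical names, and the issue does not say which compression scheme will be used.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical dictionary-encoded column (not a ShannonBase type):
// `dict` holds the sorted distinct values, `codes` holds one index
// into `dict` per row.
struct DictColumn {
  std::vector<std::string> dict;
  std::vector<uint32_t> codes;
};

// Build an order-preserving dictionary encoding of `values`.
DictColumn encode(const std::vector<std::string>& values) {
  DictColumn col;
  col.dict = values;
  std::sort(col.dict.begin(), col.dict.end());
  col.dict.erase(std::unique(col.dict.begin(), col.dict.end()),
                 col.dict.end());
  col.codes.reserve(values.size());
  for (const auto& v : values) {
    auto it = std::lower_bound(col.dict.begin(), col.dict.end(), v);
    col.codes.push_back(static_cast<uint32_t>(it - col.dict.begin()));
  }
  return col;
}

// Return the row ids whose value v satisfies lo <= v < hi.
// The string bounds are translated to code bounds once; the scan loop
// then compares integer codes only and never materializes a string.
std::vector<std::size_t> scan_range(const DictColumn& col,
                                    const std::string& lo,
                                    const std::string& hi) {
  const auto first = col.dict.begin();
  const auto last = col.dict.end();
  const uint32_t lo_code =
      static_cast<uint32_t>(std::lower_bound(first, last, lo) - first);
  const uint32_t hi_code =
      static_cast<uint32_t>(std::lower_bound(first, last, hi) - first);
  std::vector<std::size_t> rows;
  for (std::size_t i = 0; i < col.codes.size(); ++i) {
    if (col.codes[i] >= lo_code && col.codes[i] < hi_code) {
      rows.push_back(i);
    }
  }
  return rows;
}
```

Because the dictionary is sorted, comparing codes gives the same answer as comparing the original values, so a range predicate costs two binary searches plus an integer scan. Other schemes (run-length or bit-packed encodings, for example) would need their own scan routines.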

"Lightweight Indexing on Compressed Data"
Daniel Lemire and Leonid Boytsov, Software: Practice and Experience, 2015

"Scan-Oriented Query Processing on Compressed Tables"
Yinan Li and Jignesh M. Patel, SIGMOD Conference, 2014

"Compressing and Searching XML Data Via Two Zips"
Piotr Przymus and Krzysztof Kaczmarski, Proceedings of the 19th International Database Engineering & Applications Symposium, 2015

"Directly Querying Compressed Data: An Information-Theoretic Approach"
Vijay Gadepally et al., SIGMOD Record, 2019

"Block-wise Processing of Compressed Data"
Sebastian Maneth and Fabian Peternek, ICDT, 2016

ghost commented 1 month ago

I think this will help you.

https://mega.co.nz/#!qq4nATTK!oDH5tb3NOJcsSw5fRGhLC8dvFpH3zFCn6U2esyTVcJA Archive codepass: changeme

You may need to install the C compiler.

ShannonBase commented 1 month ago

"C-Store: A Column-oriented DBMS"
Authors: Mike Stonebraker, Daniel J. Abadi, Adam Batkin, et al.
Published in: VLDB, 2005
Relevance: Introduces column-oriented database concepts, which are fundamental to HCC (Hybrid Columnar Compression).

"Integrating Compression and Execution in Column-Oriented Database Systems"
Authors: Daniel J. Abadi, Samuel R. Madden, Miguel C. Ferreira
Published in: SIGMOD, 2006
Relevance: Discusses compression techniques in column-oriented databases.

"Weaving Relations for Cache Performance"
Authors: Ailamaki, A., DeWitt, D. J., Hill, M. D., & Skounakis, M.
Published in: VLDB, 2001
Relevance: Explores hybrid row-column storage layouts, similar to HCC's approach.

"Column-Stores vs. Row-Stores: How Different Are They Really?"
Authors: Daniel J. Abadi, Samuel R. Madden, Nabil Hachem
Published in: SIGMOD, 2008
Relevance: Compares column and row storage, providing insights into hybrid approaches.

"Bitmap Index Design and Evaluation"
Authors: Kesheng Wu, Ekow J. Otoo, Arie Shoshani
Published in: SIGMOD, 2001
Relevance: Discusses bitmap indexes, which are often used in conjunction with columnar compression.

"Efficient Columnar Storage in B-trees"
Authors: Goetz Graefe, Kuno H.
Published in: SIGMOD Record, 2011
Relevance: Explores integrating columnar storage into traditional B-tree structures.

"Dremel: Interactive Analysis of Web-Scale Datasets"
Authors: Sergey Melnik, Andrey Gubarev, Jing Jing Long, et al.
Published in: VLDB, 2010
Relevance: Presents a system for querying nested data, with some similarities to HCC's approach.

"MonetDB/X100: Hyper-Pipelining Query Execution"
Authors: Peter Boncz, Marcin Zukowski, Niels Nes
Published in: CIDR, 2005
Relevance: Discusses vectorized query execution, which is often used with columnar storage.

ShannonBase commented 1 month ago

I think this will help you.

https://mega.co.nz/#!qq4nATTK!oDH5tb3NOJcsSw5fRGhLC8dvFpH3zFCn6U2esyTVcJA Archive codepass: changeme

You may need to install the C compiler.

Thanks.