mutable-org / mutable

A Database System for Research and Fast Prototyping
https://mutable.uni-saarland.de
GNU Affero General Public License v3.0
96 stars 11 forks source link
compiler database optimization query sql webassembly

pipeline status Release License Documentation coverage report Benchmarks Discord

mutable

A Database System for Research

© Saarland University, 2023

What is mutable?

mutable is a relational main-memory database system. With mutable, our goal is to develop a database system that allows for efficient prototyping of research ideas while striving for performance that is competitive to state-of-the-art systems.

Defining Papers

Podcast Episode about mutable

Immanuel Haffner was guest on Disseminate: The Computer Science Research Podcast. Together with our host Jack Waudby, we discuss our CIDR 2023 paper and introduce the listener to mutable. [LINK TO EPISODE]

Features

Arbitrary Data Layouts A *domain-specific language* (DSL) to define *arbitrary* data layouts. The DSL constructs an internal representation, that enables mu*t*able to understand the data layout and to generate optimized code to access the data. #### Example Assume you want to create a table with the following schema: | id : INT(4) | name : CHAR(80) | salary : DOUBLE | |-|-|-| To create a **row-major** data layout for the table, in mu*t*able you can create a `DataLayout` object as shown below: ```cpp DataLayout layout; // fresh, empty layout auto &row = layout.add_inode( // create a row representation /* num_tuples= */ 1, // a row contains one tuple… /* stride_in_bits= */ 832 // see ① ); row.add_leaf( /* type= */ Type::Get_Integer(Type::TY_Vector, /* bytes= */ 4), /* idx= */ 0, // 0 ↔ `id` /* offset= */ 0, // offset of `id` within a row /* stride= */ 0 // not applicate, element not repeated within row ); row.add_leaf( /* type= */ Type::Get_Char(Type::TY_Vector, /* characters= */ 80), /* idx= */ 1, // 1 ↔ `name` /* offset= */ 32, /* stride= */ 0 ); row.add_leaf( /* type= */ Type::Get_Double(Type::TY_Vector), /* idx= */ 2, // 2 ↔ `salary` /* offset= */ 704, // 32 + 8 * 80 + 32 (pad to multiple of 64) /* stride= */ 0 ); row.add_leaf( /* type= */ Type::Get_Bitmap(Type::TY_Vector, /* bits= */ 3), // one bit per attribute /* idx= */ 3, // 3 ↔ "NULL bitmap" /* offset= */ 768, // 32 + 8 * 80 + 32 + 64 /* stride= */ 0 ); ``` ① The stride in bits for any `INode` must be specified at construction. The stride is chosen, such that every row is suitably aligned to fulfill *self-alignment* of all its descendants. Looking at the last leaf -- that for the NULL bitmap -- we see that the row must be at least 771 bits in size. The leaf with the largest *alignment requirement* inside a row is `salary` of type `DOUBLE` with an alignment requirement of 64 bits. Hence, we must ceil 771 to a whole multiple of 64 to accommodate sufficient space in a row *while* guaranteeing self-alignment of all leaves.

Alternatively, we can create a *partition attributes accross* (**PAX**) layout for the table. This is done in mu*t*able as shown below: ```cpp DataLayout layout; // fresh, empty layout auto &block = layout.add_inode( /* num_tuples= */ 704, // a PAX block contains 704 tuples… /* stride_in_bits= */ 512 * 1024 // and contains 512 KiB ); block.add_leaf( /* type= */ Type::Get_Integer(Type::TY_Vector, /* bytes= */ 4), /* idx= */ 0, // 0 ↔ `id` /* offset= */ 0, // offset of `id` column within a PAX block /* stride= */ 32 // `id`s repreated with 32 bits stride ); row.add_leaf( /* type= */ Type::Get_Char(Type::TY_Vector, /* characters= */ 80), /* idx= */ 1, // 1 ↔ `name` /* offset= */ 22'528, // 704 * 32 /* stride= */ 640 // `name`s repeated with 640 bits stride ); row.add_leaf( /* type= */ Type::Get_Double(Type::TY_Vector), /* idx= */ 2, // 2 ↔ `salary` /* offset= */ 473'088, // 704 * 32 + 704 * (8 * 80), already self-aligned /* stride= */ 64 // `salary`s repeated with 64 bits stride ); row.add_leaf( /* type= */ Type::Get_Bitmap(Type::TY_Vector, /* bits= */ 3), // one bit per attribute /* idx= */ 3, // 3 ↔ "NULL bitmap" /* offset= */ 518'144, // 704 * 32 + 704 * (8 * 80) + 704 * 64 /* stride= */ 8 // NULL bitmap repeated with stride of 8 bits ② ); ``` ② mu*t*able actually supports sub-byte strides, and we could in fact use 3 bits of stride for the NULL bitmap. However, while sub-byte strides save memory by avoiding padding, they complicate the data acces logic. In our example, we therefore opt for 8 bit stride to trade some unused bits for more efficient data accesses.

It is also possible to nest `INode`s of the `DataLayout` to arbitrary depths. This allows the creation of layouts such as *PAX-in-PAX* or *vertical partitioning*.

Cardinality Estimation with Sum-Product Networks We implemented *relational sum-product networks* (RSPNs), as proposed in [DeepDB](http://www.vldb.org/pvldb/vol13/p992-hilprecht.pdf), in mu*t*able. We achieve a relatively efficient implementation by implementing RSPN logic with [Eigen, a "C++ template library for linear algebra"](https://eigen.tuxfamily.org/index.php?title=Main_Page). After loading data into a database, you can manually trigger training of RSPNs with our built-in command `\learn_spns`. Documentation on the implementation of vanilla *sum-product networks* (SPNs) can be found [here](src/util/README.md) and the database specific extension making them RSPNs can be found [here](src/catalog/README.md). Disclaimer: Currently we have not yet implemented automatic updates of SPNs and string support. These are future tasks.

Injecting Cardinality Estimates mu*t*able provides a method to inject cardinality estimates into the system, that will then be used for query optimization. The cardinality estimates are specified in a JSON format, that is best described by an example. Consider the following example query: ```sql -- in database `demo` SELECT COUNT(*) FROM R, S, T WHERE R.sid = S.id AND S.tid = T.id AND R.x > 42 AND S.y < 13; ``` There are up to six intermediate results that can occur during join ordering, namely {$R$}, {$S$}, {$T$}, {$R,S$}, {$S,T$}, and {$R,S,T$}, omitting Cartesian products. We can specify the cardinality estimates that should be provided to the cardinality estimation component by providing the following JSON file to mu*t*able. ```json { "demo": [ { "relations": ["R"], "size": 9081}, { "relations": ["S"], "size": 8108}, { "relations": ["T"], "size": 361}, { "relations": ["R", "S"], "size": 14447050}, { "relations": ["S", "T"], "size": 7478}, { "relations": ["R", "S", "T"], "size": 3374320} ] } ``` When executing the above query in mu*t*able and providing the above cardinality JSON file, mu*t*able's query optimization will use these cardinalities to compute a plan. For example, the output for the above query could be ``` ProjectionOperator {[ COUNT() :INT(8) ]} ` AggregationOperator [COUNT()] {[ COUNT() :INT(8) ]} <1> ` JoinOperator (R.fid_R1 = S.id) {[ R.id :INT(4), R.fid_R1 :INT(4), S.id :INT(4), S.fid_R2 :INT(4), T.id :INT(4) ]} <3.37432e+06> ` ScanOperator (R AS R) {[ R.id :INT(4), R.fid_R1 :INT(4) ]} <9081> ` JoinOperator (S.fid_R2 = T.id) {[ S.id :INT(4), S.fid_R2 :INT(4), T.id :INT(4) ]} <7478> ` ScanOperator (S AS S) {[ S.id :INT(4), S.fid_R2 :INT(4) ]} <8108> ` ScanOperator (T AS T) {[ T.id :INT(4) ]} <361> ``` At the very end of each line, you find the cardinality estimates for the intermediate result produced by each operator, located in angular brackets `<,>`. To provide such a cardinality file to mu*t*able, use the following command line arguments: ```sh --cardinality-estimator Injected --use-cardinality-file "/path/to/file.json" ``` Intermediate results for which *no cardinality estimate* is specified will fall back to Cartesian product and print a warning to `stderr`.

Generating Fake Cardinalities Generating fake cardinality estimates (sometimes called fake statistics) is useful to steer, test, or evaluate the query optimization process. Therefore, we have built a tool `cardianlity_gen` that randomly generates cardinalities for all subsets of relations. These cardinalities are output in JSON, in the format that is suitable for cardinality injection (see our **Injecting Cardinality Estimates** feature). Our `cardinality_gen` takes as input a SQL file defining the database schema and a SQL query. First, it constructs the query's query graph. Along the graph, it then enumerates all subsets of joined relations using an efficient graph algorithm to enumerate all *connected subgraphs* (CSGs). This is done in a bottom-up fashion, e.g. the cardinality of {$A$,$B$} is generated *after* the cardinalities for $A$ and $B$ have been generated. Further, all cardinalities are generated to be *sane*, e.g. |$A\Join B$| must never be larger than $|A| * |B|$. The randomized generation of cardinalities with `cardinality_gen` can be steered through command line arguments. See `--help` for a complete list of parameters. Important mentions are `--alpha` to steer the skew of join selectivities and `--uncorrelated` to decide whether to generate pairwise independent (uncorrelated) join selectivities. Note, that pairwise independent join selectivities are practically impossible and with *too small* cardinalities, slight deviations from pairwise independent may manifest.

Just-in-Time Query Compilation We initially created mu*t*able to enable our research on *just-in-time* (JIT) compilation of SQL queries to WebAssembly (Wasm). By now, mu*t*able has a dedicated Wasm backend that performs very fast JIT compilation to Wasm and that delegates the generated Wasm code on to the embedded V8 engine for execution. V8 takes care of JIT compiling Wasm to machine code, of applying compiler optimizations, and of adaptively switching from unoptimized to optimized code while the query is running. See our EDBT 2023 paper for more information.

Code Generation DSL To relief the programmer from tediously writing query compilation steps or Wasm code generation directly, we have built a *deeply-embedded domain-specific language* (deep DSL). Our deep DSL is written in C++ and mimics C in syntax and semantics. Our deep DSL can be mixed with regular C++ code to allow for understandable and maintainable code generation through meta programming. This is best demonstrated by an example: ```cpp auto gen_power(int exp) { FUNCTION(power, int(int)) { Var res = 1; auto base = PARAMETER(0); // or `ThisFunction.parameter<0>()` while (exp > 0) { if (exp % 2 == 0) { // even exponent res = res * res; // b^(2n) = (b^n)² exp /= 2; } else { // odd exponent res *= base; exp -= 1; } } RETURN(res); } return power; // returns a handle to the generated `power` function } void demo() { auto pow5 = gen_power(5); Var a = pow5(2); // 2⁵ Var b = pow5(3); // 3⁵ auto pow8 = gen_power(8); Var c = pow8(2); // 2⁸ Var d = pow8(3); // 3⁸ } ``` Our DSL implements a type system that is less permissive than C/C++. It prevents mixing signed and unsigned types, implicit casts that reduce precision, and implicit casts to/from floating-point representation. Furthermore, our DSL supports *three-valued logic* (3VL): elements can be declared with template parameter `CanBeNull=true` to have ops perform 3VL. This is a big relief when implementing database operator logic. When attributes in a table are declared `NOT NULL`, then the code that would implement 3VL is actually never even generated.

Many Algorithms for Plan Enumeration We have implemented a broad spectrum of algorithms for join ordering / plan enumeration. Algorithms computing an optimal plan: - $\textit{DP}_\textit{size}$ - $\textit{DP}_\textit{sub}$ - $\textit{DP}_\textit{ccp}$ - $\textit{TD}_\textit{basic}$ - $\textit{TD}_\textit{MinCutAGaT}$ Algorithms computing a potentially suboptimal plan: - IK/KBZ (optimal on acyclic queries with pairwise independent join selectivities) - *greedy operator ordering* (GOO) - LinearizedDP: DP with search space linearization based on IK/KBZ In addition, we have developed a novel algorithm that is based on a reduction of the join order optimization problem to heuristic search. The plan enumeration algorithm is named `HeuristicSearch`. See our SIGMOD 2023 paper for more information.

Visual Representation mu*t*able can render some intermediate results during query processing as graphs in Graphviz, and also directly render them to PDF in case the necessary libraries are available on the system. Flags `--astdot`, `--graphdot`, and `--plandot` render the respective query representation in Graphviz. Then, if the Graphviz library is installed, the graph is directly rendered to PDF. Otherwise, the *dot* representation of the graph is output.

Contributors

See CONTRIBUTORS.md.

License

Our software "mutable" is licensed under AGPL v3. A copy of the license file can be found within this project. If not explicitly stated otherwise, this license applies to all files in this project.

Getting started

Important Links