UWHustle / hustle

In-memory, columnar, arrow-based database.
Apache License 2.0

Hustle

Hustle is an efficient data platform organized as a collection of data processing kernels. It is based on a microservices architecture and serves heterogeneous application query languages that map to a core relational-algebra DSL.

Building blocks

Hustle is built on Apache Arrow.

Coding Guidelines

We follow these guidelines for development.

Build Hustle

macOS users must install Homebrew before running the following scripts.

To install the packages Hustle requires, run the following scripts:

./install_requirements.sh
./install_arrow.sh

The scripts install g++-10, CMake 3.15, and Apache Arrow.

Note (macOS): the default g++ still maps to clang after installation. To avoid this, we recommend adding the following exports to your shell startup script:

export CC=$(which gcc-10)
export CXX=$(which g++-10)
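Because the default g++ on macOS can silently resolve to clang, it may help to confirm what `CXX` actually points at before configuring the build. The snippet below is a minimal sketch, not part of Hustle: the helper names `classify_banner` and `compiler_family` are hypothetical, and the check assumes that a clang-based compiler mentions "clang" somewhere in its `--version` output.

```shell
# Hypothetical helpers (not part of Hustle).

# Classify a compiler from the text of its --version banner.
# Assumption: clang-based compilers mention "clang" in that banner.
classify_banner() {
  case "$1" in
    *clang*) echo "clang" ;;
    "")      echo "unknown" ;;
    *)       echo "not-clang" ;;
  esac
}

# Report "missing" if the command is not found, otherwise classify it.
compiler_family() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "missing"
    return 0
  fi
  # Capture stdout and stderr; "|| true" keeps a failing command non-fatal.
  classify_banner "$("$1" --version 2>&1 || true)"
}

# Check whatever CXX currently points at (falls back to plain g++).
echo "CXX resolves to: $(compiler_family "${CXX:-g++}")"
```

If this prints `clang` after the exports above are in place, the startup script has probably not been re-read by the current shell.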

Then use cmake to build Hustle:

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=RELEASE -DCMAKE_C_COMPILER=gcc-10 -DCMAKE_CXX_COMPILER=g++-10 .. 
make -j all  
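With GNU make, `-j` without a number places no limit on the number of parallel jobs, which can exhaust memory on smaller machines. As a hedged alternative, the sketch below bounds the job count to the number of online CPUs. The helper name `job_count` is hypothetical, and the sketch assumes `getconf _NPROCESSORS_ONLN` is available, as it is on typical Linux and macOS systems.

```shell
# Hypothetical helper (not part of Hustle): number of online CPUs,
# falling back to 1 if getconf cannot report it.
job_count() {
  n=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
  echo "${n:-1}"
}

# Print the bounded build command; run it from the build directory.
echo "make -j$(job_count) all"
```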

To run the tests, go into the build directory and run:

./run_test.sh 

Run Benchmark

The following commands run the Star Schema Benchmark (SSB). Before running them, make sure you have built the executable from source using the steps in the previous section.

To generate the SSB benchmark data:

sh ./scripts/ssb/gen_benchmark_data.sh ${SCALE_FACTOR}

The scale factor is usually 1 or 10.

To run the SSB benchmark:
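An unset or mistyped `SCALE_FACTOR` is easy to miss, so a small guard in front of the data generation step can be useful. The following is a sketch under stated assumptions: the helper `is_valid_scale_factor` is hypothetical, and it assumes only positive whole-number scale factors are wanted (whether the generator script accepts other values is not stated here). It prints the command rather than running it.

```shell
# Hypothetical helper (not part of Hustle): accept only positive integers.
is_valid_scale_factor() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;  # empty, or contains a non-digit
    0*)          return 1 ;;  # zero, or a leading zero
    *)           return 0 ;;
  esac
}

# Default to 1 when SCALE_FACTOR is unset.
SCALE_FACTOR="${SCALE_FACTOR:-1}"

if is_valid_scale_factor "$SCALE_FACTOR"; then
  # Print the command to run from the repository root.
  echo "sh ./scripts/ssb/gen_benchmark_data.sh $SCALE_FACTOR"
else
  echo "invalid scale factor: $SCALE_FACTOR" >&2
fi
```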

sh ./scripts/ssb/run_benchmark.sh ssb_queries

To run the TATP benchmark:

sh ./scripts/tatp/run_benchmark.sh 

Build Hustle with C++20 [Pilot]

Tested systems:

macOS users must install Homebrew before running the following scripts.

To install the packages Hustle requires, run the following scripts:

./install_requirements_cpp20.sh
./install_arrow_cpp20.sh

The scripts install g++-10, CMake 3.15, and Apache Arrow 3.0.

To verify that the toolchain is accessible, check the version of g++:

g++-10 -v
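The `-v` output is verbose, so a scripted check can be easier to read in CI logs. The snippet below is a minimal sketch, not part of Hustle: the helper `major_version` is hypothetical, and the check relies on GCC's `-dumpversion` option, which prints the compiler version (either just the major number or a full dotted version, depending on how GCC was configured).

```shell
# Hypothetical helper (not part of Hustle): extract the major version
# from a string such as "10" or "10.2.0".
major_version() {
  echo "${1%%.*}"
}

if command -v g++-10 >/dev/null 2>&1; then
  major=$(major_version "$(g++-10 -dumpversion)")
  if [ "$major" -ge 10 ]; then
    echo "g++-10 found, major version $major"
  else
    echo "unexpected g++-10 major version: $major" >&2
  fi
else
  echo "g++-10 not found on PATH" >&2
fi
```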

Then use cmake to build Hustle:

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=RELEASE -DCMAKE_C_COMPILER=gcc-10 \
 -DCMAKE_CXX_COMPILER=g++-10 -DCMAKE_CXX_STANDARD=20 \
 -DCMAKE_CXX_STANDARD_REQUIRED=True .. 
make -j all  

To run the tests, go into the build directory and run:

./run_test.sh