Demo workflow for Triggers is now available. See DEMO.
AQuery++ Database is a cross-platform, In-Memory Column-Store Database that incorporates compiled query execution. (Note: If you encounter any problems, feel free to contact me via sunyinqi0508@gmail.com)
A recent version of Linux, Windows or macOS, with a recent C++ compiler that supports C++17 (1z). C++20 is recommended if available, for heterogeneous lookup on unordered containers.
MonetDB for the Hybrid Engine. With Homebrew, it can be installed by brew install monetdb
Python 3.6 or above, with the packages in requirements.txt installed by python3 -m pip install -r requirements.txt
There are multiple options to run AQuery on Windows. For better consistency, I recommend using a simulated Linux environment such as Windows Subsystem for Linux (1 or 2), Docker or a Linux virtual machine. You can also use the native toolchain from Microsoft Visual Studio or gcc from Winlibs/Cygwin/MinGW.
Windows Subsystem for Linux (WSL, Recommended)
python3 -m pip install -r requirements.txt
python3 ./prompt.py
to start AQuery.
For Winlibs (Recommended):
Copy or link mingw64/libexec/gcc/<arch>/<version>/liblto-plugin.dll
to mingw64/lib/bfd-plugins/
for link-time optimization support in gcc-ar and gcc-ranlib.
For Visual Studio:
python3 -m pip install -r requirements.txt
For CygWin/MinGW:
Install gcc and python3 with the environment's own package manager (e.g. pacman -S gcc python3). Otherwise, ABI breakage may happen.
For macOS:
brew install python3 monetdb
Install a C++ compiler with xcode-select --install or from homebrew.
To select a specific compiler, set the CXX environment variable, e.g. export CXX=clang
For arm64 macOS users:
python, the C++ compiler, the monetdb library and system commandline utilities such as uname should all have the same architecture. Run ./arch-check.sh to check if relevant binaries all have the same architecture.
For Linux distributions that use apt:
apt update && apt install -y python3 python3-pip clang-14 libmonetdbe-dev git
python3 -m pip install -r requirements.txt
export CXX=clang++-14
Note for anaconda users: the system libraries included in anaconda might differ from the ones your compiler is using. In this case, you might get errors similar to:
ImportError: libstdc++.so.6: version `GLIBCXX_3.4.26' not found
If so, upgrade anaconda or your compiler, or use the python from your OS or package manager instead. Alternatively (NOT recommended), copy or link the library from your system (e.g. /usr/lib/x86_64-linux-gnu/libstdc++.so.6) to anaconda's library directory (e.g. ~/Anaconda3/lib/).
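To see which GLIBCXX versions a particular libstdc++.so.6 actually provides, one option is to scan the library file for its version tags, much like running strings on it and filtering for GLIBCXX. The following is a minimal sketch of that idea; the function name glibcxx_versions and the script name in the usage comment are hypothetical and not part of AQuery++.

```python
import re
import sys
from typing import List, Tuple


def glibcxx_versions(blob: bytes) -> List[Tuple[int, ...]]:
    """Return the sorted, de-duplicated GLIBCXX version tags found in raw library bytes."""
    found = set()
    # Version tags look like GLIBCXX_3.4.26; names such as GLIBCXX_DEBUG_... are skipped.
    for match in re.finditer(rb"GLIBCXX_(\d+(?:\.\d+)*)", blob):
        found.add(tuple(int(part) for part in match.group(1).decode().split(".")))
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical usage: python3 check_glibcxx.py /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            versions = glibcxx_versions(f.read())
        newest = ".".join(map(str, versions[-1])) if versions else "none found"
        print(f"{path}: newest GLIBCXX = {newest}")
```

Running it on both the system library and the copy inside anaconda's library directory shows whether the version named in the ImportError is missing from one of them.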
Run make docker to build the docker image from scratch. (To build for linux/amd64, run docker buildx build --platform=linux/amd64 -t aquery . instead of make docker.)
Run the image with docker run --name aquery -it aquery
To access the same container again later, run docker start -ai aquery
To access the system shell from within AQuery, type dbg to activate the python interpreter and then type os.system('sh') to launch a shell.
Execute the script with source ./cims.sh. Please use the source command or . ./cims.sh (dot space) to execute the script because it contains configurations for environment variables. Also note that this script only works with bash and compatible shells (e.g. dash and zsh, but not csh).
python3 ./prompt.py
singularity build aquery.sif aquery.def
singularity exec aquery.sif sh
python3 ./prompt.py
python3 prompt.py will launch the interactive command prompt. The server binary will be automatically rebuilt and started.
<sql statement>: parse AQuery statement
f <filename>: parse all AQuery statements in file
exec: execute last parsed statement(s) with the Hybrid Execution Engine. The Hybrid Execution Engine decouples the query into two parts. The standard SQL (MonetDB dialect) part is executed by an embedded version of MonetDB, and everything else is executed by a post-process module which is generated by the AQuery++ compiler in C++ and then compiled and executed.
stats <OPTIONAL: options>: configure statistics.
  reset: resets statistics.
  on: statistics will be shown for every future query.
  off: statistics will not be shown for every future query.
script <filename>: use automated testing script; this will execute all commands in the script
sh <OPTIONAL: shell>: launch a shell. Shell name can be specified (e.g. sh fish).
dbg: start python interactive interpreter at the current context.
print: print parsed AQuery statements (AST in JSON form)
save <OPTIONAL: filename>: save current code snippet. Will use a random filename if not specified.
exit: quit the prompt
r: run the last generated code snippet
f moving_avg.a
xexec
See files in ./tests/ for more examples.
A series of commands can be put in a script file and run with the script command.
See test.aquery as an example.
AQuery++ has similar syntax to standard SQL with extensions for time-series analysis and user extensibility.
program : [query | create | insert | load | udf ]*
/********* Queries *********/
query : [WITH ID ['('columns')'] AS '(' single-query ')'] single-query
single-query : SELECT projections FROM datasource assumption where-clause groupby-clause
projections: [val as ID | val] (, [val as ID | val])*
datasource : ID [ID | AS ID] |
ID, datasource |
ID [INNER] JOIN datasource [USING columns | ON conditions] |
ID NATURAL JOIN datasource
assumption: ASSUMING ([ASC|DESC] ID)+
where-clause: WHERE conditions;
groupby-clause: GROUP BY expr (, expr )* [HAVING conditions]
conditions: <a boolean expression>
/********* Creating data *********/
create: CREATE TABLE ID [AS query | '(' schema ')']
schema: ID type (, ID type)*
insert: INSERT INTO ID [query | VALUES '(' literals ')']
literals: literal (, literal)*;
/********* Loading/Saving data *********/
load: LOAD DATA INFILE string INTO TABLE ID FIELDS TERMINATED BY string
save: query INTO OUTFILE string FIELDS TERMINATED BY string
/********* User defined functions *********/
udf: FUNCTION ID '(' arg_list ')' '{' fun_body '}'
arg_list: ID (, ID)*
fun_body: [stmts] expr
/********* Triggers **********/
create: CREATE TRIGGER ID [ ACTION ID INTERVAL num | ON ID ACTION ID WHEN ID ]
drop: DROP TRIGGER ID
stmts: stmt+
stmt: assignment; | if-stmt | for-stmt | ;
assignment: l_value := expr
l_value: ID | ID '[' ID ']'
if-stmt: if '(' expr ')' if-body [else (stmt|block) ]
if-body: stmt | block (elif '(' expr ')' if-body)*
for-stmt: for '(' assignment (, assignment)* ';' expr ';' assignment ')' for-body
for-body: stmt|block
block: '{' [stmts] '}'
/********* Expressions *********/
expr: expr binop expr | fun_call | unaryop expr | ID | literal
fun: ID | sqrt | avg[s] | count | deltas | distinct
| first | last | max[s] | min[s] | next
| prev | sum[s] | ratios | <... To be added>
fun_call: fun '(' expr (, expr)* ')'
binop: +|-|=|*|+=|-=|*=|/=|!=|<|>|>=|<=| and | or
unaryop: +|-| not
literal: numbers | strings | booleans
STRING and TEXT are variable-length strings with unlimited length. VARCHAR(n) is for strings with upper-bound limits.
INT and INTEGER are 32-bit integers, SMALLINT is for 16-bit integers, TINYINT is for 8-bit integers and BIGINT is for 64-bit integers. On Linux and macOS, HGEINT is for 128-bit integers.
REAL denotes 32-bit floating point numbers while DOUBLE denotes 64-bit floating point numbers.
DATE only supports the format yyyy-mm-dd. TIME uses 24-hour format and has the form hh:mm:ss:ms, where the milliseconds part can range from 0 to 999. TIMESTAMP has the format yyyy-mm-dd hh:mm:ss:ms. When importing data from CSV files, please make sure that spreadsheet software (if any was used) didn't change the format of the date and timestamp by double-checking the file with a plain-text editor.
BOOLEAN or BOOL is a boolean type with values TRUE and FALSE.
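Before importing a CSV file, the date and time columns can be checked against the formats described above. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and it assumes two-digit hour, minute and second fields and a milliseconds field of one to three digits, which the text above does not spell out.

```python
import re
from datetime import date

# Assumed field widths: yyyy-mm-dd and hh:mm:ss:ms with 1 to 3 digits of milliseconds.
DATE_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")
TIME_RE = re.compile(r"(\d{2}):(\d{2}):(\d{2}):(\d{1,3})")


def is_valid_date(text):
    """True if text is yyyy-mm-dd and names a real calendar day."""
    m = DATE_RE.fullmatch(text)
    if not m:
        return False
    try:
        date(int(m.group(1)), int(m.group(2)), int(m.group(3)))
    except ValueError:
        return False
    return True


def is_valid_time(text):
    """True if text is hh:mm:ss:ms in 24-hour format with ms in 0..999."""
    m = TIME_RE.fullmatch(text)
    if not m:
        return False
    hh, mm, ss, ms = (int(g) for g in m.groups())
    return hh < 24 and mm < 60 and ss < 60 and ms <= 999


def is_valid_timestamp(text):
    """True if text is a valid date and a valid time separated by one space."""
    parts = text.split(" ")
    return len(parts) == 2 and is_valid_date(parts[0]) and is_valid_time(parts[1])
```

A value such as 09/30/2022, which spreadsheet software commonly produces, fails is_valid_date and can be caught before loading.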
Tables can be created using the CREATE TABLE statement. For example:
CREATE TABLE my_table (c1 INT, c2 INT, c3 STRING)
INSERT INTO my_table VALUES(10, 20, "example")
INSERT INTO my_table SELECT * FROM my_table
You can also create tables using a query. For example:
CREATE TABLE my_table_derived
AS
SELECT c1, c2 * 2 as twice_c2 FROM my_table
Tables can be dropped using the DROP TABLE statement. For example:
DROP TABLE my_table IF EXISTS
LOAD DATA INFILE <filename> INTO <table_name> [OPTIONS <options>]
See data/q1.sql for more information.
UNION ALL is a bag union of two query results with the same schema, e.g.
SELECT * FROM table1 UNION ALL SELECT * FROM table2
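Bag union means that duplicate rows are kept rather than merged. The following Python sketch only illustrates that semantics on plain lists of tuples; the helper union_all and the sample rows are hypothetical, and the row order of an actual query result is not something this sketch claims to reproduce.

```python
from collections import Counter


def union_all(left, right):
    """Bag union: concatenate the rows and keep every duplicate."""
    return list(left) + list(right)


# Two inputs with the same schema (c1, c2, c3); one row appears in both.
table1 = [(10, 20, "example"), (10, 20, "example")]
table2 = [(10, 20, "example"), (20, 30, "example2")]

result = union_all(table1, table2)
multiplicity = Counter(result)

print(len(result))                        # 4
print(multiplicity[(10, 20, "example")])  # 3
```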
The EXCEPT clause returns the difference of two query results.
Use DELETE FROM <table_name> [WHERE <conditions>] to delete rows from a table that match the conditions.
Execution time can be measured with the stats command described above.
The stats command without any argument will show the execution time of all queries executed so far.
stats reset will reset the timer for total execution time printed by the stats command above.
stats on will show execution time for every following query until a stats off command is received.
AQuery++ supports MonetDB passthrough for the hybrid engine. Simply put standard SQL queries inside a <sql> </sql> block.
Each query inside an sql block must be separated by a semicolon. The queries are sent to MonetDB directly, which means they should be written in the MonetDB dialect instead of the AQuery dialect. Please refer to the MonetDB documentation for more information.
For example:
CREATE TABLE my_table (c1 INT, c2 INT, c3 STRING)
INSERT INTO my_table VALUES(10, 20, "example"), (20, 30, "example2")
<sql>
INSERT INTO my_table VALUES(10, 20, "example3");
CREATE INDEX idx1 ON my_table(c1);
</sql>
SELECT * FROM my_table WHERE c1 > 10
avg[s]: average of a column. avgs(col), avgs(w, col) are the rolling and moving average with window w of the column col.
var[s], stddev[s]: [moving/rolling] population variance, standard deviation.
sum[s], max[s], min[s]: similar to avg[s]
ratios(w = 1, col): moving ratio of a column, e.g. ratios(w, col)[i] = col[i-w]/col[i]. Window w has a default value of 1.
next(col), prev(col): moving column back and forth by 1, e.g. next(col)[i] = col[i+1].
first(col), last(col): first and last value of a column, i.e. first(col) = col[0], last(col) = col[n-1].
sqrt(x), trunc(x), and other builtin math functions: value-wise math operations, e.g. sqrt(x)[i] = sqrt(x[i])
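The windowed and shifting functions above can be illustrated on a plain Python list. This is only a sketch of one reading of the descriptions, not the AQuery++ implementation. The function names are hypothetical, and three points are assumptions: avgs(col) is read as the average of all values seen so far, windows are truncated at the start of the column, and positions with no neighbour are marked None.

```python
def moving_avg(w, col):
    """Average over the last w values ending at each position (window truncated at the start)."""
    out = []
    for i in range(len(col)):
        window = col[max(0, i - w + 1):i + 1]
        out.append(sum(window) / len(window))
    return out


def running_avg(col):
    """Average of all values seen so far at each position."""
    return moving_avg(len(col), col)


def shift_next(col):
    """next(col)[i] = col[i+1]; the last position has no successor and is None here."""
    return [col[i + 1] if i + 1 < len(col) else None for i in range(len(col))]


def shift_prev(col):
    """Mirror of shift_next: prev(col)[i] = col[i-1]; the first position is None here."""
    return [col[i - 1] if i > 0 else None for i in range(len(col))]


col = [1, 2, 3, 4]
print(running_avg(col))    # [1.0, 1.5, 2.0, 2.5]
print(moving_avg(2, col))  # [1.0, 1.5, 2.5, 3.5]
print(shift_next(col))     # [2, 3, 4, None]
print(shift_prev(col))     # [None, 1, 2, 3]
print(col[0], col[-1])     # first(col) and last(col): 1 4
```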
pack(cols, ...): pack multiple columns with exact same type into a single column.
mo-sql-parsing
Author: Kyle Lahnakoski
License (Mozilla Public License 2.0): https://github.com/klahnakoski/mo-sql-parsing/blob/dev/LICENSE
Fast C++ CSV Parser
Author: Ben Strasser
License (BSD 3-Clause License): https://github.com/ben-strasser/fast-cpp-csv-parser/blob/master/LICENSE
Dragonbox
Author: Junekey Jeon
License (Boost, Apache2-LLVM):
https://github.com/jk-jeon/dragonbox/blob/master/LICENSE-Boost
https://github.com/jk-jeon/dragonbox/blob/master/LICENSE-Apache2-LLVM
itoa
Author: James Edward Anhalt III
License (MIT): https://github.com/jeaiii/itoa/blob/main/LICENSE
MonetDB
License (Mozilla Public License): https://github.com/MonetDB/MonetDB/blob/master/license.txt
ankerl::unordered_dense
Author: Martin Ankerl
License (MIT): http://opensource.org/licenses/MIT