Massive graph-structured data collections are ubiquitous in contemporary data management scenarios such as social networks, linked open data, and chemical compound databases.
The selection and manipulation of paths forms the core of querying graph datasets. Path indexing techniques can speed up this core functionality of querying graph datasets.
We propose a path-index based graph database engine.
The documentation can be found here and a schematic overview of the architecture can be found here.
Giedo Mak. Telepath: A path-index based graph database engine. MSc Thesis. Department of Mathematics and Computer Science, Eindhoven University of Technology. 2017. PDF
This section describes the essence of the life of a query within Telepath. Each heading contains links to its docs, test and source. In most cases, the test will give a clear insight into what each specific module produces.
Query input
The user gives a regular path query as input. For example:
a/(b/c)
Where a
, b
and c
are edge labels, and /
is interpreted as the concatenation logical operator.
Parse the input (docs) (test) (source)
The query input is parsed into our internal representation of a logical plan. Our internal representation uses a tree datastructure:
CONCATENATION
/ \
a CONCATENATION
/ \
b c
Generate the cheapest physical plan (docs) (test) (source)
Our planner uses the DPsize
algorithm as inspiration, which calculates the cheapest physical plan in a bottom-up fashion.
Since this phase is one of the main contributions, an in-depth explanation can be found here.
INDEX_LOOKUP
/ | \
a b c
Evaluate the physical plan
The physical plan is evaluated in a bottom-up fashion. All intermediate results are materialized through our MemoryManager (docs) (test) (source).
Using PathDB to gather the paths satisfying our query:
kPathIndex.search(
PathPrefix(
physicalPlan.pathIdOfChildren()
)
)
Visualize results
At the time of writing, results will be shown to the user through a command-line interface.
Telepath: >>>>> Results:
Telepath: Path(pathId=9, nodes=[Node(id=10), Node(id=12), Node(id=14)])
Telepath: Path(pathId=9, nodes=[Node(id=10), Node(id=12), Node(id=8772)])
Telepath: Number of results: 2, after 5 ms
Telepath: ----------------------------
The contributing guide is a good place to start. If you have questions, feel free to ask.
Giedo Mak | Max Sumrall | Nikolay Yakovets | George Fletcher |