apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

Support for recursive queries (for graph / tree use-cases) #14568

Open hpvd opened 4 days ago

hpvd commented 4 days ago

Support for recursive queries to use Pinot for typical graph use cases

Many DBs have supported recursive queries for several years. This enables many graph / tree-like queries with good performance, without the need for a dedicated graph database.

The WITH RECURSIVE clause is defined in ISO/IEC 9075-2:2023 §7.17 as part of optional feature T131.
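To make the request concrete, here is a minimal sketch of what a WITH RECURSIVE query over a tree looks like. It runs against SQLite through Python's standard `sqlite3` module purely as a stand-in engine, since this is the feature being requested for Pinot. The `category` table, its columns, and the `subtree` helper are hypothetical names chosen for illustration.

```python
import sqlite3

# SQLite is only a stand-in engine that implements WITH RECURSIVE.
# The table, its data, and the helper below are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE category (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [
        (1, None, "root"),
        (2, 1, "electronics"),
        (3, 1, "books"),
        (4, 2, "phones"),
        (5, 4, "smartphones"),
    ],
)

SUBTREE_SQL = """
WITH RECURSIVE subtree(id, name, depth) AS (
    -- anchor member: the starting node
    SELECT id, name, 0 FROM category WHERE id = ?
    UNION ALL
    -- recursive member: children of the rows found so far
    SELECT c.id, c.name, s.depth + 1
    FROM category AS c
    JOIN subtree AS s ON c.parent_id = s.id
)
SELECT id, name, depth FROM subtree ORDER BY depth, id
"""


def subtree(start_id):
    """Return (id, name, depth) for start_id and all of its descendants."""
    return conn.execute(SUBTREE_SQL, (start_id,)).fetchall()


for row in subtree(2):
    print(row)
```

The anchor member selects the start node and the recursive member repeatedly joins children onto the rows found so far, so a single statement walks a hierarchy of unknown depth.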

-> A good intro can be found here: https://www.linkedin.com/pulse/you-dont-need-graph-database-modeling-graphs-trees-viktor-qvarfordt-efzof/

For databases that support recursive queries, see https://modern-sql.com/caniuse/with_recursive_(top_level)

and https://en.wikipedia.org/wiki/Hierarchical_and_recursive_queries_in_SQL

=> Is there a way to make this happen for Pinot too? Or is this hardly possible with our architecture?

hpvd commented 3 days ago

Motivation:

Using Pinot (or any other non-graph-native DB) for typical graph use cases makes it possible to build systems without an additional database type such as Neo4j, which saves a lot of effort during dev and maintenance, and even costs. In addition, it's much easier to build a team that is expert in one tech stack than in two...

hpvd commented 3 days ago

Is this realistic from a performance pov in production?

IMHO, for several use cases: YES!

In fair comparisons / benchmarks of relational and graph databases, where each one is optimized with indexes, you can even find performance benefits for relational databases in several graph-like use cases, because they take much more advantage of indexes than graph DBs do.

Example: Performance of Graph and Relational Databases in Complex Queries https://www.mdpi.com/2076-3417/12/13/6490
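As a rough illustration of the kind of graph-like query such comparisons are about, the sketch below runs a reachability query over an indexed edge table. SQLite (via Python's `sqlite3`) is again only a stand-in; the `edges` table, the `idx_edges_src` index, and the `reachable` helper are hypothetical, and whether a given engine actually uses the index is up to its query planner.

```python
import sqlite3

# Stand-in engine and hypothetical schema: a directed graph stored as an edge list.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL)")
# Index on the lookup column so each traversal step can be an index lookup
# instead of a full scan (the planner decides whether to use it).
conn.execute("CREATE INDEX idx_edges_src ON edges (src, dst)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("x", "y")],
)

REACHABLE_SQL = """
WITH RECURSIVE reachable(node) AS (
    SELECT ?
    -- UNION (not UNION ALL) drops nodes already seen, so the
    -- a -> b -> c -> a cycle does not recurse forever
    UNION
    SELECT e.dst
    FROM edges AS e
    JOIN reachable AS r ON e.src = r.node
)
SELECT node FROM reachable ORDER BY node
"""


def reachable(start):
    """Return every node reachable from start, including start itself."""
    return [row[0] for row in conn.execute(REACHABLE_SQL, (start,))]


print(reachable("a"))
```

Unlike a tree, a general graph can contain cycles, which is why this variant uses UNION to deduplicate visited nodes.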

And Pinot has already proven to be pretty strong at benefiting from indexes...

hpvd commented 13 hours ago

From a strategic USP POV: being able to use Pinot efficiently for production graph use cases may give a new, unique solution that is hard to match in this combination by "real" graph DBs.

It would also be a great extension to anomaly detection (third eye)...

hpvd commented 13 hours ago

In the long run, one could also look into integrating approaches like RelGo (a kind of Calcite extension) to make even "Property Graph Queries" (PGQ) possible in a highly efficient manner (>10x). PGQs are now part of ISO SQL:2023 (part 16, SQL/PGQ).

For the RelGo approach, including benchmarks, see: Towards a Converged Relational-Graph Optimization Framework https://arxiv.org/html/2408.13480v1