oracle / pgql-lang

PGQL is an SQL-based query language for property graphs
https://pgql-lang.org/
Other
184 stars 44 forks source link

open cypher? #1

Closed tbrantb closed 8 years ago

tbrantb commented 8 years ago

The pgql spec has obvious similarities to the cypher query language. Instead of introducing another query language onto the graph db sector. Would it be possible to add some more weight behind the open cypher initiative? http://www.opencypher.org

Are there technical, legal, political reasons not to do so?

oskar-van-rest commented 8 years ago

Hi tbrantb,

I can only comment regarding the technical aspect.

There are several open issues with Cypher that PGQL tries to solve. Many of these issues have come forth from the work done over the past year by LDBC's GraphQL work force, which is a group of people from academia and industry (incl. Neo4j and Oracle) that are researching graph query langauges. Note that the work force does not endorse PGQL nor Cypher but gives independent recommendations for graph query language standardization. These recommendations were taken into account while designing PGQL.

Currently, the open issues with Cypher are as follows:

Pattern matching syntax in PGQL is indeed similar to Cypher ("ASCII-Art"). We found this a very natural way to specify patterns in graphs. However, we have tried to keep the rest of the syntax as close as possible to SQL because we believe this makes a lot of sense and is more appealing for existing SQL and SPARQL users.

Also, in the short term we plan to add additional functionality to PGQL that is currently not present in Cypher:

tbrantb commented 8 years ago

Thank you for your very informative response. This leads me to several things I'd like to research/learn.

On Jul 28, 2016 7:26 PM, "Oskar van Rest" notifications@github.com wrote:

Hi tbrantb,

I can only comment regarding the technical aspect.

There are several open issues with Cypher that PGQL tries to solve. Many of these issues have come forth from the work done over the past year by LDBC http://ldbcouncil.org/'s GraphQL work force, which is a group of people from academia and industry (incl. Neo4j and Oracle) that are researching graph query langauges. Note that the work force does not endorse PGQL nor Cypher but gives independent recommendations for graph query language standardization. These recommendations were taken into account while designing PGQL.

Currently, the open issues with Cypher are as follows:

-

The pattern matching semantic is not subgraph homomorphism https://en.wikipedia.org/wiki/Graph_homomorphism but a form of subgraph isomorphism https://en.wikipedia.org/wiki/Subgraph_isomorphism_problem that allows two query vertices to bind to the same data vertex (like homomorphism) but does not allow two query edges to bind to the same data edge. This means that certain SQL and SPARQL queries cannot be (naturally) expressed in Cypher. It also makes it more expensive to process queries and it especially blows up processing of recursive-style queries (e.g. -[:knows*]->) as Cypher would even enforce the semantic over the edges along paths or between edges from different conjunctive paths.

The (recursive-style) path finding semantic of Cypher is to find all trails (paths with non-repeating edges) between pairs of vertices. In addition, it supports single min-hop shortest path finding and all min-hop shortest path finding. The problem with finding all trails between a pair of vertices is that it has a runtime complexity that is exponential in the size of the graph and can thus easily blow up query processing even for small graphs. This problem does not exists for single shortest path finding. However, the kind of constraints that Cypher allows you to specify as part of a path pattern is very limited. For example, you cannot specify (inside the path pattern) that all the vertices along the path have the label Person or that all the edges along the path have a weight that is smaller than 100. As a result, you end specifying that you want to find all trails and filter out a valid shortest path afterwards, which is very expensive. Sometimes the query optimizer can take care of it and still process it efficiently, but this is generally very hard. Given that use cases typically don't require support for all trail finding, it is better that the query language supports reachability between pairs of vertices (like SPARQL) in combination with shortest path finding and at the same time allows for general filter expressions in path patterns that can be evaluated (efficiently) during traversal. Recursive-style path finding is the functionality that distinguishes graph query languages from SQL, so it is important to get it right.

Cypher only supports a subset of the class of Regular Path Queries (RPQs). This limits the user but also makes it hard to reason about the language from a theoretical perspective.

Cypher does not support query composition since subqueries are specified by chaining multiple queries such that the first queries have the WITH clause at the end while the last query has the RETURN clause at the end.

Pattern matching syntax in PGQL is indeed similar to Cypher ("ASCII-Art"). We found this a very natural way to specify patterns in graphs. However, we have tried to keep the rest of the syntax as close as possible to SQL because we believe this makes a lot of sense and is more appealing for existing SQL and SPARQL users.

  • PGQL has SELECT ... (FROM ...) WHERE ... while Cypher has MATCH ... WHERE ... WITH/RETURN
  • PGQL has separate SELECT and GROUP BY clauses while Cypher merges these two clauses into a single WITH/RETURN clause
  • PGQL has a HAVING clause while Cypher requires a subquery to filter out groups of solutions

Also, in the short term we plan to add additional functionality to PGQL that is currently not present in Cypher:

  • Graph construction and composition of queries that return graphs
  • Weighted shortest path finding
  • Multiple graphs as input to a query

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/oracle/pgql-lang/issues/1#issuecomment-236054513, or mute the thread https://github.com/notifications/unsubscribe-auth/AATQfCMwWm92XT6RLbp6UQhwqQfmTkWYks5qaTqSgaJpZM4JWt8t .