ArangoDB-Community / arangodb-tinkerpop-provider

An implementation of the Tinkerpop OLTP Provider for ArangoDB
Apache License 2.0

how to use this with cypher-for-gremlin? #55

Open vshesh opened 4 years ago

vshesh commented 4 years ago

Hi,

I like ArangoDB, but I need a pattern-matching type of query for a particular interface I'm designing. Cypher seems like one way to get that, so I'm looking to integrate it with this library and open a channel where I can write Cypher queries and get results from ArangoDB.

This is probably not something you intended to support, but I see no reason why it shouldn't work out of the box; I just can't figure out how to connect the two libraries through the console.

cypher-for-gremlin attaches to the :remote command in the Gremlin console. How do I write the remote command to use ArangoDB?

Something like :remote connect opencypher.gremlin conf/remote-objects.yaml translate gremlin (I think it would be translate gremlin33x in this case), from

https://github.com/opencypher/cypher-for-gremlin/tree/master/tinkerpop/cypher-gremlin-console-plugin

thanks.

dothebart commented 4 years ago

What do you hope to achieve by frankensteining Cypher queries onto ArangoDB? It doesn't sound like a clever idea to me in the first place.

arcanefoam commented 4 years ago

@vshesh Sorry for the late reply. If ArangoDB is running and Cypher uses a TinkerPop server (?), then the Arango TinkerPop driver should work "out of the box", as you say. What have you tried, and what is not working? Is this something you are still interested in?

vshesh commented 4 years ago

Thanks for responding! It's been a while on my end, but here's what I was asking for back then:

I'm working on an exploratory data analysis frontend, and I want to use ArangoDB as the database. To make that work, I need the ability to make arbitrary queries against the database to gather the data I want, and then I have a frontend I'm writing to visualize it in one of a few ways.

AQL is imperative in the sense that it requires me to write out the traversals: which node/key to start at, what direction to traverse, breadth vs. depth, and how to filter or prune paths. I also have to write my queries out as text, which is cumbersome because of syntax errors, etc.

I'm not particularly interested in Cypher as a solution, but I saw that Cypher has a declarative way of querying "give me data that looks like this" and that makes the UX of exploratory data analysis 100x better (can just ask for what I want without trying to remember AQL's syntax, or thinking about efficiency of queries, or what direction to set up the traversal).

I want to have a visual/graphical interface where the user can create a "graph" that represents their query. You can make a node, choose a type, set some properties, create links to other nodes, and that becomes your "query".

This would be impossible to do without some kind of MATCH clause in the query language, which Cypher has and AQL doesn't. I don't want to spend the time writing a translator from declarative MATCH-type queries to traversal-type queries; cypher-for-gremlin already does that. All that's left is to run the resulting Gremlin query against the ArangoDB database, which this TinkerPop provider is supposed to let me do.
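
As a rough illustration of the query-builder idea described above, here is a minimal Java sketch that turns a small node-and-link model into a Cypher MATCH string. The QueryGraph class, its methods, and the name property are hypothetical and not part of cypher-for-gremlin or this provider; the person, book, and reader names are borrowed from the graph configuration in this thread. It only builds the query text; running it would still need a working Cypher-to-Gremlin path.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of a visually built query graph; none of these names
// come from cypher-for-gremlin or the ArangoDB provider.
class QueryGraph {
    // variable name -> node pattern text, e.g. "p" -> "(p:person {name: 'bob'})"
    private final Map<String, String> nodePatterns = new LinkedHashMap<>();
    // relationship pattern texts, e.g. "(p)-[:reader]->(b)"
    private final List<String> linkPatterns = new ArrayList<>();

    // Add a node with a label (the vertex collection) and equality filters on properties.
    QueryGraph node(String var, String label, Map<String, String> props) {
        List<String> filters = new ArrayList<>();
        // TreeMap gives a stable property order regardless of the input map type.
        for (Map.Entry<String, String> e : new TreeMap<>(props).entrySet()) {
            filters.add(e.getKey() + ": " + quote(e.getValue()));
        }
        String body = var + ":" + label
                + (filters.isEmpty() ? "" : " {" + String.join(", ", filters) + "}");
        nodePatterns.put(var, "(" + body + ")");
        return this;
    }

    // Add a directed link between two previously declared node variables.
    QueryGraph link(String fromVar, String type, String toVar) {
        if (!nodePatterns.containsKey(fromVar) || !nodePatterns.containsKey(toVar)) {
            throw new IllegalArgumentException("link refers to an undeclared node");
        }
        linkPatterns.add("(" + fromVar + ")-[:" + type + "]->(" + toVar + ")");
        return this;
    }

    // Render the whole graph as one MATCH ... RETURN statement.
    String toCypher() {
        if (nodePatterns.isEmpty()) {
            throw new IllegalStateException("empty query graph");
        }
        List<String> patterns = new ArrayList<>(nodePatterns.values());
        patterns.addAll(linkPatterns);
        return "MATCH " + String.join(", ", patterns)
                + " RETURN " + String.join(", ", nodePatterns.keySet());
    }

    // Escape backslashes and single quotes for a Cypher string literal.
    private static String quote(String value) {
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    public static void main(String[] args) {
        String cypher = new QueryGraph()
                .node("p", "person", Map.of("name", "bob"))
                .node("b", "book", Map.of())
                .link("p", "reader", "b")
                .toCypher();
        System.out.println(cypher);
    }
}
```

For the example in main, this prints MATCH (p:person {name: 'bob'}), (b:book), (p)-[:reader]->(b) RETURN p, b. A real UI would probably pass values as query parameters rather than inlining literals.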


Another reason I want this capability is fast prototyping of web applications: the APIs aren't defined yet, so I don't want to invest in writing out the traversals. It would be much easier and faster to use and quickly modify a "give me data like this" query than a "start here, traverse here, filter these things, stop there" one.

Hopefully that explains my motivations.


As for what I have tried:

$ arangod
$ bin/gremlin.sh
gremlin> :install org.opencypher.gremlin cypher-gremlin-console-plugin 1.0.4
==>Loaded: [org.opencypher.gremlin, cypher-gremlin-console-plugin, 1.0.4] - restart the console to use [opencypher.gremlin]
gremlin> :install org.arangodb arangodb-tinkerpop-provider 2.0.2
==>Loaded: [org.arangodb, arangodb-tinkerpop-provider, 2.0.2] - restart the console to use [tinkerpop.arangodb]
gremlin> :q
$ bin/gremlin.sh
gremlin> :plugin use opencypher.gremlin
==>opencypher.gremlin activated
gremlin> :plugin use tinkerpop.arangodb
==> tinkerpop.arangodb activated 

I've confirmed that I can open the graph manually:

gremlin> g = ArangoDBGraph.open(new PropertiesConfiguration("conf/arango.yaml"))
LOADED
==>arangodbgraph[{"name":"test","vertices":{"person", "book"},"edges":{"knows", "author", "reader"},"relations":{"knows:person->person", "created:person->book", "reader:person->book"}}]

Now I'm confused about how to get the next part to work: how exactly do I use the ArangoDB TinkerPop provider in the remote call?

gremlin> :remote connect opencypher.gremlin conf/arango.yaml translate gremlin33x
==>Error during 'connect' - Can't construct a java object for tag:yaml.org,2002:org.apache.tinkerpop.gremlin.driver.Settings; exception=No single argument constructor found for class org.apache.tinkerpop.gremlin.driver.Settings
 in 'reader', line 1, column 1:
    gremlin.graph = com.arangodb.tin ...

Where arango.yaml looks like:

gremlin.graph = com.arangodb.tinkerpop.gremlin.structure.ArangoDBGraph
gremlin.arangodb.conf.graph.db = info
gremlin.arangodb.conf.graph.name = test
gremlin.arangodb.conf.graph.vertex = person
gremlin.arangodb.conf.graph.vertex = book
gremlin.arangodb.conf.graph.edge = knows
gremlin.arangodb.conf.graph.edge = author
gremlin.arangodb.conf.graph.edge = reader
gremlin.arangodb.conf.graph.relation = knows:person->person
gremlin.arangodb.conf.graph.relation = created:person->book
gremlin.arangodb.conf.graph.relation = reader:person->book
gremlin.arangodb.conf.arangodb.hosts = localhost
gremlin.arangodb.conf.arangodb.user = gremlin
gremlin.arangodb.conf.arangodb.password = gremlin
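
One possible reading of the error above, offered as a hedged note rather than a confirmed diagnosis: the file uses key = value lines, which is Java properties syntax despite the .yaml extension, whereas :remote connect passes its file to the Gremlin driver's YAML Settings loader, which describes a connection to a running Gremlin Server (for example hosts and port) rather than an embedded graph. The stdlib-only Java sketch below simply shows that the first lines of the file parse cleanly as properties; the ConfigFormatCheck class is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, for illustration only. java.util.Properties is used here
// because it ships with the JDK; the console session in this thread loads the
// file through Commons Configuration's PropertiesConfiguration instead.
class ConfigFormatCheck {
    // Parse text written as "key = value" lines into a Properties object.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // The first three lines of the file shown in this thread.
        String text = String.join("\n",
                "gremlin.graph = com.arangodb.tinkerpop.gremlin.structure.ArangoDBGraph",
                "gremlin.arangodb.conf.graph.db = info",
                "gremlin.arangodb.conf.graph.name = test");
        Properties props = parse(text);
        System.out.println(props.getProperty("gremlin.graph"));
        System.out.println(props.getProperty("gremlin.arangodb.conf.graph.name"));
    }
}
```

If this reading is right, the remote command would need a separate driver-style YAML file pointing at a Gremlin Server that hosts the ArangoDB graph, while the properties file stays as the graph configuration on the server side.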

I'm probably doing something silly, so if someone could point me in the right direction I'd really appreciate it. It's been a long time, so I don't remember everything I tried. I think I tried setting up a Gremlin Server and connecting to that, and I tried using the two libraries directly by digging into them and writing Java code, but I couldn't get anything to work.

I'm looking for an endpoint where I can send Cypher queries (or something similarly declarative) and get back responses from the ArangoDB server.

Thanks for your help!

vshesh commented 4 years ago

I remembered what I tried to do manually:

gremlin> g = ArangoDBGraph.open(new PropertiesConfiguration("conf/arango.yaml")).traversal(CypherTraversalSource.class)
gremlin> g.V("person/bob")
==>v[person.bob]
gremlin> g.cypher("MATCH (p:person) RETURN p")
gremlin>

Somehow the Cypher query doesn't seem to parse, and it's unclear why it fails. It doesn't throw an error; it just returns zero records every time.

If I can get this working, then maybe I could run a Gremlin Server with g.cypher(...) as an endpoint somehow (although it's not clear how to set that up either).

limowreck00 commented 2 years ago

vshesh has a valuable use case which makes perfect sense. Using ArangoDB this way would allow the user to search, filter, traverse, and visualize the entire graph in a custom UI.

I am currently trying to do this in a Jupyter notebook and have hit a brick wall, and I cannot use anything publicly hosted (e.g. AWS Neptune). Basically, there seems to be no way to replicate the AWS library below using ArangoDB. Is there?

https://github.com/aws/graph-notebook