ArangoDB-Community / arangodb-tinkerpop-provider

An implementation of the Tinkerpop OLTP Provider for ArangoDB
Apache License 2.0

add edge labels on schemaless gremlin graph #54

Closed henzesberger closed 3 years ago

henzesberger commented 4 years ago

Hi,

When running a simple query to create two new nodes and connect them with a special edge, I get an error that the edge doesn't exist.

gremlin> g.addV().addE("special-edge").to(g.addV())
Edge label (special-edge)not in graph (graph) edge collections.

The error occurs when the traversal engine calls the addEdge method of ArangoDBVertex.

I want a clean Gremlin interface, so I'd hope that the provider could just add the edge label if it doesn't exist yet (I see that there is already an ArangoDBGraph.schemaless flag).

arcanefoam commented 4 years ago

Hi, What do you mean by "clean" interface?

If you don't provide the desired vertex and edge labels via configuration, the default labels are vertex and edge. This is the "Schemaless" mode.

Due to how ArangoDB Graphs work, the graph's vertex and edge collections need to exist and be known during graph creation. As such, it is not possible to use "dynamic" vertex/edge labels once the graph has been created. Thus, in your case you would need to provide the edge label in the configuration:

gremlin.arangodb.conf.graph.edge = special-edge
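
As a rough sketch, the same setting could also be assembled programmatically and written out in properties format. Only the key gremlin.arangodb.conf.graph.edge is taken from this thread; the class name EdgeLabelConfig and the helper withEdgeLabel are hypothetical, and how the provider loads the resulting properties is not shown here.

```java
import java.util.Properties;

public class EdgeLabelConfig {
    // Key quoted in this thread; everything else in this class is illustrative.
    static final String EDGE_KEY = "gremlin.arangodb.conf.graph.edge";

    // Hypothetical helper: returns properties that declare one edge label up front.
    static Properties withEdgeLabel(String label) {
        Properties props = new Properties();
        props.setProperty(EDGE_KEY, label);
        return props;
    }

    public static void main(String[] args) throws Exception {
        Properties props = withEdgeLabel("special-edge");
        // Writes the entry in .properties format to standard output.
        props.store(System.out, "Edge label declared before graph creation");
    }
}
```

The point of the sketch is only that the label has to be fixed in configuration before the graph is created; it cannot be supplied later by the traversal itself.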

henzesberger commented 4 years ago

I mean that I want an interface that supports Gremlin traversals out of the box without any vendor-specific code. The configuration can depend on the database, but it shouldn't depend on the data.

Are there any plans to support this dynamic mode? Without it, I guess that most Gremlin traversals wouldn't work unless the labels are configured up front.

I don't know the labels at graph creation time and also want to keep it that way.

arcanefoam commented 4 years ago

From the ArangoDB docs:

Named Graphs

Named graphs are completely managed by ArangoDB, and thus also visible in the web interface. They use the full spectrum of ArangoDB’s graph features. You may access them via several interfaces.

Anonymous graphs

Sometimes you may not need all the powers of named graphs, but some of its bits may be valuable to you. ... Anonymous graphs don’t have edge definitions describing which vertex collection is connected by which edge collection. The graph model has to be maintained in the client side code. This gives you more freedom than the strict named graphs.

When to choose anonymous or named graphs?

As noted above, named graphs ensure graph integrity, both when inserting or removing edges or vertices. So you won’t encounter dangling edges, even if you use the same vertex collection in several named graphs. This involves more operations inside the database which come at a cost. Therefore anonymous graphs may be faster in many operations. So this question may be narrowed down to: ‘Can I afford the additional effort or do I need the warranty for integrity?’.

The Tinkerpop implementation relies on Named Graphs so that we can guarantee the integrity of the graph.

In order to provide a "dynamic mode" we would need to write all the integrity logic in the Tinkerpop code in order to use Anonymous Graphs. This would require considerable effort. I would advise opening a new issue as a feature request: "Provide an AnonymousArangoDBGraph implementation to support generic gremlin traversals without requiring a schema". You are welcome to provide a patch for this, or encourage input from other community members to chip in :).
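
To give a feel for the kind of integrity logic that would move to the client side, here is a minimal in-memory sketch. It is not provider code and does not touch ArangoDB; the class AnonymousGraphSketch and all of its methods are hypothetical. It accepts any edge label, but it has to check edge endpoints itself and clean up incident edges when a vertex is removed, which is what Named Graphs otherwise do for us.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AnonymousGraphSketch {
    // A directed edge with a free-form label (no edge definitions).
    static final class Edge {
        final String label;
        final String from;
        final String to;

        Edge(String label, String from, String to) {
            this.label = label;
            this.from = from;
            this.to = to;
        }
    }

    private final Set<String> vertices = new HashSet<>();
    private final List<Edge> edges = new ArrayList<>();

    void addVertex(String id) {
        vertices.add(id);
    }

    // Any label is accepted, but both endpoints must already exist.
    Edge addEdge(String label, String from, String to) {
        if (!vertices.contains(from) || !vertices.contains(to)) {
            throw new IllegalArgumentException("Unknown endpoint: " + from + " -> " + to);
        }
        Edge edge = new Edge(label, from, to);
        edges.add(edge);
        return edge;
    }

    // Removing a vertex also removes its incident edges, so none are left dangling.
    void removeVertex(String id) {
        if (vertices.remove(id)) {
            edges.removeIf(e -> e.from.equals(id) || e.to.equals(id));
        }
    }

    int edgeCount() {
        return edges.size();
    }

    public static void main(String[] args) {
        AnonymousGraphSketch graph = new AnonymousGraphSketch();
        graph.addVertex("a");
        graph.addVertex("b");
        graph.addEdge("special-edge", "a", "b"); // label was never declared anywhere
        System.out.println(graph.edgeCount());   // prints 1
        graph.removeVertex("a");
        System.out.println(graph.edgeCount());   // prints 0
    }
}
```

A real implementation would have to provide the same guarantees against the database, across collections and concurrent writers, which is where most of the effort would go.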

arcanefoam commented 3 years ago

Closing this as an answer was given to the question.