ArangoDB-Community / arangodb-tinkerpop-provider

An implementation of the Tinkerpop OLTP Provider for ArangoDB
Apache License 2.0

[Questions] On loading GraphSON encoded graph. #58

Closed MartinBrugnara closed 3 years ago

MartinBrugnara commented 4 years ago

Hi all, I am trying to load a GraphSON-encoded graph into ArangoDB. The loader fails, complaining that the labels are not in the graph's collections.

My questions:

ArangoDB: 3.0.6
arangodb-tinkerpop-provider: 2.0.2

This is the code I am using to import the graph:

String ds = "/data/tinkerpop-modern.json";
ArangoDBConfigurationBuilder builder = new ArangoDBConfigurationBuilder();
// Only the graph name is configured; no vertex or edge collections are declared.
Graph g = GraphFactory.open(builder.graph("MAIN_GRAPH").build());
GraphSONReader.build().create().readGraph(new FileInputStream(new File(ds)), g);
// Tested this too:
// g.io(IoCore.graphson()).readGraph(ds);

The following is an example of the error I get:

Exception in thread "main" java.lang.IllegalArgumentException: Vertex label (person) not in graph (MAIN_GRAPH) vertex collections.

Strangely enough, this is not deterministic: it does not always complain about the same vertex label, and sometimes it complains about edge labels too.

Exception in thread "main" java.lang.IllegalArgumentException: Edge label (knows)not in graph (MAIN_GRAPH) edge collections.

arcanefoam commented 3 years ago

Hi. In the ArangoDB TinkerPop implementation, labels are modelled as collections, not as a vertex/edge property. The reasons for this were:

  1. The label defines the type of the vertex/edge, so a natural way to store this was using one collection per type.
  2. Using labels for types allowed the implementation to rely on Arango's native graph support.

However, a side effect of this is that the implementation loses flexibility, as ArangoDB graphs require that the vertex and edge collections are known at graph creation.

A solution would be to use Anonymous Graphs, but that would require the TinkerPop provider to be responsible for graph integrity, which would mean a major redesign and significant development effort.

The non-deterministic effect you see might be due to how the GraphSONReader processes the information, but it is not related to the missing schema information.
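
To make the constraint concrete, here is a minimal sketch in plain Java of a graph whose label collections are fixed at creation time. It is a toy model and not the provider's code: the class FixedSchemaGraph and its methods are hypothetical names invented for illustration, and only the error message format is copied from the exceptions reported above. It shows why a graph created without any declared collections rejects every label the reader hands it, while a graph created with the labels of the TinkerPop "modern" dataset (person, software, knows, created) accepts them.

```java
import java.util.Set;

/** Toy model (hypothetical, not the provider's code): label collections are fixed at creation. */
final class FixedSchemaGraph {
    private final String name;
    private final Set<String> vertexCollections;
    private final Set<String> edgeCollections;
    private int vertexCount = 0;
    private int edgeCount = 0;

    FixedSchemaGraph(String name, Set<String> vertexCollections, Set<String> edgeCollections) {
        this.name = name;
        // Defensive copies: the schema cannot change after the graph is created.
        this.vertexCollections = Set.copyOf(vertexCollections);
        this.edgeCollections = Set.copyOf(edgeCollections);
    }

    /** Accepts a vertex only if its label matches a collection declared up front. */
    void addVertex(String label) {
        if (!vertexCollections.contains(label)) {
            throw new IllegalArgumentException(String.format(
                "Vertex label (%s) not in graph (%s) vertex collections.", label, name));
        }
        vertexCount++;
    }

    /** Accepts an edge only if its label matches a collection declared up front. */
    void addEdge(String label) {
        if (!edgeCollections.contains(label)) {
            throw new IllegalArgumentException(String.format(
                "Edge label (%s) not in graph (%s) edge collections.", label, name));
        }
        edgeCount++;
    }

    int vertexCount() { return vertexCount; }

    int edgeCount() { return edgeCount; }

    public static void main(String[] args) {
        // No collections declared: the first label encountered is rejected.
        FixedSchemaGraph empty = new FixedSchemaGraph("MAIN_GRAPH", Set.of(), Set.of());
        try {
            empty.addVertex("person");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }

        // Collections declared up front: the same labels are accepted.
        FixedSchemaGraph modern = new FixedSchemaGraph(
            "MAIN_GRAPH", Set.of("person", "software"), Set.of("knows", "created"));
        modern.addVertex("person");
        modern.addVertex("software");
        modern.addEdge("created");
        System.out.println(modern.vertexCount() + " vertices and "
            + modern.edgeCount() + " edge accepted");
    }
}
```

With the provider itself, the analogous step would be to declare every vertex and edge collection on the configuration builder before the graph is opened and before the reader is run. The project README describes builder methods for this (for example withVertexCollection, withEdgeCollection and configureEdge); treat those exact names as something to verify against the provider version in use.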