ArangoDB-Community / arangodb-tinkerpop-provider

An implementation of the Tinkerpop OLTP Provider for ArangoDB
Apache License 2.0
83 stars 16 forks source link

ArangoDBConfigurationBuilder builds collection with graph-name prefix #40

Closed fdominik closed 5 years ago

fdominik commented 5 years ago

Hi, I have found, that if I configure the builder like this:

ArangoDBConfigurationBuilder builder = new ArangoDBConfigurationBuilder();
    builder.dataBase("analytics");
    builder.graph("graph")
        .withVertexCollection("document")
        .withVertexCollection("person")
        .withEdgeCollection("found_at")
        .configureEdge("found_at", "person", "document");

and later initiate it using:

BaseConfiguration conf = builder.build();
    Graph graph = GraphFactory.open(conf);

The collecitons mentioned above will be created using graph prefix. However I have already collections _document, person, foundat at my ArangoDB instance, but they have been created prior to using the Tinkerpop implementation and don't contain the prefix.

Would it be possible, that the tinkerpop provider would check the existence of collection also for "without_prefix" (maybe some configuration of the builder?), and use these collections if they are present? Because now it would mean, that I can't easily migrate from REST API over AQL to Javascript Gremlin client which would connect directly to it. (because I would have to refactor all AQLs and clients not to use 'foundat' as a collection name and type of the object, but add 'graph' prefix to it... :(

If you would help me where such change should occur I may help you with a Pull Request, but I am not yet orienting in the code of the tinkerpop provider.

arcanefoam commented 5 years ago

Hello,

The graph prefix was added so multiple graphs with conflicting collections could exist in the same database. But your case seems a good reason for allowing this behaviour to be configurable.

If you want to provide a patch, we would need both a configuration flag in the ArangoDBGraph class (an a setting method in the Builder) and some fixes in the code.

The code part is trickier. In the ArangoDBUtil class there is a 'getCollectioName' method that is responsible for the _ concatenation. Since this is a static method, probably the best thing is to move it to the ArangoDBGraph class so we have access to the new flag and know if we should prepend the graph name or not. Cant be 100% sure, but it might be case that not in all the places where the 'getCollectioName' method is called we have access to the graph instance.

Let me know if with this pointers you will attempt the patch so I dont work in parallel.

fdominik commented 5 years ago

Hmm interesting, because our project is leveraging Graphs, but we are accessing still the same collections. E.g. one is for generic search of users, but other Graph is more to search for how the vertices and edges were created (we create them automatically by NLP tools from texts...). So conflicting collections is not our case, this is even intended behaviour. If conflicting collections happen, I would rather use different ArangoDB for that case.

The pointer you gave me look legitimate, so I think I might be able to work on it. I have time today, so will try it and you can then check it :)

fdominik commented 5 years ago

Just a question, why would you put it to ArangoDBGraph? I am looking at the source code and not all Classes which are using the ArangoDBUtil.getCollectioName(String, String) have the instance of ArangoDBGraph. E.g. ArangoDBVertex in remove() method doesn't have any reference to ArangoDBGraph and in my opinion even shouldn't have. The same is, that ArangoDBUtil also has 11 usages of the getCollectionName , all these usages are also from static method. So I would probably just create an optional 3rd parametr saying if to make a prefix or not? Hmmm not sure now.

Edit: after getting into more details I even found, that some of the calls to getCollectionName will always be prefixed (e.g. the one for GRAPH_VARIABLES_COLLECTION or other)...

arcanefoam commented 5 years ago

Hi, The GRAPH_VARIABLES_COLLECTION is to store graph properties, which are not directly related to nodes and edges so I think this will not cause issues for your scenario.

I'll take a look at the patch and come back to you.

arcanefoam commented 5 years ago

This has been added to version 2.0.2