neo4j-contrib / spatial

Neo4j Spatial is a library of utilities for Neo4j that faciliates the enabling of spatial operations on data. In particular you can add spatial indexes to already located data, and perform spatial operations on the data like searching for data within specified regions or within a specified distance of a point of interest. In addition classes are provided to expose the data to geotools and thereby to geotools enabled applications like geoserver and uDig.
http://neo4j-contrib.github.io/spatial
Other
781 stars 192 forks source link

More than one element Exception when using R-Tree index to insert WKT #308

Open kanghq opened 7 years ago

kanghq commented 7 years ago

When I using spatial.addWKT procedure to add WKT objects using multi threads in HA 4 cluster environment, the error occurred.

Neo4j 3.1.4 Enterprise HA 4 cluster RHEL7 Neo4j Spatial Plugin

geomencoder_config wkt ctime 1495172338674 geomencoder org.neo4j.gis.spatial.WKTGeometryEncoder layer_class org.neo4j.gis.spatial.EditableLayerImpl layer geom

org.neo4j.driver.v1.exceptions.ClientException: Failed to invoke procedure spatial.addWKT: Caused by: java.util.NoSuchElementException: More than one element in org.neo4j.cypher.internal.javacompat.ExecutionResult$ExceptionConversion@196a0242. First element is 'Node[179312]' and the second element is 'Node[208620]' at org.neo4j.driver.internal.connector.socket.SocketResponseHandler.handleFailureMessage(SocketResponseHandler.java:68) at org.neo4j.driver.internal.messaging.PackStreamMessageFormatV1$Reader.unpackFailureMessage(PackStreamMessageFormatV1.java:464) at org.neo4j.driver.internal.messaging.PackStreamMessageFormatV1$Reader.read(PackStreamMessageFormatV1.java:425) at org.neo4j.driver.internal.connector.socket.SocketClient.receiveOne(SocketClient.java:130) at org.neo4j.driver.internal.connector.socket.SocketConnection.receiveOne(SocketConnection.java:143) at org.neo4j.driver.internal.connector.ConcurrencyGuardingConnection.receiveOne(ConcurrencyGuardingConnection.java:164) at org.neo4j.driver.internal.pool.PooledConnection.receiveOne(PooledConnection.java:187) at org.neo4j.driver.internal.InternalStatementResult.consume(InternalStatementResult.java:259) at org.neo4j.jdbc.bolt.BoltPreparedStatement.executeUpdate(BoltPreparedStatement.java:65) at org.neo4j.jdbc.bolt.BoltPreparedStatement.execute(BoltPreparedStatement.java:75) at com.hqkang.SparkApp.core.DBHelper$2.call(DBHelper.java:161) at com.hqkang.SparkApp.core.DBHelper$2.call(DBHelper.java:1) at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:219) at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:219) at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:925) at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:925) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)

craigtaverner commented 7 years ago

Is it possible to get the server logs and see if there are any server side errors correlated to this client side error?

kanghq commented 7 years ago

It looks like related to AutoIndexing?

Similar issue https://stackoverflow.com/questions/20797956/neo4j-cypher-query-returning-wrong-nodes

craigtaverner commented 7 years ago

I don't believe there is any autoindexing in the spatial library. Also the issue you linked to above is using the embedded API to access an auto-index with the Cypher 'START' clause, while your test is using Bolt to remotely access the spatial procedure layer.addWKT. The 'index' used there is not really an index, but a manually maintained graph structure. The nodes need to be manually added and manually removed from the structure using the layer.add methods. I still think we will only be able to figure out what is going on with server side logs, or better yet, a replicated test environment. Could you provide and sample code for reproducing this error?

kanghq commented 7 years ago

Hi, I figured out that error. I forgot to add the constraint CREATE CONSTRAINT ON (n:ReferenceNode) ASSERT n.name IS UNIQUE

When calling layer.addWKT, it calls getSpatialRoot function. In high concurrency scenario, there are two spatial_root nodes created. that's the error reason. I simply added that constraint and changed getReferenceNode to

public static Node getReferenceNode(GraphDatabaseService db, String name) {

        if (db != dbRef) {
            ReferenceNodes.dbRef = db;
        }
        Node res = null;
        Transaction tx = db.beginTx();
        try {

            Result result = db.execute("MERGE (ref:ReferenceNode {name:{name}}) RETURN ref", map("name", name));
            res = Iterators.single(result.<Node>columnAs("ref"));
            tx.success();

        } finally {
            tx.close();
        }
        return res;
    }

This bug has resolved. However I could not find a good way to add this constrain automatically.

craigtaverner commented 7 years ago

That was a very interesting analysis. Neo4j Spatial was not originally written for high concurrency scenarios, but it is great if it can slowly be made more concurrency safe. I am curious though if the changes to getReferenceNode are actually required? What happened when you had the uniqueness constraint, but not the extra transaction in that code?

The db.execute will create a transaction internally, and if a node is created it will be committed before the db.execute returns. With your change, the transaction is now created externally in your code, and the node is only created (transaction committed) when you call tx.close. The difference seems very subtle and I am not sure I see why it would make it more concurrency safe. Do you know?