neo4j-labs / neosemantics

Graph+Semantics: Import/Export RDF from Neo4j. SHACL Validation, Model mapping and more.
https://neo4j.com/labs/neosemantics/
Apache License 2.0

BUG: Neo4j loses triples; running partial commits leads to errors #297

Open AnkurArohi opened 1 year ago

AnkurArohi commented 1 year ago

After Neo4j starts properly and the initial Cypher queries have run, a partial transaction is rolled back and the triples in it are lost.

This raises an exception, and the task Neo4j was configured to do is left incomplete.

With version 4.4.4 we didn't face this issue.

Steps to reproduce

Start the docker container with the correct settings for the neo4j DB and the following plugins

Neo4j-Apoc: 5.3.1-extended, apoc-5.3.0-core.jar, neosemantics-5.1.0.0.jar

and the following environment vars

server.directories.data=/dbdata
server.jvm.additional=-Dlog4j2.formatMsgNoLookups=true
server.jvm.additional=-Dlog4j2.disable.jmx=true
server.jvm.additional=-XX:+ExitOnOutOfMemoryError
server.directories.plugins=/plugins
server.config.strict_validation.enabled=false
server.bolt.advertised_address=:7687
server.bolt.listen_address=:7687
dbms.security.procedures.unrestricted=algo.,apoc.
db.tx_log.rotation.retention_policy=false
server.memory.heap.initial_size=10g
server.memory.heap.max_size=10g
server.memory.pagecache.size=4g
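The settings above are written in neo4j.conf form. When they are passed to a container as environment variables, the official Neo4j Docker image expects a different spelling: a `NEO4J_` prefix, dots replaced by underscores, and existing underscores doubled. The following is a minimal sketch of that renaming, assuming the official image's convention applies to this setup; the helper name `to_docker_env` is hypothetical and only a subset of the settings is shown.

```python
def to_docker_env(setting: str) -> tuple[str, str]:
    """Convert one 'key=value' neo4j.conf line into a (NAME, value) env-var pair."""
    # Split on the first '=' only, so values such as
    # '-Dlog4j2.disable.jmx=true' keep their own '=' intact.
    key, _, value = setting.partition("=")
    # Double the existing underscores first, then turn dots into underscores.
    name = "NEO4J_" + key.replace("_", "__").replace(".", "_")
    return name, value


settings = [
    "server.directories.data=/dbdata",
    "server.memory.heap.max_size=10g",
    "db.tx_log.rotation.retention_policy=false",
    "server.jvm.additional=-Dlog4j2.disable.jmx=true",
]

env = dict(to_docker_env(s) for s in settings)
for name, value in env.items():
    print(f"{name}={value}")
```

Note that `server.jvm.additional` appears three times in the list above. An environment variable name can only be set once, so repeated keys would overwrite each other in a dict like this one and would need to be combined into a single value.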

Expected behavior

Actual behavior

Triples are lost and an exception is raised.

Additionally, include (as appropriate) log-files, stacktraces, and other debug output.

2023-02-09 08:30:54.523+0000 ERROR Problems when running partial commit. Partial transaction rolled back. 3887 triples lost.
org.neo4j.kernel.DeadlockDetectedException: ForsetiClient[transactionId=676, clientId=20] can't acquire UpdateLock{owners=ForsetiClient[transactionId=676, clientId=20], ForsetiClient[transactionId=650, clien
Wait list:SharedLock[
Client[676] waits for [ForsetiClient[transactionId=650, clientId=22]],
Client[650] waits for [ForsetiClient[transactionId=676, clientId=20]]]
    at org.neo4j.kernel.impl.locking.forseti.ForsetiClient.waitFor(ForsetiClient.java:839) ~[neo4j-lock-5.3.0.jar:5.3.0]
    at org.neo4j.kernel.impl.locking.forseti.ForsetiClient.tryUpgradeToExclusiveWithShareLockHeld(ForsetiClient.java:787) ~[neo4j-lock-5.3.0.jar:5.3.0]
    at org.neo4j.kernel.impl.locking.forseti.ForsetiClient.tryUpgradeSharedToExclusive(ForsetiClient.java:750) ~[neo4j-lock-5.3.0.jar:5.3.0]
    at org.neo4j.kernel.impl.locking.forseti.ForsetiClient.acquireExclusive(ForsetiClient.java:339) ~[neo4j-lock-5.3.0.jar:5.3.0]
    at org.neo4j.kernel.impl.api.parallel.ParallelAccessCheck$1.acquireExclusive(ParallelAccessCheck.java:62) ~[neo4j-kernel-5.3.0.jar:5.3.0]
    at org.neo4j.internal.recordstorage.RecordStorageLocks.acquireNodeLabelChangeLock(RecordStorageLocks.java:174) ~[neo4j-record-storage-engine-5.3.0.jar:5.3.0]
    at org.neo4j.kernel.impl.newapi.Operations.nodeAddLabel(Operations.java:382) ~[neo4j-kernel-5.3.0.jar:5.3.0]
    at org.neo4j.kernel.impl.core.NodeEntity.addLabel(NodeEntity.java:317) ~[neo4j-kernel-5.3.0.jar:5.3.0]
    at n10s.rdf.load.DirectStatementLoader.lambda$runPartialTx$1(DirectStatementLoader.java:63) ~[neosemantics-5.1.0.0.jar:5.1.0.0]
    at java.lang.Iterable.forEach(Iterable.java:75) ~[?:?]
    at
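To gauge how much data is affected across a whole log file, the "triples lost" messages can be tallied. This is a minimal sketch that assumes every rollback is logged with the same wording as the line above; the function name `lost_triples` and the second sample line are illustrative only and not taken from the actual logs.

```python
import re

# Matches the message wording seen in the log excerpt above.
LOST_RE = re.compile(r"Partial transaction rolled back\. (\d+) triples lost")


def lost_triples(log_lines):
    """Return one count per rollback event found in the given log lines."""
    counts = []
    for line in log_lines:
        match = LOST_RE.search(line)
        if match:
            counts.append(int(match.group(1)))
    return counts


sample = [
    "2023-02-09 08:30:54.523+0000 ERROR Problems when running partial commit. "
    "Partial transaction rolled back. 3887 triples lost.",
    # Illustrative filler line, not from the real log.
    "2023-02-09 08:30:55.000+0000 INFO some unrelated message",
]

counts = lost_triples(sample)
print(f"{len(counts)} rollback(s), {sum(counts)} triples lost in total")
```

For a real run, the list could be replaced by the open log file object, since iterating over a file yields its lines.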

AnkurArohi commented 1 year ago

Reference : https://github.com/neo4j/neo4j/issues/13039

AnkurArohi commented 1 year ago

Can someone please help with this issue? Thanks.

jbarrasa commented 1 year ago

Hi @AnkurArohi, apologies for the late reply. Could you please share the Cypher script you're running (including the RDF data) so that we can try to reproduce the issue?

Thanks.

AnkurArohi commented 1 year ago

@jbarrasa We have multiple Cypher scripts, and I am not sure which one leads to this error, but here is one example:

# Create the uniqueness constraint on Resource.uri that n10s requires.
ne.run_query(
    """ CREATE CONSTRAINT n10s_unique_uri IF NOT EXISTS
    FOR (r:Resource) REQUIRE r.uri IS UNIQUE """,
    log=log,
)
if len(ne.run_query("MATCH (n:_GraphConfig) RETURN id(n)", log=log)) == 0:
    ne.run_query(
        """
CALL n10s.graphconfig.init({typesToLabels: false,
                            nodeCacheSize:100000,
                            verifyUriSyntax:false,
                            commitSize:10000})""",
        log=log,
    )

AnkurArohi commented 1 year ago

ne.run_query(
    """
call apoc.periodic.iterate("MATCH (n)
WHERE NOT n:Ontology
 AND  NOT n:_GraphConfig
 RETURN n",
"DETACH DELETE n",
{batchSize:10000})
yield batches, total return batches, total
""",
    log=log,
)