graphfoundation / ongdb

ONgDB is an independent fork of Neo4j® Enterprise Edition version 3.4.0.rc02 licensed under AGPLv3 and/or Community Edition licensed under GPLv3
https://www.graphfoundation.org/projects/ongdb/

Client Deadlocks #108

Open duhmojo opened 1 month ago

duhmojo commented 1 month ago


I have a student who implemented a strange Promise pattern that didn't do what he thought it did, and the result was a DB deadlock:

Neo4jError: ForsetiClient[9] can't accquire ExclusiveLock{owner=ForsetiClient[2]} on NODE(25474), because holders of that lock are waiting for ForsetiClient[9].
 Wait list:ExclusiveLock[Client[2] waits for [9]]

The 13 queries he was firing take a variable amount of time, but they are typically user-facing queries, so they aren't expected to be long-running. For example, if he were running 5 queries that read and write at the same time, and each would normally take 5 seconds, I'd expect some kind of limit.

So I should probably implement deadlock error handling with retries in my backend service so that, within reason, it can still field the write queries, correct?
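
A minimal sketch of such a retry handler follows. It assumes the driver error exposes a `code` property and that the server reports deadlocks with the status code `Neo.TransientError.Transaction.DeadlockDetected`, as Neo4j 3.x does; this has not been verified against ONgDB. The helper name `withDeadlockRetry`, the attempt limit, and the delays are all hypothetical choices, not part of any driver API.

```typescript
// Status code assumed for a detected deadlock (a transient, retryable error).
const DEADLOCK_CODE = "Neo.TransientError.Transaction.DeadlockDetected";

// Minimal shape of a driver error; only `code` is inspected here.
interface CodedError {
  code?: string;
}

function isDeadlock(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as CodedError).code === DEADLOCK_CODE
  );
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: runs `work` and retries it only when it fails with a
// deadlock error. Any other error, or a deadlock on the last attempt, is rethrown.
async function withDeadlockRetry<T>(
  work: () => Promise<T>,
  maxAttempts: number = 5,
  baseDelayMs: number = 50,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await work();
    } catch (err) {
      if (!isDeadlock(err) || attempt >= maxAttempts) {
        throw err;
      }
      // Exponential backoff with jitter, so the retried writers are less
      // likely to collide again at the same moment.
      const delay =
        baseDelayMs * Math.pow(2, attempt - 1) * (0.5 + Math.random());
      await sleep(delay);
    }
  }
}
```

A write would then be wrapped as `withDeadlockRetry(() => runWrite(cypher, params))`, where `runWrite` stands for whatever function the backend already uses to execute a statement. The whole unit of work has to be re-run on retry, because the server rolls back the transaction it picked as the deadlock victim. Depending on the driver version, its managed transaction functions may already retry transient errors, in which case a wrapper like this is only needed for auto-commit queries.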

Is there anything else I should consider? Maybe tune or configure ONgDB somehow?

How does the ONgDB/Neo4j query pointer work? Is a deadlock only possible when more than one query is trying to update the same nodes (which shouldn't be the case here)? Or when more than one query is trying to update anything at all, or anything with the same label/type? Can I read unfettered, while writes are subject to deadlocking/blocking? What's the expectation here?

Thank you.

duhmojo commented 1 month ago

I don't know how to reproduce the problem with my own query. I can't really reproduce my student's because it's part of an application that has data, relationships, etc.

I wrote this and executed it 20 times in a loop without waiting. The app fires them off and still isn't done. The DB is using a lot of CPU, but my server/application would be fine, if slower, even with the high Java/DB CPU.
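
For reference, "in a loop without waiting" can be sketched as below: every copy of the statement is started immediately and the outcomes are tallied once all have settled. `RunQuery` is a hypothetical stand-in for whatever function executes one Cypher statement (for example a wrapper around the driver's `session.run`); `fireConcurrently` and `Tally` are illustrative names as well.

```typescript
// Hypothetical stand-in for the function that executes one Cypher statement.
type RunQuery = (cypher: string) => Promise<unknown>;

interface Tally {
  succeeded: number;
  failed: number;
  errors: unknown[];
}

// Starts `copies` executions of the same statement without awaiting any of
// them inside the loop, then waits for all of them and reports the outcomes.
async function fireConcurrently(
  runQuery: RunQuery,
  cypher: string,
  copies: number,
): Promise<Tally> {
  const tally: Tally = { succeeded: 0, failed: 0, errors: [] };
  const pending: Promise<void>[] = [];

  for (let i = 0; i < copies; i++) {
    pending.push(
      runQuery(cypher).then(
        () => {
          tally.succeeded++;
        },
        (err: unknown) => {
          // Keep the error so deadlocks can be told apart from other failures.
          tally.failed++;
          tally.errors.push(err);
        },
      ),
    );
  }

  await Promise.all(pending);
  return tally;
}
```

Collecting the errors rather than letting the first rejection abort the run makes it possible to see how many of the 20 executions, if any, ended in a deadlock.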

UNWIND range(1, 1000) AS i
    MERGE (p:TEST {name: 'Test ' + i, age: i % 100})
    SET p.test = i, p.blah = 'abc'
    WITH p, i
    MATCH (n:TEST {name: 'Test ' + ((i % 1000) + 1)})
    MERGE (p)-[t:TEST]->(n)
    WITH p
    SET p.age = toInteger(rand() * 50) + 20;

Under what conditions would a Deadlock arise?

duhmojo commented 1 month ago

         * Check if anyone holding this lock is currently waiting for the specified client. This
         * check is performed continuously while a client waits for a lock - if the check ever
         * comes back positive, it means we've deadlocked, because we are waiting for someone
         * (the holder of the lock) who in turn is waiting for us (so they won't release the lock).
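
The check described in that comment can be modelled roughly as follows. This is a simplified illustration, not the actual Forseti implementation: the wait-for relation is kept in a plain map, and `holdersWaitFor`, `WaitGraph`, and `ClientId` are hypothetical names. The example values mirror the error above, where client 9 wants a lock held by client 2 while client 2 is already waiting for client 9.

```typescript
type ClientId = number;

// waitsFor.get(a) is the set of clients that client `a` is currently blocked on.
type WaitGraph = Map<ClientId, Set<ClientId>>;

// Returns true if any holder of the wanted lock is waiting, directly or
// through a chain of other waiting clients, for `client`. If so, `client`
// blocking on this lock would close a cycle, which is a deadlock.
function holdersWaitFor(
  client: ClientId,
  holders: ClientId[],
  waitsFor: WaitGraph,
): boolean {
  const seen = new Set<ClientId>();
  // A client re-acquiring a lock it already holds is not waiting on itself.
  const stack: ClientId[] = holders.filter((h) => h !== client);

  while (stack.length > 0) {
    const current = stack.pop() as ClientId;
    if (seen.has(current)) continue;
    seen.add(current);

    const blockedOn = waitsFor.get(current);
    if (!blockedOn) continue;
    if (blockedOn.has(client)) return true;
    blockedOn.forEach((next) => stack.push(next));
  }
  return false;
}

// Client 2 holds the exclusive lock on the node and is waiting for client 9.
const waitsFor: WaitGraph = new Map<ClientId, Set<ClientId>>([
  [2, new Set<ClientId>([9])],
]);

// Client 9 now asks for the lock held by client 2.
const deadlocked = holdersWaitFor(9, [2], waitsFor);
console.log(deadlocked); // true
```

In this model, a cycle typically forms when two transactions need locks on the same two entities but take them in opposite order: each acquires its first lock, then blocks on the one the other already holds.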

How could I construct queries that are dependent on each other in order to reproduce this? I need to understand this so I can untangle his query and/or add some deadlock handling logic to the server.

Thanks.

duhmojo commented 1 month ago

Ok I can deadlock with this query:

UNWIND range(1, 10) AS i
    MERGE (p:TEST {name: 'Test ' + i, age: i % 100})
    WITH p, i
    MATCH (n:TEST {name: 'Test 1'})
    MERGE (p)-[t:TEST]->(n)-[tt:TEST]->(p)
    WITH p
    SET p.age = toInteger(rand() * 50) + 20;

I can at least write a handler for this case. Any info or advice is welcome.