Open luleigreat opened 2 years ago
Can you reliably reproduce this deadlocked state? If so, can you include directions for doing so? Thanks.
[workers] is set to 9, and exactly 9 jobs were running when the deadlock occurred. As for reproducing it: the server has a 16-core CPU at 3.0 GHz, with workers set to only 9. I think the deadlock occurs because the CPU is far faster than the disk I/O, so many nodes cannot be written to disk immediately.
Hi all, I have found code that can cause a deadlock:
If the condition `mWriteSet.size() >= batchWriteLimitSize` is met, all jobs in the job queue may end up waiting on this condition variable. The last successful call to `m_scheduler.scheduleTask` only adds a job; it cannot ensure that the job executes immediately. If all jobs are blocked (this is the scenario I encountered: every job was waiting either for the `InboundLedger::update` lock or on the `mWriteCondition` condition variable), then `performScheduledTask` can never execute, and the result is a deadlock!
I am working on rippled 1.6, and I have reviewed the related code on rippled:develop; the problem still seems to be there.
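The scenario described above can be modelled with a small standalone sketch. This is not rippled code: `stalledProducers`, the FIFO task queue, and the limit of 4 are hypothetical names and values chosen for illustration, and the timeout exists only so that the demonstration terminates instead of hanging the way an untimed wait would. Producer tasks wait for room in a write set that starts full, and the only task able to make room is queued behind them on the same fixed-size pool.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical model, not rippled code: `workers` threads drain a FIFO of
// tasks. `producers` tasks wait for a full write set to drain; a single flush
// task queued behind them is the only thing that can drain it.
// Returns how many producers timed out while waiting for room.
std::size_t
stalledProducers(std::size_t workers, std::size_t producers,
                 std::chrono::milliseconds timeout)
{
    assert(workers > 0);

    std::size_t const batchWriteLimitSize = 4;       // illustrative limit
    std::size_t writeSetSize = batchWriteLimitSize;  // write set starts full
    std::size_t stalled = 0;
    std::mutex stateMutex;  // guards writeSetSize and stalled
    std::condition_variable writeCondition;

    std::mutex queueMutex;  // guards tasks
    std::deque<std::function<void()>> tasks;

    // Producer tasks: block until the write set has room, or give up.
    for (std::size_t i = 0; i < producers; ++i)
    {
        tasks.push_back([&] {
            std::unique_lock<std::mutex> lock(stateMutex);
            bool const hasRoom = writeCondition.wait_for(lock, timeout, [&] {
                return writeSetSize < batchWriteLimitSize;
            });
            if (!hasRoom)
                ++stalled;
        });
    }

    // Flush task: scheduled successfully, but it only runs once a worker
    // is free to pick it up.
    tasks.push_back([&] {
        std::lock_guard<std::mutex> lock(stateMutex);
        writeSetSize = 0;
        writeCondition.notify_all();
    });

    // Fixed-size pool: each worker pops and runs tasks until none remain.
    std::vector<std::thread> pool;
    for (std::size_t i = 0; i < workers; ++i)
    {
        pool.emplace_back([&] {
            for (;;)
            {
                std::function<void()> task;
                {
                    std::lock_guard<std::mutex> lock(queueMutex);
                    if (tasks.empty())
                        return;
                    task = std::move(tasks.front());
                    tasks.pop_front();
                }
                task();
            }
        });
    }
    for (auto& t : pool)
        t.join();
    return stalled;
}
```

With as many producers as workers, for example `stalledProducers(3, 3, std::chrono::milliseconds(100))`, every worker is occupied by a waiting producer, so the flush task cannot start until a producer gives up. With one spare worker, `stalledProducers(4, 3, std::chrono::milliseconds(5000))`, the flush task gets a thread and the producers are released. Without the timeout, the first case would block forever.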
Has this problem been resolved? I think the use of the condition variable (`mWriteCondition`) could be removed. Is there a better solution?