sofastack / sofa-jraft

A production-grade java implementation of RAFT consensus algorithm.
https://www.sofastack.tech/projects/sofa-jraft/
Apache License 2.0
3.52k stars 1.12k forks source link

Deadlock on configuration application in NodeImpl when disruptors are full #1105

Open alievmirza opened 1 month ago

alievmirza commented 1 month ago

Describe the bug

There is a deadlock in NodeImpl when working with full LogManagerImpl#diskQueue, FSMCallerImpl#taskQueue and NodeImpl#writeLock.

  1. NodeImpl#executeApplyingTasks() takes NodeImpl.writeLock and calls LogManager.appendEntries()
  2. LogManager tries to enqueue a task to diskQueue which is full, hence it blocks until a task gets consumed from diskQueue
  3. diskQueue is consumed by StableClosureEventHandler
  4. StableClosureEventHandler tries to enqueue a task to FSMCallerImpl#taskQueue, which is also full, so this also blocks until a task gets consumed from FSMCallerImpl#taskQueue
  5. FSMCallerImpl#taskQueue is consumed by ApplyTaskHandler
  6. ApplyTaskHandler calls NodeImpl#onConfigurationChangeDone(), which tries to take NodeImpl#writeLock

As a result, there is a deadlock: NodeImpl#writeLock -> LogManager#diskQueue -> FSMCallerImpl#taskQueue -> NodeImpl#writeLock (disruptors are used as blocking queues in JRaft, so, when full, they act like locks).

This was caught by com.alipay.sofa.jraft.core.NodeTest#testNodeTaskOverload which uses extremely short disruptors (2 items max each).

Steps to reproduce

Run com.alipay.sofa.jraft.core.NodeTest#testNodeTaskOverload in a loop several times, for my local machine it is reproducible within 50-100 runs.

Environment

fengjiachun commented 1 month ago

Thank you for your feedback.

Could you provide the test code for reproduction? I couldn't replicate the issue on my machine. So, could please add a related test and send a PR?

alievmirza commented 1 month ago

Oh, sorry, I've sent the wrong name for the base test class (fixed that in the description), the test is com.alipay.sofa.jraft.core.NodeTest#testNodeTaskOverload. To reproduce, just run the test in the loop until failure. For that purposes I've used Edit Run Configuration in Intellij IDEA for the test, Modify options -> Repeat -> Until Failure. Also you need to comment this code in com.alipay.sofa.jraft.core.NodeTest#setupNodeTest, so it will be possible then to run the test several times.

StorageOptionsFactory.registerRocksDBTableFormatConfig(GROUP_ID, RocksDBLogStorage.class, StorageOptionsFactory
            .getDefaultRocksDBTableConfig().setBlockCacheSize(256 * SizeUnit.MB));
alievmirza commented 1 month ago

Also I can provide the thread dump https://gist.github.com/alievmirza/07443fd2ae4e3f7db70fadaa06f4cc1c

fengjiachun commented 1 month ago

Thank you, I have reproduced it on my machine.

fengjiachun commented 1 month ago

The producer thread who handle the com.alipay.sofa.jraft.core.NodeImpl.executeApplyingTasks(NodeImpl.java:1409) method owned the write lock

image

And then the consumer thread try to get the same write lock:

image

Deadlock occurred like this, due to the disruptor overload.

killme2008 commented 1 month ago

Good catch! It's a known bug. Related issues #138 and #720 have been addressed and a fix has been attempted at https://github.com/sofastack/sofa-jraft/pull/764.

It seems that we still have some work to do.

alievmirza commented 3 weeks ago

Do you think that this issue is quite dangerous, or in the real scenarios it is quite rare? Do you have any plans or ideas how to properly fix the issue?

killme2008 commented 3 weeks ago

Do you think that this issue is quite dangerous, or in the real scenarios it is quite rare? Do you have any plans or ideas how to properly fix the issue?

In real scenarios, it is quite uncommon. It occurs only when the node is overloaded, and in such circumstances, manual intervention like adding new nodes or resources may be the only viable solution.

How can it be fixed? I have not considered it carefully. One potential solution is to replace some lock places on Node#writeLock with a tryLock, enabling other tasks to progress.

killme2008 commented 3 weeks ago

I tried a fix at #1109, which makes it fail fast when log manager is overloaded while applying tasks.