csharptest / CSharpTest.Net.Collections

BPlusTree and other collections moved from http://code.google.com/p/csharptest-net
107 stars 48 forks source link

BPlusTree: Deadlock through the Thread Pool #8

Open csharptest opened 9 years ago

csharptest commented 9 years ago

I have encountered some weird deadlock condition: a deadlock through the Thread Pool. This can occur when the maximum number of threads in the pool is limited. The root of the problem is the asynchronous flushing in StorageCache class.

How to reproduce this problem: First of all there should be limited number of threads in ThreadPool. It could be achieved by calling SetMaxThreads in standard ThreadPool or by using custom thread pool implementations (my case).

Suppose we have 10 working threads. All threads are busy. They perform some operations and sometimes call methods on BPlusTree. One of them performs the operation that invokes 'Update' in StorageCache with this line of code:

this._asyncWriteBehind = this._writeBehindFunc.BeginInvoke(null, null);

But at this point we have no free working threads in the pool, so the job is queued.

Then one thread executes TryUpdate on BPlusTree. The writer lock is acquired there. At the same time 9 other threads trying to perform any other operations on BPlusTree and become locked. The update process goes to the final stage where it executes 'Commit' in StorageCache, that in turn executes 'CompleteAsync' with this code:

this._writeBehindFunc.EndInvoke(asyncWriteBehind);

But the delegate is still inside the queue, so this thread is also locked. And we have the situation when all 10 threads are locked.

I see 3 possible solution to this problem:

  1. The thread, that performs 'CompleteAsync' should detect such situation (by checking some flag) and execute the flushing process in synchronous way. This is the best solution.
  2. Increase the number of threads in ThreadPool, which reduce the probability of this situation. But I don't like this way.
  3. Create separate thread inside BPlusTree which will perform flushing process.

I've described only the situation that has been reproduced in my application. There can be other places in the BPlusTree with the same pattern and the same problem.

csharptest commented 9 years ago

Fixed on /dev