amibar / SmartThreadPool

A .NET Thread Pool fully implemented in C# with many features
http://www.codeproject.com/Articles/7933/Smart-Thread-Pool
Microsoft Public License
507 stars 182 forks source link

stp.WaitForIdle() does not work (rare) #32

Open vnayak-mdsol opened 2 years ago

vnayak-mdsol commented 2 years ago

SmartThreadPool v2.2.3

This finding is very rare, I've encountered this problem only a couple of times so far. Therefore, It's very hard to reproduce.

On some occasions the WaitForIdle doesn't wait with callbacks on the queue. It seems like the ManualResetEvent was Set() or the underlying state changed somehow when it shouldn't have been.

To simulate this problem uncomment the call to StpChangeState().

The behavior with StpChangeState() uncommented is consistent with what I'm occasionally encountering.

Any idea what may be causing this? Has this problem been reported before?


         using (var stp = new SmartThreadPool(new STPStartInfo
                       { MinWorkerThreads = 0, MaxWorkerThreads = 2, StartSuspended = false }))
            {
                do
                {
                    stp.QueueWorkItem(JobOne, stp);

                    stp.QueueWorkItem(JobTwo, stp);

                    Thread.Sleep(20);

                    // Uncomment this line to simulate the problem
                    // StpChangeState(stp); 

                    stp.WaitForIdle();

                } while (stp.InUseThreads > 0);
            }

            Console.WriteLine("Enter any key to close");
            Console.ReadKey();
        }

        private static void JobOne(SmartThreadPool obj)
        {
            Thread.Sleep(TimeSpan.FromSeconds(10));
            Console.WriteLine(MethodBase.GetCurrentMethod());
        }

        private static void JobTwo(SmartThreadPool obj)
        {
            Thread.Sleep(TimeSpan.FromSeconds(10));
            Console.WriteLine(MethodBase.GetCurrentMethod());
        }

        private static void StpChangeState(SmartThreadPool stp)
        {
            FieldInfo field = stp.GetType().GetField("_isIdleWaitHandle", BindingFlags.NonPublic | BindingFlags.Instance);
            var value = (ManualResetEvent)field?.GetValue(stp);
            value?.Set();
        }
amibar commented 2 years ago

Hi,

I read your example and I don't understand why the stp.WaitForIdle(); should block at all. The work items run 10 seconds each in parallel, you wait for 20 seconds, and then try to wait for idle when the STP is already idle. Why should it block?

Regards, Ami

vnayak-mdsol commented 2 years ago

Hi Ami,

The work items run 10 seconds each in parallel, you wait for 20 milliseconds, and then try to wait for idle when the STP is already idle. Why should it block?

Thread.Sleep(20); ->> 20 milliseconds timeout on the main thread

10 second timeout on the worker thread

     private static void JobOne(SmartThreadPool obj)
        {
            Thread.Sleep(TimeSpan.FromSeconds(10));   
            Console.WriteLine(MethodBase.GetCurrentMethod());
        }
amibar commented 2 years ago

Hi,

Sorry I read the timeout wrong. Anyway, if I rephrase it, what you say is that the stp.WaitForIdle(); doesn't work. I can hardly believe this, because it is used for so much time.

Maybe the issue is with the stp.InUseThreads > 0, it may happen that the stp.WaitForIdle(); completed, but the stp.InUseThreads > 0 is true. It may happen for a brief period, but it is still possible.

What do you think?

Regards, Ami

vnayak-mdsol commented 2 years ago

Ami, Thanks for your reply.

Running the code with the StpChangeState() uncommented out, you will see that the while loop runs forever as the waitingCallbacks count grows much faster than the processed count and the worker threads simply cannot keep up.

Yes, this is very unusual behavior. I've encountered it only a few times over many 100000s of runs. The problem is very serious however, because the while loop never exits.

Do you think the memory issue described in the article is relevant here? https://www.informit.com/articles/article.aspx?p=1409801&seqNum=3 ?

StartTime {6/11/2022 3:26:39 AM}

DateTime.Now {6/11/2022 3:30:04 AM}

stp dump {Amib.Threading.SmartThreadPool} ActiveThreads: 2 Concurrency: 2 InUseThreads: 2 IsIdle: false IsShuttingdown: false MaxThreads: 2 MinThreads: 0 Name: "SmartThreadPool" PerformanceCountersReader: {Amib.Threading.Internal.NullSTPInstancePerformanceCounters} STPStartInfo: {Amib.Threading.STPStartInfo} WIGStartInfo: {Amib.Threading.STPStartInfo} WaitingCallbacks: 11504 _canceledSmartThreadPool: {Amib.Threading.Internal.CanceledWorkItemsGroup} _currentWorkItemsCount: 11506 _inUseWorkerThreads: 2 _isDisposed: false _isIdleWaitHandle: {System.Threading.ManualResetEvent} _isSuspended: false _localPCs: {Amib.Threading.Internal.NullSTPInstancePerformanceCounters} _name: "SmartThreadPool" _onThreadInitialization: null _onThreadTermination: null _shutdown: false _shuttingDownEvent: {System.Threading.ManualResetEvent} _stpStartInfo: {Amib.Threading.STPStartInfo} _threadCounter: 2 _windowsPCs: {Amib.Threading.Internal.NullSTPInstancePerformanceCounters} _workItemsGroups: {Amib.Threading.Internal.SynchronizedDictionary<Amib.Threading.IWorkItemsGroup, Amib.Threading.IWorkItemsGroup>} _workItemsProcessed: 36 _workItemsQueue: {Amib.Threading.Internal.WorkItemsQueue} _workerThreads: {Amib.Threading.Internal.SynchronizedDictionary<System.Threading.Thread, Amib.Threading.SmartThreadPool.ThreadEntry>}

Thanks for your help.

amibar commented 2 years ago

Hi,

Going to heap corruption is too far fetched, I don't even use unsafe.

I rewrote your code below to make it testable. From what I understand, if I will run Issue32() in a loop for a while it will eventually throw the exception that the WaitForIdle failed? Is this correct ?

void Issue32()
{
    bool doneJobOne = false;
    bool doneJobTwo = false;

    void JobOne(SmartThreadPool s)
    {
        Thread.Sleep(TimeSpan.FromSeconds(10));
        doneJobOne = true;
    }

    void JobTwo(SmartThreadPool s)
    {
        Thread.Sleep(TimeSpan.FromSeconds(10));
        doneJobTwo = true;
    }

    using (var stp = new SmartThreadPool(new STPStartInfo
               { MinWorkerThreads = 0, MaxWorkerThreads = 2, StartSuspended = false }))
    {
        stp.QueueWorkItem(JobOne, stp);

        stp.QueueWorkItem(JobTwo, stp);

        Thread.Sleep(20);

        stp.WaitForIdle();

        if (!doneJobOne || !doneJobTwo)
            throw new Exception("WaitForIdle failed");
    }

    Console.WriteLine("WaitForIdle succeeded");
}

Regards, Ami

vnayak-mdsol commented 2 years ago

Ami, Yes, that's correct. Calling Issue32() would throw an exception.

Hi,

Going to heap corruption is too far fetched, I don't even use unsafe.

I rewrote your code below to make it testable. From what I understand, if I will run Issue32() in a loop for a while it will eventually throw the exception that the WaitForIdle failed? Is this correct ?

void Issue32()
{
    bool doneJobOne = false;
    bool doneJobTwo = false;

    void JobOne(SmartThreadPool s)
    {
        Thread.Sleep(TimeSpan.FromSeconds(10));
        doneJobOne = true;
    }

    void JobTwo(SmartThreadPool s)
    {
        Thread.Sleep(TimeSpan.FromSeconds(10));
        doneJobTwo = true;
    }

    using (var stp = new SmartThreadPool(new STPStartInfo
               { MinWorkerThreads = 0, MaxWorkerThreads = 2, StartSuspended = false }))
    {
        stp.QueueWorkItem(JobOne, stp);

        stp.QueueWorkItem(JobTwo, stp);

        Thread.Sleep(20);

        stp.WaitForIdle();

        if (!doneJobOne || !doneJobTwo)
            throw new Exception("WaitForIdle failed");
    }

    Console.WriteLine("WaitForIdle succeeded");
}

Regards, Ami

amibar commented 2 years ago

Which .net framework do you use?

vnayak-mdsol commented 2 years ago

Which .net framework do you use?

.Net Framework 4.7.1

vnayak-mdsol commented 1 year ago

Ami, Question: Can one instance of smartthreadpool effect the manualresetevent on another instance of the smartthreadpool?

we have 21 instances of the STP in our process at any given time with the top level STP instantiating the other 20 instances. Top level STP has maxthreads count set to 20, and the other 20 STPs have maxthreads count set to 8. The top level STP lifetime is the same as the application(singleton), while the other 20 STPs are created/disposed ~every 30 seconds.