pocoproject / poco

The POCO C++ Libraries are powerful cross-platform C++ libraries for building network- and internet-based applications that run on desktop, server, mobile, IoT, and embedded systems.
https://pocoproject.org

Why does it take the ThreadPool 10+ seconds to shut down when there is nothing running? #2450

Closed ghstahl closed 9 months ago

ghstahl commented 6 years ago

Expected behavior

ThreadPool's destructor should exit right away when there are no runnables left.

Actual behavior

ThreadPool is in a self-imposed 10-second holding pattern, waiting for work that is already gone. The joinAll() call is satisfied immediately, but the ThreadPool destructor doesn't let go until 10 seconds later.

Steps to reproduce the problem

#include "pch.h" // nothing in here
#include <iostream>
#include <chrono>
#include <thread>
#include "Poco/Runnable.h"
#include "Poco/Thread.h"
#include "Poco/ThreadPool.h"

using namespace std;

class Worker : public Poco::Runnable
{
    static bool _shutDown;
public:
    Worker()
    {

    }
    void run()
    {
        thread::id this_id = this_thread::get_id();
        cout << "Start. [Worker: tid:" << this_id << "]" << endl;

        while (!_shutDown)
        {
            std::this_thread::sleep_for(std::chrono::milliseconds(250));
            cout << "Working. [Worker: tid:" << this_id << "]" << endl;

        }
        cout << "Leaving. [Worker: tid:" << this_id << "]" << endl;
    }
    static void Initialize() {
        _shutDown = false;
    }
    static void Shutdown() {
        _shutDown = true;
    }
};
bool Worker::_shutDown = false;
int main()
{
    Worker::Initialize();

    cout << "Hello World!\n";

    Worker worker1; // create worker threads
    Worker worker2;
    Worker::Shutdown();  // workers will come up and then immediately be told to go away

    Poco::ThreadPool::defaultPool().start(worker1);
    Poco::ThreadPool::defaultPool().start(worker2);

    Poco::ThreadPool::defaultPool().joinAll();

    system("pause");
}

H:\github\ghstahl\poco.windows.visualstudio\src\x64\Debug>PocoThreadPoolPlay.exe
Hello World!
Start. [Worker: tid:56164]Start. [Worker: tid:68904]
Leaving. [Worker: tid:56164]

Leaving. [Worker: tid:68904]
Press any key to continue . . .  <  I press any key>  
<waits here for 10 seconds when there is no reason to>

POCO version

Latest

Compiler and version

Microsoft Visual Studio Enterprise 2017 Version 15.8.2 VisualStudio.15.Release/15.8.2+28010.2016 Microsoft .NET Framework Version 4.7.03056

Installed Version: Enterprise

Visual C++ 2017 00369-90250-05476-AA291 Microsoft Visual C++ 2017

ADL Tools Service Provider 1.0 This package contains services used by Data Lake tools

Application Insights Tools for Visual Studio Package 8.13.10627.1 Application Insights Tools for Visual Studio

ASP.NET and Web Tools 2017 15.8.05077.0 ASP.NET and Web Tools 2017

ASP.NET Core Razor Language Services 15.8.31590 Provides languages services for ASP.NET Core Razor.

ASP.NET Web Frameworks and Tools 2017 5.2.60618.0 For additional information, visit https://www.asp.net/

Azure App Service Tools v3.0.0 15.8.05023.0 Azure App Service Tools v3.0.0

Azure Data Lake Node 1.0 This package contains the Data Lake integration nodes for Server Explorer.

Azure Data Lake Tools for Visual Studio 2.3.4000.4 Microsoft Azure Data Lake Tools for Visual Studio

Azure Functions and Web Jobs Tools 15.8.05023.0 Azure Functions and Web Jobs Tools

Azure Stream Analytics Tools for Visual Studio 2.3.4000.4 Microsoft Azure Stream Analytics Tools for Visual Studio

C# Tools 2.9.0-beta8-63208-01 C# components used in the IDE. Depending on your project type and settings, a different version of the compiler may be used.

Common Azure Tools 1.10 Provides common services for use by Azure Mobile Services and Microsoft Azure Tools.

Cookiecutter 15.8.18201.1 Provides tools for finding, instantiating and customizing templates in cookiecutter format.

Fabric.DiagnosticEvents 1.0 Fabric Diagnostic Events

GitHub.VisualStudio 2.5.2.2500 A Visual Studio Extension that brings the GitHub Flow into Visual Studio.

JavaScript Language Service 2.0 JavaScript Language Service

JavaScript Project System 2.0 JavaScript Project System

JetBrains ReSharper Ultimate 2018.1 Build 112.0.20180414.70444 JetBrains ReSharper Ultimate package for Microsoft Visual Studio. For more information about ReSharper Ultimate, visit http://www.jetbrains.com/resharper. Copyright © 2018 JetBrains, Inc.

Microsoft Azure HDInsight Azure Node 2.3.4000.4 HDInsight Node under Azure Node

Microsoft Azure Hive Query Language Service 2.3.4000.4 Language service for Hive query

Microsoft Azure Service Fabric Tools for Visual Studio 2.3 Microsoft Azure Service Fabric Tools for Visual Studio

Microsoft Azure Stream Analytics Language Service 2.3.4000.4 Language service for Azure Stream Analytics

Microsoft Azure Stream Analytics Node 1.0 Azure Stream Analytics Node under Azure Node

Microsoft Azure Tools 2.9 Microsoft Azure Tools for Microsoft Visual Studio 2017 - v2.9.10730.2

Microsoft Continuous Delivery Tools for Visual Studio 0.4 Simplifying the configuration of continuous build integration and continuous build delivery from within the Visual Studio IDE.

Microsoft JVM Debugger 1.0 Provides support for connecting the Visual Studio debugger to JDWP compatible Java Virtual Machines

Microsoft Library Manager 1.0 Install client-side libraries easily to any web project

Microsoft MI-Based Debugger 1.0 Provides support for connecting Visual Studio to MI compatible debuggers

Microsoft Visual C++ Wizards 1.0 Microsoft Visual C++ Wizards

Microsoft Visual Studio Tools for Containers 1.1 Develop, run, validate your ASP.NET Core applications in the target environment. F5 your application directly into a container with debugging, or CTRL + F5 to edit & refresh your app without having to rebuild the container.

Microsoft Visual Studio VC Package 1.0 Microsoft Visual Studio VC Package

MLGen Package Extension 1.0 MLGen Package Visual Studio Extension Detailed Info

NuGet Package Manager 4.6.0 NuGet Package Manager in Visual Studio. For more information about NuGet, visit http://docs.nuget.org/.

ProjectServicesPackage Extension 1.0 ProjectServicesPackage Visual Studio Extension Detailed Info

Python 15.8.18201.1 Provides IntelliSense, projects, templates, debugging, interactive windows, and other support for Python developers.

Python - Django support 15.8.18201.1 Provides templates and integration for the Django web framework.

Python - IronPython support 15.8.18201.1 Provides templates and integration for IronPython-based projects.

Python - Profiling support 15.8.18201.1 Profiling support for Python projects.

R Tools for Visual Studio 1.3.40517.1016 Provides project system, R Interactive window, plotting, and more for the R programming language.

ResourcePackage Extension 1.0 ResourcePackage Visual Studio Extension Detailed Info

ResourcePackage Extension 1.0 ResourcePackage Visual Studio Extension Detailed Info

Snapshot Debugging Extension 1.0 Snapshot Debugging Visual Studio Extension Detailed Info

SQL Server Data Tools 15.1.61808.07020 Microsoft SQL Server Data Tools

Syntax Visualizer 1.0 An extension for visualizing Roslyn SyntaxTrees.

Test Adapter for Boost.Test 1.0 Enables Visual Studio's testing tools with unit tests written for Boost.Test. The use terms and Third Party Notices are available in the extension installation directory.

Test Adapter for Google Test 1.0 Enables Visual Studio's testing tools with unit tests written for Google Test. The use terms and Third Party Notices are available in the extension installation directory.

ToolWindowHostedEditor 1.0 Hosting json editor into a tool window

TypeScript Tools 15.8.20801.2001 TypeScript Tools for Microsoft Visual Studio

Visual Basic Tools 2.9.0-beta8-63208-01 Visual Basic components used in the IDE. Depending on your project type and settings, a different version of the compiler may be used.

Visual C++ for Linux Development 1.0.9.27924 Visual C++ for Linux Development

Visual F# Tools 10.2 for F# 4.5 15.8.0.0. Commit Hash: c55dd2c3d618eb93a8d16e503947342b1fa93556. Microsoft Visual F# Tools 10.2 for F# 4.5

Visual Studio Code Debug Adapter Host Package 1.0 Interop layer for hosting Visual Studio Code debug adapters in Visual Studio

Visual Studio Tools for CMake 1.0 Visual Studio Tools for CMake

Visual Studio Tools for Containers 1.0 Visual Studio Tools for Containers

Visual Studio Tools for Universal Windows Apps 15.0.28010.00 The Visual Studio Tools for Universal Windows apps allow you to build a single universal app experience that can reach every device running Windows 10: phone, tablet, PC, and more. It includes the Microsoft Windows 10 Software Development Kit.

Operating system and version

Windows 10

Other relevant information

x64

ghstahl commented 6 years ago

By contrast, the following exits immediately when I manage the threads myself.

// PocoThreadPlay.cpp : This file contains the 'main' function. Program execution begins and ends there.
//

#include "pch.h"
#include <iostream>
#include <chrono>
#include <thread>
#include "Poco/Runnable.h"
#include "Poco/Thread.h"

using namespace std;

class Worker : public Poco::Runnable
{
    static bool _shutDown;
public:
    Worker()
    {

    }
    void run()
    {
        thread::id this_id = this_thread::get_id();
        cout << "Start. [Worker: tid:" << this_id << "]" << endl;

        while (!_shutDown)
        {
            std::this_thread::sleep_for(std::chrono::milliseconds(250));
            cout << "Working. [Worker: tid:" << this_id << "]" << endl;

        }
        cout << "Leaving. [Worker: tid:" << this_id << "]" << endl;
    }
    static void Initialize() {
        _shutDown = false;
    }
    static void Shutdown() {
        _shutDown = true;
    }
};
bool Worker::_shutDown = false;
int main()
{
    Worker::Initialize();
    std::cout << "Hello World!\n"; 

    Worker worker1; // create worker threads
    Worker worker2;
    Worker::Shutdown();  // workers will come up and then immediately be told to go away

    Poco::Thread thread1;
    thread1.start(worker1);
    Poco::Thread thread2;
    thread2.start(worker2);

    thread1.join();
    thread2.join();

    system("pause");
}

aleks-f commented 6 years ago

I have noticed something similar on Windows (the Foundation tests hang for a while before completing) since we moved the back-end thread implementation on the develop branch (the future 2.0 release) to C++11 std threading, events, etc. I never had time to investigate. The only place I can see where ThreadPool would be doing this is PooledThread::release(), but I can't see why. If you find out the reason, let us know (or, even better, send a pull request).

ghstahl commented 6 years ago

The issue starts here, reached via ~ThreadPool:

void PooledThread::release()
{
    const long JOIN_TIMEOUT = 10000;

    _mutex.lock();
    _pTarget = 0;
    _mutex.unlock();
    // In case of a statically allocated thread pool (such
    // as the default thread pool), Windows may have already
    // terminated the thread before we got here.
    if (_thread.isRunning())
        _targetReady.set();

    if (_thread.tryJoin(JOIN_TIMEOUT))
    {
        delete this;
    }
}

This is one of those things where we need architectural feedback from the authors on why it works this way before we can assert that something needs to be fixed. I can live with an answer like "This ThreadPool is not for you".
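To make the role of JOIN_TIMEOUT concrete, here is a minimal sketch of the same shape using only the standard library. It is not POCO code: ModelWorker, wake(), tryJoin() and releaseDelayMs() are hypothetical names invented for illustration. It only shows that a release-style shutdown returns quickly when the worker actually wakes, and blocks for the full timeout when it does not.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a pooled worker thread; not POCO's PooledThread.
class ModelWorker
{
public:
    ModelWorker() : _thread([this] { run(); }) {}

    ~ModelWorker()
    {
        wake();          // make sure the worker can leave its wait
        _thread.join();
    }

    // Lets the worker leave its wait (plays the role of _targetReady.set()).
    void wake()
    {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _wake = true;
        }
        _cv.notify_all();
    }

    // Timed join: returns true if the worker finished within the timeout.
    bool tryJoin(std::chrono::milliseconds timeout)
    {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(_mutex);
        return _cv.wait_for(lock, timeout, [this] { return _done; });
    }

private:
    void run()
    {
        std::unique_lock<std::mutex> lock(_mutex);
        _cv.wait(lock, [this] { return _wake; });  // idle until woken
        _done = true;
        lock.unlock();
        _cv.notify_all();                          // tell tryJoin() we are finished
    }

    std::mutex _mutex;
    std::condition_variable _cv;
    bool _wake = false;
    bool _done = false;
    std::thread _thread;  // declared last so the other members exist before run() starts
};

// Returns how long a release-style shutdown blocks, in milliseconds.
long long releaseDelayMs(bool signalWorker, std::chrono::milliseconds timeout)
{
    ModelWorker worker;
    const auto begin = std::chrono::steady_clock::now();
    if (signalWorker)
        worker.wake();
    worker.tryJoin(timeout);
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count();
}
```

Calling releaseDelayMs(true, ...) should return almost immediately, while releaseDelayMs(false, ...) should take roughly the whole timeout. That second case matches the symptom reported here: a 10-second delay is what a timed join looks like when the thread it waits for never wakes.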

aleks-f commented 6 years ago

What we have here is a reproducible hang on one platform (I have not noticed it anywhere except on Windows). So it obviously needs to be fixed. If you do not need ThreadPool, that's fine. If you do need it and want to fix it, then find out why it hangs and let us know. Otherwise, it will be fixed when someone has enough time to spend on it.

FWIW, I suspect it has to do with the new event implementation, which uses std::condition_variable. I did some work on improving the new events, but it did not fix the problem and I did not have time to look deeper into it. The contributors who ported threading and events to std have been silent since, so that's where we currently are.
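For readers unfamiliar with how an event can be layered on std::condition_variable, here is a minimal sketch of an auto-reset event. It is an assumption-laden illustration, not POCO's Event implementation: SketchEvent and tryWait() are hypothetical names. The point it shows is that the signaled state must be stored in a flag guarded by the mutex, because a notification sent while nobody is waiting is otherwise lost. Whether anything like this is involved in the hang discussed here is not established.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical auto-reset event; a sketch, not POCO's Event class.
class SketchEvent
{
public:
    void set()
    {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _signaled = true;  // remembered even if nobody is waiting yet
        }
        _cond.notify_one();
    }

    // Waits up to `timeout`; returns true if the event was signaled.
    bool tryWait(std::chrono::milliseconds timeout)
    {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(_mutex);
        const bool ok = _cond.wait_for(lock, timeout, [this] { return _signaled; });
        if (ok)
            _signaled = false;  // auto-reset: one set() releases one waiter
        return ok;
    }

private:
    std::mutex _mutex;
    std::condition_variable _cond;
    bool _signaled = false;
};
```

With this design, a set() that happens before the wait is still observed by the next tryWait(), and a second tryWait() without another set() times out.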

ghstahl commented 6 years ago

@aleks-f I am zeroing in on where the issue is.

The root of the problem is that PooledThread enters a wait, as shown here:

void PooledThread::run()
{
    _started.set();
    for (;;)
    {
        _targetReady.wait();
        _mutex.lock();
        if (_pTarget) // a NULL target means kill yourself
        {
            .....
        }
        else
        {
            _mutex.unlock();
            break;
        }
    }
}

This happens after my initial runnables are gone and _pTarget is NULL.

During shutdown, the following gets called for each PooledThread

void PooledThread::release()
{
    const long JOIN_TIMEOUT = 10000;

    _mutex.lock();
    _pTarget = 0;
    _mutex.unlock();
    // In case of a statically allocated thread pool (such
    // as the default thread pool), Windows may have already
    // terminated the thread before we got here.
    if (_thread.isRunning())
        _targetReady.set();

    if (_thread.tryJoin(JOIN_TIMEOUT))
    {
        delete this;
    }
}

Here _targetReady.set() is called, and the expectation is that PooledThread::run() then wakes up and exits. That isn't happening, even though release() is evidently meant to signal a graceful shutdown. The call

_targetReady.wait();

never returns.

I need to dust off my mutex knowledge as to why this is not getting signaled. Jump in anytime ;)

aleks-f commented 6 years ago

Probably nothing to do with mutexes. I'd suspect that perhaps the thread is not running when release() is called, _targetReady.set() is not called, and _thread.tryJoin(JOIN_TIMEOUT) actually hangs until the timeout expires.

I'm just guessing, though - best to step through it in the debugger and see what's really happening.

ghstahl commented 6 years ago

_targetReady.set() is getting called, because the thread is running. The PooledThread is alive and well.

if (_thread.isRunning())
        _targetReady.set();

if (_thread.isRunning()) evaluates to true.

_thread.tryJoin(JOIN_TIMEOUT) is acting like you describe because _targetReady.set() was called but had no effect on the PooledThread shutting down.

Sorry, I meant dust off my event knowledge.

H

ghstahl commented 6 years ago

More clues. I changed how ThreadPoolSingletonHolder is created. Instead of a static object, it is now allocated on the heap and has to be explicitly deleted, which I do on the very last line of my main():

namespace
{
    static ThreadPoolSingletonHolder* pSh;
}

ThreadPool& ThreadPool::defaultPool(ThreadAffinityPolicy affinityPolicy)
{
    if(pSh == nullptr)
    {
        pSh = new ThreadPoolSingletonHolder();
    }
    return *(pSh->pool(affinityPolicy));
}
void ThreadPool::destructDefaultPool()
{
    delete pSh;
}
int main()
{
    Worker::Initialize();

    cout << "Hello World!\n";

    Worker worker1; // create worker threads
    Worker::Shutdown();  // workers will come up and then immediately be told to go away

    Poco::ThreadPool::defaultPool().start(worker1);

    Poco::ThreadPool::defaultPool().joinAll();

    system("pause");
    Poco::ThreadPool::destructDefaultPool();
}

Now it works as expected.

More experimentation... Let's try a std::unique_ptr:

namespace
{
    static std::unique_ptr<ThreadPoolSingletonHolder> sh;
}

ThreadPool& ThreadPool::defaultPool(ThreadAffinityPolicy affinityPolicy)
{
    if(sh == nullptr)
    {
        sh = std::make_unique<ThreadPoolSingletonHolder>();
    }
    ThreadPool* pTp = sh->pool(affinityPolicy);

    return *(pTp);
}

This doesn't work, even though the unique_ptr's destructor fired and deleted the contained ThreadPoolSingletonHolder pointer. There is something about tear-down order happening here.

More Experimentation. Scope & Stack. I created a personal copy of ThreadPoolSingletonHolder and called it MyThreadPoolSingletonHolder.

bool Worker::_shutDown = false;
int main()
{
    cout << "Hello World!\n";
    Worker::Initialize();
    Worker worker1; // create worker threads
    Worker::Shutdown();  // workers will come up and then immediately be told to go away
    {// scope
        MyThreadPoolSingletonHolder myThreadPoolSingletonHolder;
        Poco::ThreadPool* pool = myThreadPoolSingletonHolder.pool(Poco::ThreadPool::TAP_DEFAULT);

        pool->start(worker1);

        pool->joinAll();
        system("pause");
    }
}

This works and the shutdown happens right away.

This is where my knowledge of Windows x64 "what the hell is going on during the app being torn down" is hitting a wall.

ghstahl commented 6 years ago

@aleks-f If what I am finding is what I think it is, then I could make the following assertions:

  1. Your approach to the ThreadPool is sound; however, app tear-down differs between operating systems.
  2. ThreadPool usage should be designed so that it does not depend on app tear-down: shut everything down while the app is still in its pre-tear-down phase.

I would expose the objects so that they can be disposed of prior to app tear-down, i.e. require me to scope them as I have done. If C++ had a try/catch/finally, I would destroy the pool there.
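C++ has no finally, but a destructor that runs at scope exit gives the same guarantee. The sketch below illustrates that idea only; ScopeExit and runWithScopedTeardown() are hypothetical names, and the "pool destroyed" step stands in for whatever call would actually release the pool. It does not use POCO.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: runs a callback when the enclosing scope ends,
// whether it ends normally or because an exception propagates.
class ScopeExit
{
public:
    explicit ScopeExit(std::function<void()> fn) : _fn(std::move(fn))
    {
        assert(_fn != nullptr);
    }
    ~ScopeExit() { _fn(); }  // the callback must not throw
    ScopeExit(const ScopeExit&) = delete;
    ScopeExit& operator=(const ScopeExit&) = delete;

private:
    std::function<void()> _fn;
};

// Records the order of events so the tear-down point is visible.
std::vector<std::string> runWithScopedTeardown()
{
    std::vector<std::string> log;
    {
        ScopeExit guard([&log] { log.push_back("pool destroyed"); });
        log.push_back("work done");
    }  // guard fires here, before the function (or main) returns
    log.push_back("after scope");
    return log;
}
```

The returned log shows the teardown running between the work and the code after the scope, which is the ordering the scoped MyThreadPoolSingletonHolder experiment relies on.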

I don't know how far this goes, as POCO uses ThreadPool for other stuff.

ghstahl commented 6 years ago

As is, anything that uses the default threadpool will have a shutdown problem on the Windows x64 builds I am testing.
I have a working example of TaskManager that properly shuts down.
As well as a working example of ThreadPool that properly shuts down.

Task manager that goes away on shutdown.

1. Had to make a personal copy of ThreadPoolSingletonHolder called MyThreadPoolSingletonHolder.
2. Made sure it was scoped so that it destructed before the app tear-down phase forced the destruction.

Good thing that TaskManager takes a reference to a ThreadPool as a ctor argument.

MyThreadPoolSingletonHolder myThreadPoolSingletonHolder;
Poco::ThreadPool* pool = myThreadPoolSingletonHolder.pool(Poco::ThreadPool::TAP_DEFAULT);

Poco::TaskManager tm(*pool);

ghstahl commented 6 years ago

So the theory that events don't get signaled at tear-down is false. (The theory came from the events not seeming to wake up the PooledThread when the ThreadPool tells each PooledThread to release.) See the Proof Project.

The reason for the 10-second hang is that the PooledThread is waiting on an event, which gets set, but the PooledThread doesn't wake. I don't yet know why it works in my little POC but not in the ThreadPool context. I don't believe this has to do with the condition-variable work you mentioned. Here is the Poco.ThreadPool.Hang project that shows the hang.

I will happily take any suggestions to hunt this down.

github-actions[bot] commented 2 years ago

This issue is stale because it has been open for 365 days with no activity.

github-actions[bot] commented 11 months ago

This issue is stale because it has been open for 365 days with no activity.

matejk commented 11 months ago

Is this solved with #4311, perhaps?

aleks-f commented 10 months ago

@matejk possible, but no time to check now, I'll mark this for 1.13.1

matejk commented 9 months ago

Reported problem was fixed with one of the changes for release 1.13. Added test case for this situation.