microsoft / cppwinrt

C++/WinRT
MIT License
1.64k stars 236 forks source link

(maybe) Bug: coroutines leak if never resumed from thread pool suspension #1333

Closed Dramcryx closed 1 year ago

Dramcryx commented 1 year ago

Version

v2.0.201201.7

Summary

While studing how coroutines are implemented in WinRT, I was not sure about a situation when you suspend coroutine and never resume again. The std::coroutine_handle<>::destroy() is controlled by ref counter in promise_type, that means if a coroutine is suspended without being rescheduled or explicitly destroyed, it results in memory leak. What is the best pratctice to avoid these leaks if a coroutine just get lost somewhere in the heap like in this example:

Reproducible example

#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>

#include <thread>

#include <cassert>

#include <atlbase.h>

#include <winrt/windows.foundation.h>
#include <winrt/windows.system.h>

#include <DispatcherQueue.h>

using namespace winrt::Windows::Foundation;
using namespace winrt::Windows::System;

using namespace std::chrono_literals;

// CreateDispatcherQueueController
#pragma comment(lib, "CoreMessaging.lib")

IAsyncAction BrokenAsync()
{
    co_await winrt::resume_background();
    co_await std::suspend_always{}; // if you comment this out, the leak output is empty
}

int main()
{
    CComPtr<ABI::Windows::System::IDispatcherQueueController> mainThreadDispatcher;
    DispatcherQueueOptions options;
    options.dwSize = sizeof(DispatcherQueueOptions);
    options.threadType = DQTYPE_THREAD_CURRENT;
    assert(S_OK == CreateDispatcherQueueController(options, &mainThreadDispatcher.p));

    IDispatcherQueueController winrtController{ mainThreadDispatcher.Detach(), winrt::take_ownership_from_abi };

    auto q = winrtController.DispatcherQueue();
    DWORD thisThreadId = GetCurrentThreadId();

    std::thread t{ [=, &q] {
        BrokenAsync();
        q.TryEnqueue([=] {
            PostThreadMessage(thisThreadId, WM_QUIT, 0, 0);
            }
        );
    } };

    MSG message;
    while (GetMessage(&message, NULL, 0, 0)) {
        TranslateMessage(&message);
        DispatchMessage(&message);
    }

    t.join();

    _CrtDumpMemoryLeaks(); // prints out some leaks

    return 0;
}

Expected behavior

Coroutine frame beign destroyed by some call

Actual behavior

Detected memory leaks!
Dumping objects ->
{157} normal block at 0x00000234BF9948E0, 400 bytes long.
 Data: < *XL     (XL    > E0 2A 58 4C F6 7F 00 00 80 28 58 4C F6 7F 00 00 
Object dump complete.

Additional comments

No response

kennykerr commented 1 year ago

C++/WinRT coroutines are reference-counted so if you don't hold on to a reference then the last reference will be released by the coroutine frame itself upon completion. C++/WinRT coroutines are also hot-start meaning they don't need an outside scheduler to execute.

As your sample touches a plethora of technologies, I suggest you post the question on https://stackoverflow.com/ for more insights.

Dramcryx commented 1 year ago

Okay, let's reduce the example:

IAsyncAction BrokenAsync()
{
    co_await winrt::resume_background();
}

int main()
{
    BrokenAsync();

    _CrtDumpMemoryLeaks();

    return 0;
}

This also leaks because the task handle to the pool is removed w/o calling destroy(). No interruptions. It still leaks because Win32 thread pool removes only the handle itself, but does not call destroy in the end but the app is about ot quit: image Moreover, generic app should not take care for default threadpool to release its' resources manually

AlexBAV commented 1 year ago

Check this one:

IAsyncAction BrokenAsync()
{
    co_await winrt::resume_background();
}

int main()
{
    DWORD thisThreadId = GetCurrentThreadId();

        {
           BrokenAsync();
        } // force destructor to be called before _CrtDumpMemoryLeaks call

    _CrtDumpMemoryLeaks();

    return 0;
}

You are basically calling _CrtDumpMemoryLeaks before the coroutine is destoyed

Dramcryx commented 1 year ago

You are right, I have incorrectly reduced it. Interruption is needed:

IAsyncAction BrokenAsync()
{
    co_await winrt::resume_background();
    co_await std::suspend_always{};
}

int main()
{
    {
        BrokenAsync();
    }

    _CrtDumpMemoryLeaks();

    return 0;
}

Here we stop it and it gets lost

kennykerr commented 1 year ago

Your example still races with completion. You need to coordinate that yourself with something like Async().get().

Dramcryx commented 1 year ago

In my last comment if you suspend it in the end, you'll never reach to the end of get()

Dramcryx commented 1 year ago

My overall point is next. Consider situation where you sent out the IAsyncAction from ui thread to pool using resume_foreground and then immediately quit the application before that action was resumed on threadpool. Since on pool WinRT sends only coroutine_handle, no reference to rhe promise is tracked on the threadpool itself. It results in a leak, because unresumed coroutine still references promise inside frame (i.e. frame still references itself).

kennykerr commented 1 year ago

suspend_always is not meant to be waited on directly. It is meant as a building block for coroutine implementations. The problem is that it never causes any action to resume. So again, your example is faulty rather than the coroutine implementation inside C++/WinRT.

Anyway, the fact that a thread is busy while you quit the application is not unique to coroutines. There is a myriad of ways you can cause that to happen.

Dramcryx commented 1 year ago

But when it comes to classic pplx tasks w/o await, there is no resource problem because everything is referenced within the object

sylveon commented 1 year ago

If you make a classic pplx task hang waiting for something that never comes, it'll leak too.

tim-weis commented 1 year ago

This appears to be merely a concrete instance of the Halting Problem. It is not specific to generalized subroutines (aka "coroutines"), but applies to "regular" subroutines just as well.

Curiously, you are not the first to discover a fundamental truth in computability theory via coroutines.

Dramcryx commented 1 year ago

@tim-weis I don't think it would be an issue if they used a custom thread pool queue that stores objects rather then C structs in Win32 callback. It would allow to destroy frame as a part of callback object.