google / marl

A hybrid thread / fiber task scheduler written in C++ 11
Apache License 2.0
1.84k stars 191 forks

[Question] Waiting for multiple events to be signaled to protect a critical section #244

Open Kojox opened 1 year ago

Kojox commented 1 year ago

In my project I have multiple resources (meaning POD) which should not be modified concurrently. So each task should wait until a specific set of resources it needs is available, lock them all at once, perform some work and unlock them again.

With only one resource this is easy to do, since I can just wait for a marl::Event and defer the event's signal until the task has finished modifying the resource.

Now with multiple resources there is a danger of running into a deadlock if two tasks require the same resources and each has already acquired the locks for a subset of them, instead of one task getting all of its resources at once. Example: Task A and Task B both need Resource A and Resource B. Task A gets Resource A -> Task B gets Resource B -> Task A wants Resource B and Task B wants Resource A.
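For comparison, the C++11 standard library addresses this same hazard with std::lock, which acquires several lockables as one step using a deadlock-avoidance algorithm. The following is a minimal sketch, assuming the resources could be guarded by std::mutex rather than marl::Event. Resource, modifyBoth and runDemo are hypothetical names, and std::mutex blocks the calling thread instead of yielding the fiber, so this illustrates the idea and is not a drop-in marl solution:

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for one of the shared resources.
struct Resource {
    std::mutex mutex;
    int value = 0;
};

// Locks both resources as a single step. std::lock uses a
// deadlock-avoidance algorithm, so the argument order does not matter.
void modifyBoth(Resource& first, Resource& second) {
    std::unique_lock<std::mutex> lockFirst(first.mutex, std::defer_lock);
    std::unique_lock<std::mutex> lockSecond(second.mutex, std::defer_lock);
    std::lock(lockFirst, lockSecond);
    ++first.value;
    ++second.value;
}  // both locks are released here

// Two threads request the same two resources in opposite order,
// which is the pattern that deadlocks with naive one-by-one locking.
int runDemo(int iterations) {
    Resource a;
    Resource b;
    std::thread taskA([&] {
        for (int i = 0; i < iterations; ++i) modifyBoth(a, b);
    });
    std::thread taskB([&] {
        for (int i = 0; i < iterations; ++i) modifyBoth(b, a);
    });
    taskA.join();
    taskB.join();
    assert(a.value == b.value);  // every update touched both resources
    return a.value;
}
```

Calling runDemo(1000) runs both tasks to completion and returns the shared counter value, even though the two tasks name the resources in opposite order.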

At first I was looking for something like marl::Event::all, similar to marl::Event::any, but there doesn't seem to be such a thing. Instead I used marl::Event::test to try each resource and, if a test returned false, released all previously acquired resources again by calling marl::Event::signal on them.

At first this seemed to work, but strangely the frame time gets worse and worse after a while. This happens only when the code below is included. In this case two such tasks are scheduled every frame, and as work they just count through a for loop (not even accessing the resources). The only difference that results in the ever-increasing frame time is whether they try to acquire all resources as shown below. Using the Tracy profiler, it looks like the time between the tasks being executed by the scheduler increases.

I wonder if there is a better way to do this with marl or if there are still some errors with this code.

Here is the code I currently use (just rewritten for a fixed subset of two resources):

auto task = [=] {
    // Release both resources when the task returns.
    defer(velocityResourceAvailable.signal());
    defer(positionResourceAvailable.signal());
    {
        bool allResourcesAcquired = false;
        std::vector<marl::Event> requiredResources = { positionResourceAvailable, velocityResourceAvailable };

        while (!allResourcesAcquired)
        {
            // Wait until at least one of the required resources is signaled.
            marl::Event::any(requiredResources.begin(), requiredResources.end()).wait();
            if (positionResourceAvailable.test())
            {
                if (velocityResourceAvailable.test())
                {
                    allResourcesAcquired = true;
                }
                else
                {
                    // Could not get both: give the position resource back and retry.
                    positionResourceAvailable.signal();
                }
            }
        }

        // do work ...
    }
};

Kojox commented 1 year ago

Ok, it turned out that the repeated construction of marl::Event::any caused the increasing frame time. Pulling the construction out of the task solved it.

But I would still like some feedback on whether there is a better solution to this problem.

ben-clayton commented 1 year ago

Hi @Kojox,

Sorry for the slow reply. I've been busy with other things.

I don't know if this would work for you, but maybe a marl::WaitGroup could be used?

If resources can be taken at any point by another task, then you may need a new overload of WaitGroup::wait() that calls a callback function with the mutex held - otherwise between the time the function returns and the resources are used, one of the resources could have been taken again.

This solution does require a separate WaitGroup per set of resources. I don't know if that makes the approach unviable for you.

Cheers, Ben
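The idea of checking and claiming the resources while a mutex is held, raised in the reply above, can be sketched with the standard library alone. This is not marl's API and it does not use marl::WaitGroup: it swaps in a mutex-guarded bitmask plus a std::condition_variable, and ResourceSet, kPosition and kVelocity are hypothetical names. Note that std::condition_variable blocks the worker thread. In a marl program, marl's own mutex and condition variable types would probably be the better fit, but that substitution is an untested assumption here.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical helper: tracks which resources are taken, one bit each.
class ResourceSet {
public:
    // Blocks until every resource in `wanted` is free, then claims them
    // all while still holding the mutex, so no other task can take one
    // between the check and the claim.
    void acquire(std::uint32_t wanted) {
        std::unique_lock<std::mutex> lock(mutex_);
        released_.wait(lock, [&] { return (taken_ & wanted) == 0; });
        taken_ |= wanted;
    }

    // Non-blocking variant: claims all of `wanted` or nothing at all.
    bool tryAcquire(std::uint32_t wanted) {
        std::lock_guard<std::mutex> lock(mutex_);
        if ((taken_ & wanted) != 0) return false;
        taken_ |= wanted;
        return true;
    }

    // Returns the resources and wakes all waiters so they can re-check.
    void release(std::uint32_t held) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert((taken_ & held) == held);  // only release what is held
            taken_ &= ~held;
        }
        released_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable released_;
    std::uint32_t taken_ = 0;
};

// Hypothetical resource ids named after the events in the question.
constexpr std::uint32_t kPosition = 1u << 0;
constexpr std::uint32_t kVelocity = 1u << 1;
```

A task would call acquire(kPosition | kVelocity) before its work and release with the same mask afterwards. Because a task never holds only part of its set while waiting for the rest, the circular wait from the Task A / Task B example cannot form. One shared ResourceSet covers every combination of resources, at the cost of waking all waiters on each release.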