dotnet / dotNext

Next generation API for .NET
https://dotnet.github.io/dotNext/
MIT License
1.6k stars 119 forks source link

Unexpected behavior when using AtomicBoolean as a class-level field or in a collection #118

Closed viktorrr closed 2 years ago

viktorrr commented 2 years ago

Hello. I ran into a strange/unexpected problem and was hoping to get some guidance whether this is a library bug, async/struct-related limitation, or a bug on my side.

I have a class with 2 fields - AsyncExclusiveLock and the other one is an AtomicBoolean. Those two things represent a lock whose job is to make sure that (1) operations are synchronized and (2) once the lock is acquired, they should execute a given action only if the flag is not set.

I've written a minimal code that reproduces the problem - it appears that storing an AtomicBoolean as a dictionary value or using it as a class-level field produces very unexpected behavior. This is what's demonstrated in "AsyncTest()" - this is how I originally discovered the problem and at first thought the problem is somehow related to async/await + structs (AtomicBoolean). Then, while writing this bug report, I also wrote "NonAsyncTest2()" which shows that the problem is not async/await related. For completeness I also wrote "NonAsyncTest1()" which indicates that AtomicBoolean can not be used in a dictionary value. My guess is that in both cases the problem is the same, it's just a different manifestation.

private static void NonAsyncTest1()
{
    Console.WriteLine("Running NonAsyncTest1()");

    var entityLocks = new ConcurrentDictionary<string, AtomicBoolean>();

    var firstLockAcquisition = entityLocks.GetOrAdd("x", new AtomicBoolean(false));
    if (firstLockAcquisition.FalseToTrue())
    {
        Console.WriteLine("First lock acquisition set false -> true (expected)");
        Console.WriteLine($"firstLockAcquisition = {firstLockAcquisition.Value}");
    }

    var secondLockAcquisition = entityLocks.GetOrAdd("x", new AtomicBoolean(false));
    Console.WriteLine($"secondLockAcquisition = {secondLockAcquisition.Value} (before trying to set false->true)");
    if (secondLockAcquisition.FalseToTrue())
    {
        Console.WriteLine("Second lock acquisition set false -> true (not expected)");
        Console.WriteLine($"firstLockAcquisition = {firstLockAcquisition.Value}");
        Console.WriteLine($"secondLockAcquisition = {secondLockAcquisition.Value}");
    }
    else
    {
        Console.WriteLine("Second lock acquisition could not 'false->true' because the flag (expected); skipping workflow");
        Console.WriteLine($"firstLockAcquisition = {firstLockAcquisition.Value}");
        Console.WriteLine($"secondLockAcquisition = {secondLockAcquisition.Value}");
    }

    Console.WriteLine("---");
}

private static void NonAsyncTest2()
{
    Console.WriteLine("Running NonAsyncTest2()");

    var entityLocks = new ConcurrentDictionary<string, RemoteEntitySyncRoot>();

    var firstLockAcquisition = entityLocks.GetOrAdd("x", new RemoteEntitySyncRoot());
    if (firstLockAcquisition.CancelledOrClosed.FalseToTrue())
    {
        Console.WriteLine("First lock acquisition set false -> true (expected)");
        Console.WriteLine($"firstLockAcquisition = {firstLockAcquisition.CancelledOrClosed}");
    }

    var secondLockAcquisition = entityLocks.GetOrAdd("x", new RemoteEntitySyncRoot());
    Console.WriteLine($"secondLockAcquisition = {secondLockAcquisition.CancelledOrClosed} (before trying to set false->true)");
    if (secondLockAcquisition.CancelledOrClosed.FalseToTrue())
    {
        Console.WriteLine("Second lock acquisition set false -> true (not expected)");
        Console.WriteLine($"firstLockAcquisition = {firstLockAcquisition.CancelledOrClosed}");
        Console.WriteLine($"firstLockAcquisition = {secondLockAcquisition.CancelledOrClosed}");
    }
    else
    {
        Console.WriteLine("Second lock acquisition could not 'false->true' because the flag (expected); skipping workflow");
        Console.WriteLine($"firstLockAcquisition = {firstLockAcquisition.CancelledOrClosed}");
        Console.WriteLine($"firstLockAcquisition = {secondLockAcquisition.CancelledOrClosed}");
    }

    Console.WriteLine("---");
}

private static async Task AsyncTest()
{
    Console.WriteLine("Running AsyncTest()");

    var entityLocks = new ConcurrentDictionary<string, RemoteEntitySyncRoot>();

    var firstLockAcquisition = entityLocks.GetOrAdd("x", new RemoteEntitySyncRoot());

    await firstLockAcquisition.Lock.AcquireAsync();
    try
    {
        if (firstLockAcquisition.CancelledOrClosed.FalseToTrue())
        {
            Console.WriteLine("Successfully set false -> true (expected)");
            await Task.Delay(100);
            Console.WriteLine($"firstLockAcquisition = {firstLockAcquisition.CancelledOrClosed}");
        }
    }
    finally
    {
        firstLockAcquisition.Lock.Release();
    }

    var secondLockAcquisition = entityLocks.GetOrAdd("x", new RemoteEntitySyncRoot());
    Console.WriteLine($"secondLockAcquisition = {secondLockAcquisition.CancelledOrClosed} (before trying to set false->true)");

    await secondLockAcquisition.Lock.AcquireAsync();
    try
    {
        if (secondLockAcquisition.CancelledOrClosed.FalseToTrue())
        {
            Console.WriteLine("Successfully set false -> true (not expected)");
            Console.WriteLine($"firstLockAcquisition = {firstLockAcquisition.CancelledOrClosed}");
            Console.WriteLine($"firstLockAcquisition = {secondLockAcquisition.CancelledOrClosed}");
        }
        else
        {
            Console.WriteLine("Failed set false -> true (expected); skipping workflow");
            Console.WriteLine($"firstLockAcquisition = {firstLockAcquisition.CancelledOrClosed}");
            Console.WriteLine($"firstLockAcquisition = {secondLockAcquisition.CancelledOrClosed}");
        }
    }
    finally
    {
        secondLockAcquisition.Lock.Release();
    }

    Console.WriteLine("---");
}

Running these 3 methods produces:

Running NonAsyncTest1()
First lock acquisition set false -> true (expected)
firstLockAcquisition = True
secondLockAcquisition = False (before trying to set false->true)
Second lock acquisition set false -> true (not expected)
firstLockAcquisition = True
secondLockAcquisition = True
---
Running NonAsyncTest2()
First lock acquisition set false -> true (expected)
firstLockAcquisition = False
secondLockAcquisition = False (before trying to set false->true)
Second lock acquisition set false -> true (not expected)
firstLockAcquisition = False
firstLockAcquisition = False
---
Running AsyncTest()
Successfully set false -> true (expected)
firstLockAcquisition = False
secondLockAcquisition = False (before trying to set false->true)
Successfully set false -> true (not expected)
firstLockAcquisition = False
firstLockAcquisition = False
---

This brings the following question: is my intended usage (RemoteEntitySyncRoot) "illegal"? Are my assumptions wrong? Note: I've used 4.6.0 while testing.

sakno commented 2 years ago

Hi @viktorrr . There is a bug in your code. The following line of code is incorrect

var firstLockAcquisition = entityLocks.GetOrAdd("x", new AtomicBoolean(false));

It produces a copy of AtomicBoolean from the dictionary because AtomicBoolean is a value type, not reference type (class). Thus, you making changes in a copy stored in firstLockAcquisition variable on the stack, but the original value stored in the dictionary remains unchanged.

Regular dictionary has special method GetValueRefOrAddDefault to obtain a reference to a stored value (byref). Concurrent dictionary has no such a luxury.

The only way is to store boxed version AtomicBoolean value type inside of concurrent dictionary. You can use existing container from .NET or BoxedValue<T> from DotNext.

viktorrr commented 2 years ago

Thanks @sakno, much appreciated! I figured it was somehow struct-related, but I wasn't able to paint the whole picture in my head. Your explanation makes sense for NonAsyncTest1(). I suppose that this explains why AsyncTest() is also failing. I switched from public AtomicBoolean CancelledOrClosed { get; } = new(false); to public BoxedValue<AtomicBoolean> CancelledOrClosed { get; } = BoxedValue<AtomicBoolean>.Box(new AtomicBoolean(false)); and verified that the problem does go away.

Do you think it'd be a good idea to have this behavior documented for Atomic* structs? I understand that it is my fault for not knowing/realizing this - after all it isn't related to DotNext.Threading, but more of a general .NET concept - but having a hint about this problem in the documentation would be really appreciated. I guess you can imagine what was my intendend usage and how I fell into this trap.

Thank you for your time.

sakno commented 2 years ago

As you mentioned, this is intended behavior in .NET applicable to all structs. In this case, I have to place a hint to all structs in the library.

viktorrr commented 2 years ago

Ok, I see - not really practical, indeed. Have a nice day. :)