dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.55k stars 4.54k forks source link

[API Proposal]: Additional setting for AsyncLocal Value to cascade upwards #99153

Open mrpmorris opened 4 months ago

mrpmorris commented 4 months ago

Background and motivation

This is based on my own misunderstanding of AsyncLocal<T>.

I was expecting AsyncLocal<T> to be a single piece of state that is passed across all classes/methods operating on the same async scope (e.g. a web request all the way down and back up again). This assumption was based on how ThreadLocal<T> works - I assumed AsyncLocal<T> was an equivalent for async code.

I was using this to add a collection of events in my domain classes that propagate upwards to my web request handler to be recorded in the database. I have now come to understand that the value is cascaded down, but not upwards.

So if something down the call chain is the first to access my ThreadLocal<List<Func<object>>> then the value will never make its way back up to the initiator so it can receive the values added and store them in the database.

A workaround is to ensure the TheadLocal<T>.Value is accessed early on in the request - but as this is a library this is not something I can ensure, and a default null value means "No data was added" so I can't guard against null.

Example of incorrect assumption

SomeClass.Sequence = 'A';
await SetSecondValueAsync();
Console.WriteLine($"Step 5={SomeClass.Sequence}");

static async Task SetSecondValueAsync()
{
    Console.WriteLine($"Step 1={SomeClass.Sequence}");
    SomeClass.Sequence = 'B';
    await SetThirdValueAsync();
    Console.WriteLine($"Step 4={SomeClass.Sequence}");
}

static Task SetThirdValueAsync()
{
    Console.WriteLine($"Step 2={SomeClass.Sequence}");
    SomeClass.Sequence = 'C';
    Console.WriteLine($"Step 3={SomeClass.Sequence}");
    return Task.CompletedTask;
}

public static class SomeClass
{
#if UseAsyncLocal
    private static readonly AsyncLocal<char> SequenceHolder = new();
#else
    private static readonly ThreadLocal<char> SequenceHolder = new();
#endif
    public static char Sequence
    {
        get => SequenceHolder.Value; set => SequenceHolder.Value = value;
    }
}

ThreadLocal<T> output

AsyncLocal<T> output

API Proposal

public class AsyncLocal<T>
{
  public AsyncLocal(bool propagateUpwards = false)
  {
    ....
  }
  public AsyncLocal(Action<AsyncLocalValueChangedArgs<T>>? valueChangedHandler, bool propagateUpwards = false)
  {
    ...
  }
  // other code

API Usage

// My static DomainEvents class, using `propagateUpwards = `true`
public static class DomainEvents
{
  private static readonly AsyncLocal<List<Func<object>> Holder = new(propagateUpwards: true);

  public static void Add(Func<object> factory)
  {
    if (Holder.Value is null)
      Holder.Value = new List<Func<object>>();
    Holder.Value.Add(factory);
  }

  public static void GetEventAndClear(out Func<object>[] eventFactories)
  {
    eventFactories = Holder.Value?.ToArray() ?? [];
    Holder.Value.Clear();
  }
}

// Some code executed from a web request
public async Task EntryPoint()
{
  await HandleAsync();
  DomainEvents.GetAndClear(out Func<object>[] eventFactories);
  DbContext.DomainEvents.AddRange(eventFactories.Select(factory => factory.Invoke());
  await DbContext.SaveChangesAsync();
}

private async Task HandleAsync()
{
  Customer customer = await DbContext.Customers.FirstAsync();
  customer.Deactivate();
}

// Code in the Customer entity
public void Deactivate()
{
  Customer.Status = CustomerStatus.Deactivated
  DomainEvents.Add(() => new CustomerDeactivatedEvent());
}

Alternative Designs

Another possibility would be to create an alternative class named something like AsyncStatic<T> that works in the same way as ThreadStatic<T> but across async calls.

Risks

No response

ghost commented 4 months ago

Tagging subscribers to this area: @mangod9 See info in area-owners.md if you want to be subscribed.

Issue Details
### Background and motivation This is based on my own misunderstanding of `AsyncLocal`. I was expecting `AsyncLocal` to be a single piece of state that is passed across all classes/methods operating on the same async scope (e.g. a web request all the way down and back up again). This assumption was based on how `ThreadLocal` works - I assumed `AsyncLocal` was an equivalent for `async` code. I was using this to add a collection of events in my domain classes that propagate upwards to my web request handler to be recorded in the database. I have now come to understand that the value is cascaded down, but not upwards. So if something down the call chain is the first to access my `ThreadLocal>>` then the value will never make its way back up to the initiator so it can receive the values added and store them in the database. A workaround is to ensure the `TheadLocal.Value` is accessed early on in the request - but as this is a library this is not something I can ensure, and a default `null` value means "No data was added" so I can't guard against `null`. ### API Proposal ```csharp public class AsyncLocal { public AsyncLocal(bool propagateUpwards = false) { .... } public AsyncLocal(Action>? valueChangedHandler, bool propagateUpwards = false) { ... } // other code ``` ### API Usage ```csharp // My static DomainEvents class, using `propagateUpwards = `true` public static class DomainEvents { private static readonly AsyncLocal> Holder = new(propagateUpwards: true); public static void Add(Func factory) { if (Holder.Value is null) Holder.Value = new List>(); Holder.Value.Add(factory); } public static void GetEventAndClear(out Func[] eventFactories) { eventFactories = Holder.Value?.ToArray() ?? []; Holder.Value.Clear(); } } // Some code executed from a web request public async Task EntryPoint() { await HandleAsync(); DomainEvents.GetAndClear(out Func[] eventFactories); DbContext.DomainEvents.AddRange(eventFactories.Select(factory => factory.Invoke()); await DbContext.SaveChangesAsync(); } private async Task HandleAsync() { Customer customer = await DbContext.Customers.FirstAsync(); customer.Deactivate(); } // Code in the Customer entity public void Deactivate() { Customer.Status = CustomerStatus.Deactivated DomainEvents.Add(() => new CustomerDeactivatedEvent()); } ``` ### Alternative Designs Another possibility would be to create an alternative class named something like `BiDirectionalAsyncLocal`. ### Risks _No response_
Author: mrpmorris
Assignees: -
Labels: `api-suggestion`, `area-System.Threading`
Milestone: -
ghost commented 4 months ago

Tagging subscribers to this area: @dotnet/area-system-threading-tasks See info in area-owners.md if you want to be subscribed.

Issue Details
### Background and motivation This is based on my own misunderstanding of `AsyncLocal`. I was expecting `AsyncLocal` to be a single piece of state that is passed across all classes/methods operating on the same async scope (e.g. a web request all the way down and back up again). This assumption was based on how `ThreadLocal` works - I assumed `AsyncLocal` was an equivalent for `async` code. I was using this to add a collection of events in my domain classes that propagate upwards to my web request handler to be recorded in the database. I have now come to understand that the value is cascaded down, but not upwards. So if something down the call chain is the first to access my `ThreadLocal>>` then the value will never make its way back up to the initiator so it can receive the values added and store them in the database. A workaround is to ensure the `TheadLocal.Value` is accessed early on in the request - but as this is a library this is not something I can ensure, and a default `null` value means "No data was added" so I can't guard against `null`. #### Example of incorrect assumption ```csharp SomeClass.Sequence = 'A'; await SetSecondValueAsync(); Console.WriteLine($"Step 5={SomeClass.Sequence}"); static async Task SetSecondValueAsync() { Console.WriteLine($"Step 1={SomeClass.Sequence}"); SomeClass.Sequence = 'B'; await SetThirdValueAsync(); Console.WriteLine($"Step 4={SomeClass.Sequence}"); } static Task SetThirdValueAsync() { Console.WriteLine($"Step 2={SomeClass.Sequence}"); SomeClass.Sequence = 'C'; Console.WriteLine($"Step 3={SomeClass.Sequence}"); return Task.CompletedTask; } public static class SomeClass { #if UseAsyncLocal private static readonly AsyncLocal SequenceHolder = new(); #else private static readonly ThreadLocal SequenceHolder = new(); #endif public static char Sequence { get => SequenceHolder.Value; set => SequenceHolder.Value = value; } } ``` `ThreadLocal` output - Step 1=A - Step 2=B - Step 3=C - Step 4=C - Step 5=C `AsyncLocal` output - Step 1=A - Step 2=B - Step 3=C - Step 4=C - Step 5=A ### API Proposal ```csharp public class AsyncLocal { public AsyncLocal(bool propagateUpwards = false) { .... } public AsyncLocal(Action>? valueChangedHandler, bool propagateUpwards = false) { ... } // other code ``` ### API Usage ```csharp // My static DomainEvents class, using `propagateUpwards = `true` public static class DomainEvents { private static readonly AsyncLocal> Holder = new(propagateUpwards: true); public static void Add(Func factory) { if (Holder.Value is null) Holder.Value = new List>(); Holder.Value.Add(factory); } public static void GetEventAndClear(out Func[] eventFactories) { eventFactories = Holder.Value?.ToArray() ?? []; Holder.Value.Clear(); } } // Some code executed from a web request public async Task EntryPoint() { await HandleAsync(); DomainEvents.GetAndClear(out Func[] eventFactories); DbContext.DomainEvents.AddRange(eventFactories.Select(factory => factory.Invoke()); await DbContext.SaveChangesAsync(); } private async Task HandleAsync() { Customer customer = await DbContext.Customers.FirstAsync(); customer.Deactivate(); } // Code in the Customer entity public void Deactivate() { Customer.Status = CustomerStatus.Deactivated DomainEvents.Add(() => new CustomerDeactivatedEvent()); } ``` ### Alternative Designs Another possibility would be to create an alternative class named something like `BiDirectionalAsyncLocal`. ### Risks _No response_
Author: mrpmorris
Assignees: -
Labels: `api-suggestion`, `area-System.Threading`, `area-System.Threading.Tasks`, `untriaged`
Milestone: -
KalleOlaviNiemitalo commented 4 months ago

@mrpmorris perhaps Microsoft.Extensions.AsyncState and Microsoft.AspNetCore.AsyncState are more suitable for your scenario. These let you propagate values out of async functions, with storage either set up explicitly or tied to the incoming HTTP request.

KalleOlaviNiemitalo commented 4 months ago

Re bool propagateUpwards, it's not clear how it should work in this kind of code

class C {
    AsyncLocal<int> al = new AsyncLocal<int>(propagateUpwards: true);
    AsyncLocal<int> al2 = new AsyncLocal<int>(propagateUpwards: true);

    async Task<int> Inner1() {
        al.Value = 10;
        await Task.Yield();
        al.Value = 11;
        return 12;
    }

    async Task<int> Inner2() {
        al.Value = 20;
        await Task.Yield();
        al.Value = 21;
        return 22;
    }

    async Task Outer() {
        al.Value = 1;
        al2.Value = 101;
        Task<int> t1 = Inner1();
        Console.WriteLine(al.Value); // 1 or 10?

        al.Value = 2;
        Task<int> t2 = Inner2();
        Console.WriteLine(al.Value); // 2 or 20?

        al.Value = 3;
        await Task.WhenAll(t1, t2);
        Console.WriteLine(al.Value); // 3 or 10 or 20?

        al.Value = 4;
        al2.Value = 104;
        await t1;
        Console.WriteLine(al.Value); // 4 or 10?
        Console.WriteLine(al2.Value); // 104 as set above, or did it return to 101?

        al.Value = 5;
        _ = t1.Result;
        Console.WriteLine(al.Value); // 5 or 10?

        al.Value = 6;
        await t1; // second time
        Console.WriteLine(al.Value); // 6 or 10?
    }
}

When should AsyncLocal\<T>.Value propagate to the caller:

  • Immediately when set? Then how does it differ from plain T — surely there must be some boundary through which it does not propagate. Like from one HTTP request to another.
  • Only in await? Then it could even be controllable with ConfigureAwaitOptions.
    • Only in the first await of that Task, or also in subsequent awaits?
  • Also in Task\<T>.Result? I don't think users would expect such a side effect.
  • Also through Task.WhenAll, ContinueWith, and similar?

If the Task returned by the async function needs to carry the modified values, then that could make async functions slower in general, even when this feature is not used. This cost could perhaps be avoided if the upwards propagation were only supported in some new UpPropagatingTask\<T> type, similar to how ValueTask\<T> is an alternative to Task\<T>. But then it would not work through current ASP.NET Core which does not use such a type.

mrpmorris commented 4 months ago

@KalleOlaviNiemitalo My goal was to not have to inject dependencies, so the packages wouldn't give me what I want.

Re your example, the idea is that it is a way of sharing state within async/await chains in the same way ThreadLocal shares within a single thread.

weltkante commented 4 months ago

the idea is that it is a way of sharing state within async/await chains in the same way ThreadLocal shares within a single thread

where would you make the "split" condition, i.e. which code would get its own state? since all code spawns off the programs main method everything would share the same state?

if you say code run on new Thread get its own state, does that include threadpool threads? do threadpool threads get new state for every workload being run on the threadpool? What about Task.Run which is used both to start threadpool workloads and to move existing workloads to the threadpool?

what is with async workloads switching threads across their lifetime? if I await async code on a different thread (having its own AsyncLocal state) would that "merge" the states? that is the "real" issue with the example code posted above. you just have to assume the different workloads come from different states (however you define differing states to be generated)

I have run into your very same problem myself because WPF takes a snapshot of AsyncLocal very early in the startup and never merges state back, so some events will always execute with that startup state. I've thought a lot about how it could be resolved, as you can see from above questions there is not really a single answer that fits everyone, its relatively easy to construct a relevant example where any fixed splitting/merging strategy fails.

I've come to the conclusion, if you want shared state, then put that shared state into the AsyncLocal (and in case of WPF, early in the startup sequence). Make a mutable object and let it propagate through the AsyncLocal infrastructure, but do not change the AsyncLocal (except when you actually want to split off state!). Just mutate your "current" mutable objects state in the AsyncLocal and everyone sharing it will see the changes (you will have to consider thread safety on this shared state).

KalleOlaviNiemitalo commented 4 months ago

WPF takes a snapshot of AsyncLocal very early in the startup and never merges state back, so some events will always execute with that startup state.

I had a related problem with a library that creates threads in unmanaged code and then calls managed-code callbacks on them. .NET Runtime does not know where those threads came from, so they don't inherit any AsyncLocal\<T>.Value, such as those used by NUnit or Microsoft.Extensions.Logging. This can be worked around by using ExecutionContext.Capture during initialisation and then ExecutionContext.CreateCopy and ExecutionContext.Run within each callback.

Just mutate your "current" mutable objects state in the AsyncLocal and everyone sharing it will see the changes (you will have to consider thread safety on this shared state).

This is the approach taken in Microsoft.Extensions.AsyncState. That package however is less thread-safe than one would expect when porting from AsyncLocal\<T>; https://github.com/dotnet/extensions/issues/4623.

Having AsyncLocal\<T>.Value refer to a mutable instance also lets you clear out the reference fields of that instance when you finish with the async operation. If a library started any long-running task or async function that inherited the ExecutionContext, then that will prevent the garbage collector from freeing the AsyncLocal\<T>.Value, even if the library will never read that; but if you clear the fields, then at least this instance won't keep any other objects alive and consuming memory unnecessarily.

Duranom commented 2 months ago

Think something similar is stated in the in the async2 (meeting)

Though @mrpmorris is the issue you are running into solvable with a work around with a ThreadLocal and AsyncLocal working together and act sort of like the HttpContextAccessor? It a sort of thing used several times in some personal needed a similar behavior for a bit due to a library till could move away from such ambient states.

/// <remarks>
/// Not threading write safe
/// </remarks>
public sealed class State<T>
where T : class
{
    private static AsyncLocal<ContextHolder> _asyncLocal = new(vca => {
        // Don't think you need to check ThreadContextChanged as always should have ContextHolder after fist value
        _threadLocal.Value ??= vca.CurrentValue ?? vca.PreviousValue;
    });

    private static ThreadLocal<ContextHolder> _threadLocal = new();

    public T Value
    {
        get => _threadLocal.Value?.Context;
        set
        {
            if(_asyncLocal.Value is null)
            {
                _asyncLocal.Value = new ContextHolder{ Context = null };
            }
            else
            {
                _asyncLocal.Value = new ContextHolder{ Context = value };
            }
        }
    }

    private class ContextHolder
    {
        public T Context {get; set;}
    }
}