axuno / SmartFormat

A lightweight text templating library written in C# which can be a drop-in replacement for string.Format
Other
1.09k stars 105 forks source link

Concurrency issues with ReflectionSource #210

Closed ghost closed 2 years ago

ghost commented 3 years ago

Smart.Format version: 3.0.0.0

Framework version: net5.0 (Stride3D)

Source code example: Don't have a minimal repro yet, but my repo is pretty consistent. Just run it normally and it should throw consistently.

Current Result: I get an InvalidOperationException because of concurrent access of a non-concurrent collection

Expected result: The collection access is fine and works properly

Exception details: Inner exception

System.InvalidOperationException: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.
   at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
   at System.Collections.Generic.Dictionary`2.set_Item(TKey key, TValue value)
   at SmartFormat.Extensions.ReflectionSource.TryEvaluateSelector(ISelectorInfo selectorInfo)
   at SmartFormat.SmartFormatter.InvokeSourceExtensions(FormattingInfo formattingInfo)
   at SmartFormat.SmartFormatter.EvaluateSelectors(FormattingInfo formattingInfo)
   at SmartFormat.SmartFormatter.Format(FormattingInfo formattingInfo)

Outer exception

SmartFormat.Core.Formatting.FormattingException: Error parsing format string: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct. at 21
BackgroundTask "{Name}" Id {Id} ({Progress:p0})
---------------------^
   at SmartFormat.SmartFormatter.FormatError(FormatItem errorItem, Exception innerException, Int32 startIndex, IFormattingInfo formattingInfo)
   at SmartFormat.SmartFormatter.Format(FormattingInfo formattingInfo)
   at SmartFormat.SmartFormatter.Format(FormatDetails formatDetails, Object current)
   at SmartFormat.SmartFormatter.Format(IFormatProvider provider, Format format, IList`1 args)
   at SmartFormat.SmartFormatter.Format(Format format, Object[] args)
   at StrideSaber.Diagnostics.BackgroundTask.ToString(Format format) in C:\Users\Rowan\Documents\Projects\Stride3D\StrideSaber\StrideSaber\Diagnostics\BackgroundTask.cs:line 300
   at StrideSaber.Diagnostics.BackgroundTask.ToString() in C:\Users\Rowan\Documents\Projects\Stride3D\StrideSaber\StrideSaber\Diagnostics\BackgroundTask.cs:line 286
   at Serilog.Capturing.PropertyValueConverter.CreatePropertyValue(Object value, Destructuring destructuring, Int32 depth)
   at Serilog.Capturing.PropertyValueConverter.CreatePropertyValue(Object value, Destructuring destructuring)

My throwing code:

return Smart.Default.Format(format, this); //format = Smart.Default.Parser.ParseFormat("BackgroundTask \"{Name}\" Id {Id} ({Progress:p0})");

My Assumptions: I believe it's the access of ReflectionSource._typeCache in TryEvaluateSelector(ISelectorInfo selectorInfo) as it's the only dictionary access I can find in the call chain. What if you just switch to a ConcurrentDicionary instead?

ghost commented 3 years ago

How to reproduce: I don't know if this will work as I'm pulling this code out of context sorry

BackgroundTask.cs

using ConcurrentCollections;
using JetBrains.Annotations;
using Serilog;
using SmartFormat;
using SmartFormat.Core.Parsing;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;
using static StrideSaber.Diagnostics.BackgroundTaskEvent;

namespace StrideSaber.Diagnostics
{
    //TODO: IDisposable (or async)
    /// <summary>
    /// A task-like type that can be used to create tasks whose progress can be tracked and displayed to the user
    /// </summary>
    public sealed class BackgroundTask
    {
        /// <summary>
        /// What event types messages should be logged for
        /// </summary>
        public static BackgroundTaskEvent EnabledLogEvents { get; set; } = Error | Success | Created | Disposed | ProgressUpdated;

        /// <summary>
        /// An object that can be locked upon for global (static) synchronisation
        /// </summary>
        private static readonly object GlobalLock = new();

        /// <summary>
        /// A <see cref="IReadOnlyCollection{T}">collection</see> that encompasses all the currently running background task instances
        /// </summary>
        /// <remarks>This value should be considered mutable and may change while being accessed, so should not be accessed directly</remarks>
        private static readonly ConcurrentHashSet<BackgroundTask> Instances = new();

        /// <summary>
        /// A <see cref="IReadOnlyCollection{T}">collection</see> that encompasses all the currently running background task instances
        /// </summary>
        /// <remarks>
        /// This collection is guaranteed to not be mutated internally
        /// </remarks>
        public static IReadOnlyCollection<BackgroundTask> CloneAllInstances()
        {
            lock (GlobalLock)
            {
                return Instances.ToArray();
            }
        }

        /// <inheritdoc cref="Instances"/>
        public static IReadOnlyCollection<BackgroundTask> UnsafeInstances
        {
            get
            {
                lock (GlobalLock)
                {
                    return Instances;
                }
            }
        }

        /// <summary>
        /// Gets the id of this instance
        /// </summary>
        public Guid Id { get; init; }

        private float progress;

        /// <summary>
        /// How far the task has progressed
        /// </summary>
        /// <remarks> 0 means 'just started', as no progress has been made, and 1 means 'complete', as all the operations have been executed. Values outside of this range should </remarks>
        [ValueRange(0, 1)]
        public float Progress
        {
            get
            {
                //Nifty way to return the progress while clamping it at the same time
                return progress = progress switch
                {
                        < 0 => 0,
                        > 1 => 1,
                        _   => progress
                };
            }
            private set
            {
                //Due to some issues with floating point approximation, I've decided to ignore throwing and just internally clamp
                value = Math.Clamp(value, 0, 1);
                // if (value is < 0 or >1)
                // throw new ArgumentOutOfRangeException(nameof(value), value, "The given value must be in the range of [0..1] (inclusive)");
                progress = value;
            }
        }

        /// <summary>
        /// The name of this <see cref="BackgroundTask"/> instance
        /// </summary>
        public string Name { get; }

        /// <summary>
        /// Constructs a new <see cref="BackgroundTask"/>, with a specified <paramref name="name"/>
        /// </summary>
        /// <param name="name">The name of this <see cref="BackgroundTask"/></param>
        /// <param name="taskDelegate">The <see cref="BackgroundTaskDelegate"/> function to be executed</param>
        public BackgroundTask(string name, BackgroundTaskDelegate taskDelegate)
        {
            Name = name;
            //awaiter = new BackgroundTaskAwaiter(this);
            AddThis();
            TaskDelegate = taskDelegate;
            Task = Task.Run(TaskRunInternal);
            Id = GetNextId();
            RaiseEvent(Created);
        }

        private static Guid GetNextId()
        {
            lock (TaskCounterBytes) //Thread safety
            {
                //Loop over the bytes, in the order we need to increment them for the GUID to look nice
                foreach (int byteIndex in GuidByteOrder)
                {
                    //The byte that we're going to increment
                    ref byte b = ref TaskCounterBytes[byteIndex];
                    //If the byte isn't max (255), then we can safely increase it without overflowing
                    if (b != byte.MaxValue)
                    {
                        b++;   //Increment it
                        break; //And break out of the loop (so we don't modify any more bytes)
                    }

                    //else //Byte is max (255)
                    {
                        b = 0; //Set the byte to 0
                        //continue; //And move on to the next byte (try increment it next loop)
                    }
                }

                //Bytes are incremented nicely, return them as a GUID
                return new Guid(TaskCounterBytes);
            }
        }

        /// <summary>
        /// The order that bytes in a <see cref="Guid"/> are read from an array
        /// </summary>
        private static readonly int[] GuidByteOrder = { 15, 14, 13, 12, 11, 10, 9, 8, 6, 7, 4, 5, 0, 1, 2, 3 };

        /// <summary>
        /// 
        /// </summary>
        private static readonly byte[] TaskCounterBytes = new byte[16];

        private async Task TaskRunInternal()
        {
            UpdateThisInstanceProgress(0);
            try
            {
                //Call the user task delegate
                await TaskDelegate(
                        //Essentially, evey time the user calls updateProgress (the parameter)
                        //We update our property
                        UpdateThisInstanceProgress
                );
                RaiseEvent(Success);
            }
            catch (Exception e)
            {
                //Set the exception that was thrown
                Task = Task.FromException(e);
                RaiseEvent(Error);
            }
            finally
            {
                UpdateThisInstanceProgress(1);
                //Now mark as disposed because the user work has completed
                Dispose();
            }
        }

        // ReSharper disable once InconsistentNaming
        //I purposefully want the name to be wrong so I don't get confused
        private void UpdateThisInstanceProgress(float _progress)
        {
            if (isDisposed) throw new ObjectDisposedException(ToString());
            RaiseEvent(ProgressUpdated);
            Progress = _progress;
        }

        /// <summary>
        /// The <see cref="System.Threading.Tasks.Task"/> that is associated with the current instance
        /// </summary>
        public Task Task { get; private set; }

        /// <summary>
        /// The <see cref="BackgroundTaskDelegate"/> that this instance is running
        /// </summary>
        public BackgroundTaskDelegate TaskDelegate { get; private set; }

        /// <summary>
        /// Returns the awaiter for this instance
        /// </summary>
        [PublicAPI]
        public TaskAwaiter GetAwaiter()
        {
            return Task.GetAwaiter();
        }

        // /// <summary>
        // /// The cached awaiter for this instance
        // /// </summary>
        // private readonly BackgroundTaskAwaiter awaiter;

        private bool isDisposed = false;

        private void Dispose()
        {
            if (isDisposed) return;
            isDisposed = true;
            RaiseEvent(Disposed);
            RemoveThis();
            Task = null!;
        }

    #region Helper

        private void RemoveThis()
        {
            lock (GlobalLock)
            {
                Instances.TryRemove(this);
            }
        }

        private void AddThis()
        {
            lock (GlobalLock)
            {
                Instances.Add(this);
            }
        }

        private void RaiseEvent(BackgroundTaskEvent evt)
        {
            //Also need to check if the flag is `none` because otherwise the second `if` will not return when we want it to
            //If the flag is not enabled, do nothing
            if ((EnabledLogEvents & evt) == 0) return;
            switch (evt)
            {
                case Created:
                    Log.Verbose("{Task} created", this);
                    break;
                case Disposed:
                    Log.Verbose("{Task} disposed", this);
                    break;
                case Error:
                    Log.Warning(Task.Exception, "{Task} threw exception", this);
                    break;
                case Success:
                    Log.Verbose("{Task} completed successfully", this);
                    break;
                case ProgressUpdated:
                    Log.Verbose("{Task} progress update", this);
                    break;
                case None:
                    break;
                default:
                    throw new ArgumentOutOfRangeException(nameof(evt), evt, null);
            }
        }

    #endregion

    #region ToString()

        /// <summary>
        /// A cached <see cref="Format"/> for default <see cref="ToString()"/> behaviour
        /// </summary>
        private static readonly Format DefaultToStringFormat = Smart.Default.Parser.ParseFormat("BackgroundTask \"{Name}\" Id {Id} ({Progress:p0})");

        /// <inheritdoc />
        public override string ToString()
        {
            return ToString(DefaultToStringFormat);
        }

        /// <inheritdoc cref="ToString()"/>
        /// <remarks>Uses <see cref="SmartFormat"/> format strings</remarks>
        public string ToString(string format)
        {
            return ToString(Smart.Default.Parser.ParseFormat(format));
        }

        /// <inheritdoc cref="ToString()"/>
        /// <remarks>Uses <see cref="SmartFormat"/> format strings</remarks>
        public string ToString(Format format)
        {
            return Smart.Default.Format(format, this);
        }

    #endregion
    }

    // /// <summary>
    // ReSharper disable CommentTypo
    // /// An awaiter for the <see cref="BackgroundTask"/> type, allowing use of the <see langword="await"/> keyword
    // /// </summary>
    // public readonly struct BackgroundTaskAwaiter : INotifyCompletion
    // {
    //  private readonly BackgroundTask instance;
    //
    //  /// <summary>
    //  /// Constructs a new <see cref="BackgroundTaskAwaiter"/> for the <see cref="BackgroundTask"/> <paramref name="instance"/>
    //  /// </summary>
    //  /// <param name="instance">The <see cref="BackgroundTask"/> to create the awaiter for</param>
    //  public BackgroundTaskAwaiter(BackgroundTask instance)
    //  {
    //      this.instance = instance;
    //  }
    //
    //  /// <inheritdoc />
    //  public void OnCompleted(Action continuation)
    //  {
    //      Task.Run(continuation);
    //  }
    //
    //  /// <summary>
    //  /// Gets the result for this awaitable instance
    //  /// </summary>
    //  /// <remarks>Does nothing under the hood (empty method body)</remarks>
    //  [PublicAPI]
    //  #pragma warning disable CA1822
    //  public void GetResult()
    //  {
    //  }
    //  #pragma warning restore CA1822
    //
    //  // ReSharper disable once CompareOfFloatsByEqualityOperator
    //  /// <summary>
    //  /// Gets whether the <see cref="BackgroundTask"/> has completed
    //  /// </summary>
    //  [PublicAPI]
    //  public bool IsCompleted => instance.Progress == 1f;
    // }
    // ReSharper restore CommentTypo
}

Task Creator

                private static Random r = new();

        [SuppressMessage("ReSharper", "All")]
        private static async Task AsyncTaskCreator()
        {
            _ = new BackgroundTask("Fps", FpsTask) { Id = Guid.Empty };
            int i = 0;
            while (true)
            {
                int delay = r.Next(1000, 7000);
                await Task.Delay(delay);
                if (BackgroundTask.UnsafeInstances.Count < 10)
                    _ = new BackgroundTask($"Test task {++i}", AsyncTaskTest);
            }
        }

        [SuppressMessage("ReSharper", "All")]
        private static async Task FpsTask(Action<float> updateProgress)
        {
            while (true)
            {
                updateProgress(StrideSaberApp.CurrentGame.UpdateTime.FramePerSecond / 100);
                await Task.Delay(1);
            }
        }

        [SuppressMessage("ReSharper", "All")]
        private static async Task AsyncTaskTest(Action<float> updateProgress)
        {
            DateTime start = DateTime.Now;
            DateTime end = start + TimeSpan.FromMilliseconds(r.Next(0, 15000));
            while (DateTime.Now < end)
            {
                await Task.Delay(1);
                updateProgress((float)((DateTime.Now - start) / (end - start)));
            }

            updateProgress(1);
        }

Info I just call AsyncTaskCreator() on a background thread, and it spins up tasks on it's own. Because I have logging enabled, messages are logged each time something happens (like updating progress), which after a while will call BackgroundTask.ToString(), calling Smart.Format() internally. When debugging, the exception seems to be thrown at SmartFormatter.cs line 413.

Sidenotes: I did manage to make this error seemingly vanish by locking inside my ToString() method (BackgroundTask.cs:298)

axunonb commented 2 years ago

Thanks - will have a look later this week.

axunonb commented 2 years ago

@karljj1: Have you ever come across Smart.Format thread safety issues? Is locking at https://github.com/EternalClickbait/StrideSaber/blob/0eccacd8d5ad60d176bf7e9f0bec36ff8025b6b6/StrideSaber/Diagnostics/BackgroundTask.cs#L298 the recommended fix, as thread safety currently cannot be guaranteed?

karljj1 commented 2 years ago

Ah we do everything on the main thread. I actually removed some of the thread stuff to improve performance on our version. Yes you will likely need some lock or concurrent dictionary.

ghost commented 2 years ago

I have a feeling that by simply changing _typeCache from a normal Dictionary to a ConcurrentDictionary should probably do the trick: Line to change

//OLD:
        private readonly Dictionary<(Type, string?), (FieldInfo? field, MethodInfo? method)> _typeCache = new();

//NEW:
        private readonly ConcurrentDictionary<(Type, string?), (FieldInfo? field, MethodInfo? method)> _typeCache = new();

Resulting ReflectionSource.cs:

//
// Copyright (C) axuno gGmbH, Scott Rippey, Bernhard Millauer and other contributors.
// Licensed under the MIT license.
//

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using SmartFormat.Core.Extensions;
using System.Collections.Concurrent;

namespace SmartFormat.Extensions
{
    /// <summary>
    /// Class to evaluate any <see cref="object"/> using <see cref="System.Reflection"/>.
    /// A type cache is used in order to reduce reflection calls.
    /// Include this source, if any of these types shall be used.
    /// </summary>
    public class ReflectionSource : Source
    {
        private static readonly object[] Empty = Array.Empty<object>();

        private readonly ConcurrentDictionary<(Type, string?), (FieldInfo? field, MethodInfo? method)> _typeCache = new();

        /// <inheritdoc />
        public override bool TryEvaluateSelector(ISelectorInfo selectorInfo)
        {
            const BindingFlags bindingFlags = BindingFlags.Instance | BindingFlags.Static | BindingFlags.Public;
            var current = selectorInfo.CurrentValue;

            if (current is null && HasNullableOperator(selectorInfo))
            {
                selectorInfo.Result = null;
                return true;
            }

            // strings are processed by StringSource
            if (current is null or string) return false; 

            var selector = selectorInfo.SelectorText;

            var sourceType = current.GetType();

            // Check the type cache
            if (_typeCache.TryGetValue((sourceType, selector), out var found))
            {
                if (found.field != null)
                {
                    selectorInfo.Result = found.field.GetValue(current);
                    return true;
                }

                if (found.method != null)
                {
                    selectorInfo.Result = found.method.Invoke(current, Empty);
                    return true;
                }

                return false;
            }

            // Important:
            // GetMembers (opposite to GetMember!) returns all members, 
            // both those defined by the type represented by the current T:System.Type object 
            // AS WELL AS those inherited from its base types.
            var mn = sourceType.GetMembers(bindingFlags).Select(m => m.Name);
            var members = sourceType.GetMembers(bindingFlags).Where(m =>
                string.Equals(m.Name, selector, selectorInfo.FormatDetails.Settings.GetCaseSensitivityComparison()));
            foreach (var member in members)
                switch (member.MemberType)
                {
                    case MemberTypes.Field:
                        //  Selector is a Field; retrieve the value:
                        var field = (FieldInfo) member;
                        selectorInfo.Result = field.GetValue(current);
                        _typeCache[(sourceType, selector)] = (field, null);
                        return true;
                    case MemberTypes.Property:
                    case MemberTypes.Method:
                        MethodInfo? method;
                        if (member.MemberType == MemberTypes.Property)
                        {
                            //  Selector is a Property
                            var prop = (PropertyInfo) member;
                            //  Make sure the property is not WriteOnly:
                            if (prop != null && prop.CanRead)
                                method = prop.GetGetMethod();
                            else
                                continue;
                        }
                        else
                        {
                            //  Selector is a method
                            method = (MethodInfo) member;
                        }

                        //  Check that this method is valid -- it needs to return a value and has to be parameter-less:
                        //  We are only looking for a parameter-less Function/Property:
                        if (method?.GetParameters().Length > 0) continue;

                        //  Make sure that this method is not void! It has to be a Function!
                        if (method?.ReturnType == typeof(void)) continue;

                        // Add to cache
                        _typeCache[(sourceType, selector)] = (null, method);

                        //  Retrieve the Selectors/ParseFormat value:
                        selectorInfo.Result = method?.Invoke(current, Array.Empty<object>());
                        return true;
                }

            // We also cache failures so we don't need to call GetMembers again
            _typeCache[(sourceType, selector)] = (null, null);

            return false;
        }
    }
}
axunonb commented 2 years ago

@EternalClickbait I agree the change would be simple, and it could indeed fix your concurrency issue (although I didn't test it yet). ConcurrentDictionary still has a performance penalty of 20-30% when writing compared to Dictionary under net5.0. Assuming that one thread or synchronizing access is the regular use case, wouldn't it be the better solution if the caller took care of locking, like you already did?

ghost commented 2 years ago

@EternalClickbait I agree the change would be simple, and it could indeed fix your concurrency issue (although I didn't test it yet). ConcurrentDictionary still has a performance penalty of 20-30% when writing compared to Dictionary under net5.0. Assuming that one thread or synchronizing access is the regular use case, wouldn't it be the better solution if the caller took care of locking, like you already did?

I would agree that if it's the regular use case, but in reality I think that multithreaded code is the norm for most devs now (not to mention how a lot of C# and DotNet methods are built to support async). A concurrent dictionary would be more performant than simply locking as you said, but it's not very obvious or intuitive. Really, all I'm doing is slowing down my code so that your dictionary doesn't get written to multiple times before it has a chance to properly store the values. I would reccomend that you make the change, or at least give some option, such as a different class or some switch. Also, perhaps you should put something in the readme about it being thread-safe (or not), and add some async unit tests.

axunonb commented 2 years ago

Fully agree with your comments about async in general. My point with Smart.Format was, whether one SmartFormatter instance should work, when its methods are invoked concurrently from different threads. What would be the advantage, compared to using different SmartFormatter instances (see below)?

Considerations so far:

  1. I tried to get the code in your repo running, but it failed because 2 dependency projects were missing. A working sample would be helpful.
  2. Performance tests for Dictionary vs. ConcurrentDictionary show, that the impact of a change is very low, given a low number of write operations. That means, it could be implemented, but still would not mean guaranteed thread safety of SmartFormatter instances.
  3. In v3.0.0-alpha.3 there is no way to disable the cache. The next version will have a bool IsCacheEnabled property, which is true by default. No cache - no concurrency issues - as good as no performance impact for the sample code above.
  4. The static Smart.Default, which is used in your repo, creates a SmartFormatter instance with all extensions added. Although this is simple and intuitive, a more tailored instance would be more lightweight, e.g.
    var smart = new SmartFormatter(new SmartSettings{ CaseSensitivity = CaseSensitivityType.CaseSensitive });
    // add what is actually required
    smart.AddExtensions(new ReflectionSource(), new DefaultSource());
    smart.AddExtensions(new DefaultFormatter());
  5. Creating such a tailored SmartFormatter in the BackgroundTask.ToString(Format format) method would also remove any concurrency issues at low costs (even lower, when getting from your ObjectPool).

Makes sense?

Changes in ReflectionSource (PR [#217]) will go into next alpha version.

ghost commented 2 years ago

Ignore this, didn't read your PR and comment properly

  1. Ah apologies, my repo is more of a 'me' project and I havent thought about the dependecies much. Would a built example with debug files be sufficient?
  2. What do you mean about the thread-safety not being garuanteed? Is that because not all of the extensions would be 100% thread-safe? If so, I agree, and I think making it completely foolproof would be a nightmare, but the issue I had is very easy to make happen and easy to fix, so I think that it's worth it.
  3. Well removing the cache removes the access penalty and replaces it with the re-evaluation penalty doesn't it?
  4. Yeah, i could create my own I guess, but I would still have the problem of the ReflectionSource not being thread-safe, so that doesn't really fix anything.
  5. See (4)
ghost commented 2 years ago

So you've implemented the thread-safety in PR #217 which should fix the issues, thanks :). I'm probably going to go with your advice and finetune my formatter so I get the best performance as well. Regarding point (1), yeah, i know it's a bit funky getting the dependencies. The main problem is that I'm using source generators mixed with local projects, running an engine that doesn't support source generators, which fucks messes everything up. If you want, I can give you a debug build with full PDB. Since this is hopefully fixed, I'm assuming we can safely close the issue now?

p.s. where will I be able to get the new fixed version from, and when

axunonb commented 2 years ago

finetune my formatter so I get the best performance as well

You'll love it ;-)

p.s. where will I be able to get the new fixed version from, and when

Estimating 2-4 days for a v3.0.0-alpha.4 being available on NuGet.

ghost commented 2 years ago

Okie thank you very much. Can't wait for the new release <3 Closing this issue as it will soon be fixed

axunonb commented 2 years ago

Release v3.0.0-alpha.4 is published on NuGet.

axunonb commented 2 years ago

v3.0.0-alpha.5 is published now. Any feedback is welcome. @EternalClickbait Regarding this issue please make sure to set SmartSettings.IsThreadSafeMode=true