MadsKirkFoged / EngineeringUnits

Working with units made easy with automatic unit-check and converting between units
MIT License
42 stars 11 forks source link

bug: Race Condition when accessing the UnitSystem.CacheMultiply and UnitSystem.CacheDivide dictionaries #17

Closed mgoutrie closed 2 years ago

mgoutrie commented 2 years ago

The access to the caches of the multiplied and divided units is not protected from concurrent access. I had sporadic errors when running unit tests involving unit calculations/unit initializations. I reduced the problem to the followinf unit test which fails with a high probability:

        [Fact]// (Skip = "race condition demonstration")]
        [Exploratory]
        public void ParallelAccess2()
        {
            var unit1 = LengthUnit.Meter;
            var unit2 = DurationUnit.Millisecond;

            Func<UnitSystem> dividedUnit = () =>
            {
                var unit3 = unit1.Unit / unit2.Unit;
                return unit3;
            };

            List<Task> tasks = new List<Task>();
            for (int i = 1; i < 400; ++i)
            {
                tasks.Add(Task.Run(dividedUnit));
            }

            Task.WaitAll(tasks.ToArray());
        }

Observed behavior

System.ArgumentException
An item with the same key has already been added. Key: 1100879750
   at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
   at System.Collections.Generic.Dictionary`2.Add(TKey key, TValue value)
   at EngineeringUnits.UnitSystem.Divide(UnitSystem a, UnitSystem b)
   at EngineeringUnits.UnitSystem.op_Division(UnitSystem left, UnitSystem right)
   at EngineeringUnitsTests.<>c__DisplayClass6_0.<ParallelAccess2>b__0() 
   at System.Threading.Tasks.Task`1.InnerInvoke()
   at System.Threading.Tasks.Task.<>c.<.cctor>b__277_0(Object obj)
   at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
   at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)

I had another test with random units which showed some different errors, e.g.

System.InvalidOperationException
Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.
   at System.Collections.Generic.Dictionary`2.FindValue(TKey key)
   at System.Collections.Generic.Dictionary`2.TryGetValue(TKey key, TValue& value)
   at EngineeringUnits.UnitSystem.Divide(UnitSystem a, UnitSystem b)
   at EngineeringUnits.UnitSystem.op_Division(UnitSystem left, UnitSystem right)
   at EngineeringUnitsTests.<>c__DisplayClass5_0.<ParallelAccess>b__5() 
   at System.Threading.Tasks.Task`1.InnerInvoke()
   at System.Threading.Tasks.Task.<>c.<.cctor>b__277_0(Object obj)
   at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
   at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)

Expected behavior: No exception when doing unit calculations in different threads.

MadsKirkFoged commented 2 years ago

Hi mgoutrie,

Thanks for pointing this out! I was not aware that Dictionary was not thread-safe! Just swapped the Dictionary with ConcurrentDictionary and everything seems to work.

Then unittest you created is now working! If you have time I would very much appreciated if you would update to the newest version and test it out that it is also working in your system.

mgoutrie commented 2 years ago

Hi Mads, Thanks a lot for the very quick response. And thanks for this library which saved me a lot o headaches.

I can prove, that the problem is not reproducible anymore.

But unfortunately, when using the current master branch (commit 7d16012), I came across an inconsistent calculation:

        [TestMethod]
        public void ConsistentEqualsOperations()
        {
            var voltMinutes = new BaseUnit(2.0, new EngineeringUnits.UnitSystem(ElectricPotentialUnit.Volt * DurationUnit.Minute, "V\u00b7min"));
            var weber = EngineeringUnits.MagneticFlux.FromWeber(2 * 60);
            Assert.IsTrue(voltMinutes == weber);
            Assert.IsTrue(weber == voltMinutes);
            Assert.IsTrue(weber.Equals(voltMinutes));
            Assert.IsTrue(voltMinutes.Equals(weber));
            Assert.AreEqual(0, weber.CompareTo(voltMinutes));
            // The following assertion fails:
            Assert.AreEqual(0, voltMinutes.CompareTo(weber));
        }

My expectation was (weber.CompareTo(voltMinutes) == 0) <=> (voltMinutes.CompareTo(weber) == 0)

I suspect the calculation of

        private decimal ConvertValueInto(BaseUnit From)
        {
            return (decimal)ConvertionFactor(From) * NEWValue;
        }

introduces a numeric error. Perhaps it should be better

return (decimal)(ConvertionFactor(From) * (Fraction)NEWValue);

which is consistent with the calculation in BaseUnit.GetValueAs() which is used in the equality operator.

PS: perhaps the calculation of BaseUnit.baseValue should be changed in a similar way, but I'm not really sure.

With best regards, Mike

MadsKirkFoged commented 2 years ago

You make my job really easy when you both find the bugs and the solutions! :-D Thanks a lot! If you find other bugs or have feature requests, just let me know!
I have added your test and update the nuget!

mgoutrie commented 2 years ago

Thanks a lot for your fast responses!

Indeed I may have two other topics, but I'll have to prepare this next week. For the moment I'd close this issue as it is fully resolved :)