dotnetcore / EasyCaching

:boom: EasyCaching is an open source caching library that contains basic usages and some advanced usages of caching which can help us to handle caching more easier!
MIT License
1.97k stars 325 forks source link

Rare stack overflow error in EasyCaching.Disk #523

Closed hakakou closed 7 months ago

hakakou commented 8 months ago

Description

Stack overflow error in ASP.NET Core IIS points to EasyCaching.Disk as culprit.

Steps to Reproduce

Cannot reproduce. Happens in random every 2-3 days on IIS server running ASP.NET Core 8 project. Because it's a stack overflow error, the only way to get logs from the process is the following:

1) Add environment variable and restart IIS

DOTNET_DbgEnableMiniDump=1

2) In web.config, change the following to

<aspNetCore processPath=".\Ap.Hub.Blazor.exe" stdoutLogEnabled="true" 
      stdoutLogFile=".\logs\stdout" hostingModel="inprocess">
  <handlerSettings>
    <handlerSetting name="debugFile" value=".\logs\aspnetcore-debug.log" />
    <handlerSetting name="debugLevel" value="FILE,TRACE" />
  </handlerSettings>
/aspNetCore>

Related code

Stack overflow.
   at System.Runtime.CompilerServices.RuntimeHelpers.AllocateUninitializedClone(System.Object)
   at EasyCaching.Disk.DefaultDiskCachingProvider.GetMd5Str(System.String)
   at EasyCaching.Disk.DefaultDiskCachingProvider.BuildMd5Path(System.String)
   at EasyCaching.Disk.DefaultDiskCachingProvider+<BaseGetAsync>d__4`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[EasyCaching.Disk.DefaultDiskCachingProvider+<BaseGetAsync>d__4`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], EasyCaching.Disk, Version=1.9.2.0, Culture=neutral, PublicKeyToken=null]](<BaseGetAsync>d__4`1<System.__Canon> ByRef)
   at EasyCaching.Disk.DefaultDiskCachingProvider.BaseGetAsync[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.String, System.Func`1<System.Threading.Tasks.Task`1<System.__Canon>>, System.TimeSpan, System.Threading.CancellationToken)
   at EasyCaching.Core.EasyCachingAbstractProvider+<GetAsync>d__84`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[EasyCaching.Core.EasyCachingAbstractProvider+<GetAsync>d__84`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], EasyCaching.Core, Version=1.9.2.0, Culture=neutral, PublicKeyToken=null]](<GetAsync>d__84`1<System.__Canon> ByRef)
   at EasyCaching.Core.EasyCachingAbstractProvider.GetAsync[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.String, System.Func`1<System.Threading.Tasks.Task`1<System.__Canon>>, System.TimeSpan, System.Threading.CancellationToken)
   at EasyCaching.Disk.DefaultDiskCachingProvider+<BaseGetAsync>d__4`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[EasyCaching.Disk.DefaultDiskCachingProvider+<BaseGetAsync>d__4`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], EasyCaching.Disk, Version=1.9.2.0, Culture=neutral, PublicKeyToken=null]](<BaseGetAsync>d__4`1<System.__Canon> ByRef)
   at EasyCaching.Disk.DefaultDiskCachingProvider.BaseGetAsync[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.String, System.Func`1<System.Threading.Tasks.Task`1<System.__Canon>>, System.TimeSpan, System.Threading.CancellationToken)
   at EasyCaching.Core.EasyCachingAbstractProvider+<GetAsync>d__84`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[EasyCaching.Core.EasyCachingAbstractProvider+<GetAsync>d__84`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], EasyCaching.Core, Version=1.9.2.0, Culture=neutral, PublicKeyToken=null]](<GetAsync>d__84`1<System.__Canon> ByRef)
   at EasyCaching.Core.EasyCachingAbstractProvider.GetAsync[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.String, System.Func`1<System.Threading.Tasks.Task`1<System.__Canon>>, System.TimeSpan, System.Threading.CancellationToken)
   at EasyCaching.Disk.DefaultDiskCachingProvider+<BaseGetAsync>d__4`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[EasyCaching.Disk.DefaultDiskCachingProvider+<BaseGetAsync>d__4`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], EasyCaching.Disk, Version=1.9.2.0, Culture=neutral, PublicKeyToken=null]](<BaseGetAsync>d__4`1<System.__Canon> ByRef)
   at EasyCaching.Disk.DefaultDiskCachingProvider.BaseGetAsync[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.String, System.Func`1<System.Threading.Tasks.Task`1<System.__Canon>>, System.TimeSpan, System.Threading.CancellationToken)
   at EasyCaching.Core.EasyCachingAbstractProvider+<GetAsync>d__84`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()

................. Repeats

Specifications

dplarina commented 7 months ago

So this appears to be a symptom of another problem. I enabled logging and noticed that it was repeatedly logging "Cache Missed : cachekey = test-en.json".

The core issue appears to be that the keys.dat (and therefore the _cacheKeysMap in the disk caching provider) somehow gets into a broken state with a lock set for the cache key that is never cleared for some reason.

The TryAdd returns false here as there is an existing lock https://github.com/dotnetcore/EasyCaching/blob/914d7e8027fd1202ddb50d9f4ece5e10fc72fa20/src/EasyCaching.Disk/DefaultDiskCachingProvider.Async.cs#L130-L134

If sleeps for SleepMs and calls GetAsync which starts another recursion. Since the lock is never released, we end up with a stackoverflow after a number of failures.

Even restarting the app doesn't seem to resolve it as the lock is stored in the keys.dat

Here is my keys.dat

C:\Dev\.cache\test.json_Lock
C:\Dev\.cache\test.html
5E88E187872DFD1AA0F7882D94D324FF
C:\Dev\.cache\test.css

Maybe the provider should remove, or at least ignore any existing locks in the cacheKeyMap on start? Or there should be a limit to the number of retry attempts?

Ideally it shouldn't get into a broken state in the first place, but I'm not sure how that happens.