andrewdavey / cassette

Manages .NET web application assets (scripts, css and templates)
http://getcassette.net
MIT License
534 stars 143 forks source link

Infinite loop between ExceptionCatchingBundleCollectionInitializer and BundleCollection.GetReadLock #460

Open mhoyer opened 10 years ago

mhoyer commented 10 years ago

We recently found weird behavior related to Cassette in our web application for a quite large customer. Sometimes a HTTP requests from a users browser took minutes to respond on our live environment. We sadly couldn't reproduce it on our internal environments so the only way to find out about the actual reason was by dumping the w3p.exe process of live IIS and inspecting it with WinDbg. So fasten your seatbelt and see what we did:

Requirements

I used

Open WinDbg and with File > Load Crash Dump I loaded the dump.

Preparing the session

.loadby sos clr
.load sosex
!bhi

Find the threads with locks

0:0> !mlocks

ClrThread  DbgThread  OsThread    LockType    Lock              LockLevel
------------------------------------------------------------------------------
....
0xb2       113        0x663c0     thinlock    000000f441022170  (recursion:0)
0x8e  !!   140        0x68eb4     SyncBlock   000000f87cf57408                
0x8e  !!   140        0x68eb4     RWLockSlim  000000f240dbb3e0  Writer        
0x8e       140        0x68eb4     thinlock    000000f4014e8440  (recursion:0)
....

Okay. DbgThread 140 looks interesting. So I switched to it with:

0:0> ~140s

Now list the full stack trace of this thread:

0:140> !mk

Scrolling through the very long list of calls in the stack I found a lot (more than 100 times) of loops over those 8 lines:

0:140> !mk
...
19d:M 000000f18abdc080 00007ffcb55f8fd0 Cassette.Caching.ResourceAssetCacheValidator.IsValid(System.String, System.DateTime)(+0xb IL,+0x40 Native)
19e:M 000000f18abdc110 00007ffcb55f8d93 Cassette.Caching.ManifestValidator.Cassette.IBundleVisitor.Visit(Cassette.IAsset)(+0x28 IL,+0x63 Native)
--- 
19f:M 000000f18abdc150 00007ffcb55f892b Cassette.CachedBundleContent.Accept(Cassette.IBundleVisitor)(+0x17 IL,+0x5b Native)
1a0:M 000000f18abdc1a0 00007ffcb55f51d6 Cassette.Bundle.Accept(Cassette.IBundleVisitor)(+0x1f IL,+0xa6 Native)
1a1:M 000000f18abdc230 00007ffcb55f8b99 Cassette.Caching.ManifestValidator.IsValid(Cassette.Caching.Manifest)(+0x2f IL,+0x69 Native)
1a2:M 000000f18abdc280 00007ffcb0229fb8 Cassette.CacheAwareBundleCollectionInitializer.Initialize(Cassette.BundleCollection)(+0x42 IL,+0xa8 Native)
1a3:M 000000f18abdc2f0 00007ffcb0229e42 Cassette.ExceptionCatchingBundleCollectionInitializer.Initialize(Cassette.BundleCollection)(+0xc IL,+0x32 Native)
1a4:M 000000f18abdc330 00007ffcb55f91c9 Cassette.BundleCollection.GetReadLock()(+0x36 IL,+0xa9 Native)
1a5:M 000000f18abdc3b0 00007ffcb55f8fd0 Cassette.Caching.ResourceAssetCacheValidator.IsValid(System.String, System.DateTime)(+0xb IL,+0x40 Native)
1a6:M 000000f18abdc440 00007ffcb55f8d93 Cassette.Caching.ManifestValidator.Cassette.IBundleVisitor.Visit(Cassette.IAsset)(+0x28 IL,+0x63 Native)
--- 
1a7:M 000000f18abdc480 00007ffcb55f892b Cassette.CachedBundleContent.Accept(Cassette.IBundleVisitor)(+0x17 IL,+0x5b Native)
1a8:M 000000f18abdc4d0 00007ffcb55f51d6 Cassette.Bundle.Accept(Cassette.IBundleVisitor)(+0x1f IL,+0xa6 Native)
...

Thus, I inspected all involved types of those lines:

After digging into source code of Cassette for those particular types I found BundleCollection.GetReadLock() and ExceptionCatchingBundleCollectionInitializer quite interesting regarding the stateful property BundleCollection.InitializationException.

Now I ran the !mdso command for this particular type holding the BundleCollection.InitializationException and click the Object address link in the output:

0:140> !mdso /t:Cassette.BundleCollection

Thread 140:
Location          Object            Type
------------------------------------------------------------
000000f18abd2010  000000f240dbb340  Cassette.BundleCollection
                  ^^^^^^^^^^^^^^^^

This link (000000f240dbb340) automatically executed this command:

0:140> !mdt 000000f240dbb340
000000f240dbb340 (Cassette.BundleCollection)
    bundles:000000f240dbb398 (System.Collections.Generic.List`1[[Cassette.Bundle, Cassette]])
    settings:000000f240d9d0f8 (Cassette.CassetteSettings)
    fileSearchProvider:000000f240dba978 (Cassette.FileSearchProvider)
    bundleFactoryProvider:000000f240dba9e8 (Cassette.BundleFactoryProvider)
    bundleCollectionInitializer:000000f240dbb328 (Cassette.ExceptionCatchingBundleCollectionInitializer)
    readerWriterLock:000000f240dbb3e0 (System.Threading.ReaderWriterLockSlim)
    bundleImmediateReferences:000000f240e376d8 (System.Collections.Generic.Dictionary`2[[Cassette.Bundle, Cassette],[Cassette.Utilities.HashedSet`1[[Cassette.Bundle, Cassette]], Cassette]])
!!  initializationException:000000f441dd5ed8 (System.Threading.ThreadAbortException)
    Changed:000000f240dbc3c8 (System.EventHandler`1[[Cassette.BundleCollectionChangedEventArgs, Cassette]])

See the initializationException field inside the output? It's of type ThreadAbortException!

My guess is the IIS app pool was recycled, kicked, thread-aborted, whatever and now the looping starts. Especially in conjunction with the implementation of ExceptionCatchingBundleCollectionInitializer: I am wondering why there is no more concrete exception type matching in the try-catch block of its Initialize method?

mhoyer commented 10 years ago

Side notes

Related sources

joekrill commented 10 years ago

I, too, am having this problem. It is extremely sporadic, and I haven't been able to figure out how to consistently reproduce it at will. But when it DOES happen, it continues to happen consistently for a while thereafter (that is, it's not just a random one-off thing every so often -- once it occurs, it continues occuring). I usually try things like: deleting the Cassette cache, rebuilding my project, modifying the web.config settings... and at some point it will start working again. But I can't find one particular thing that seems to fix the problem.

One thing that seems to instigate this issue more regularly is modifying or adding files to the file system. For example, I'm working with ReactJS and have a weather component. I dropped in a "weather-icons" directory where my script is, which contains some CSS and font files. And that caused cassette to blow up with this StackOverflowException. I've also experienced it after just modifying some of the javascript files themselves.

I'll continue to investigate but if anyone has any ideas -- or something more specific -- I'm happy to test or provide more feedback.

Some details about my setup:

kamranayub commented 8 years ago

Perhaps related perhaps not, but I was just experiencing an issue I traced down to Cassette which has similar symptoms--except it was EVERY request that used Cassete was causing a stack overflow and crashing my w3wp process. The cause was I was missing JS files when my project built in Azure, so references (I assume) were missing and Cassette was having some trouble. The symptom I saw was 502 Bad Gateway errors on every request that included a reference to a Cassette bundle. My MVC actions that didn't return a full view worked fine and once I discovered the cause (missing JS files), adding them back fixed it.

Now that I know what causes it, I could probably try reproducing it locally to see what Cassette is doing.