jjchiw / gelf4net

GELF log4net Appender - graylog2
MIT License
63 stars 59 forks source link

Web app hang from thread lock after AsyncGelfUdpAppender.Append() #68

Closed surachai-tep closed 3 years ago

surachai-tep commented 5 years ago

Hi, We've found our web application hang after started using gelf4net on some of our production server. The problem comes a few hour after application start. We decided to dump memory while it hang for analysis and here is what we found.

------Type of Analysis Performed Hang Analysis------------ Machine Name
Operating System Windows Server 2008 R2Service Pack 1 Number Of Processors:
Process ID 9856 Process Image
Command Line
System Up-Time 100 day(s) 14:55:06 Process Up-Time 02:54:09 Processor Type X64 Process Bitness 64-Bit

------------- .NET Threads Summary -------------- Total Threads: 569 Running Threads: 569 Idle Threads: 0 Max Threads: 32767 Min Threads: 120


82.32% of threads blocked (568 threads) Look at the callstack of thread 143 to see why that thread id waiting on and why it is not releasing the lock.

--------- Stacktrace of thread no. 143 ---------------

[[HelperMethodFrame_1OBJ] (System.Threading.Monitor.ObjWait)] System.Threading.Monitor.ObjWait(Boolean, Int32, System.Object) 
mscorlib_ni!System.Threading.SemaphoreSlim.WaitUntilCountOrTimeout(Int32, UInt32, System.Threading.CancellationToken)+68 
mscorlib_ni!System.Threading.SemaphoreSlim.Wait(Int32, System.Threading.CancellationToken)+149 
System_ni!System.Collections.Concurrent.BlockingCollection`1[[System.__Canon, mscorlib]].TryAddWithNoTimeValidation(System.__Canon, Int32, System.Threading.CancellationToken)+104 
Gelf4Net.Appender.AsyncGelfUdpAppender.Append(log4net.Core.LoggingEvent)+61 
log4net.Appender.AppenderSkeleton.DoAppend(log4net.Core.LoggingEvent)+fa 
log4net.Util.AppenderAttachedImpl.AppendLoopOnAppenders(log4net.Core.LoggingEvent)+a6 
log4net.Repository.Hierarchy.Logger.CallAppenders(log4net.Core.LoggingEvent)+70 
log4net.Repository.Hierarchy.Logger.Log(System.Type, log4net.Core.Level, System.Object, System.Exception)+66 
log4net.Core.LogImpl.Info(System.Object)+47 

----------- Many threads that were blocked by 143 were on Enter lock state ----------------

[[HelperMethodFrame] (System.Threading.Monitor.Enter)] System.Threading.Monitor.Enter(System.Object) 
log4net.Appender.AppenderSkeleton.DoAppend(log4net.Core.LoggingEvent)+45 
log4net.Util.AppenderAttachedImpl.AppendLoopOnAppenders(log4net.Core.LoggingEvent)+a6 
log4net.Repository.Hierarchy.Logger.CallAppenders(log4net.Core.LoggingEvent)+70 
log4net.Repository.Hierarchy.Logger.Log(System.Type, log4net.Core.Level, System.Object, System.Exception)+66 
log4net.Core.LogImpl.Info(System.Object)+47 

---------- Here is our appender configuration -------

<appender name="AsyncGelfUdpAppender" type="Gelf4Net.Appender.AsyncGelfUdpAppender, Gelf4Net">
        <remoteAddress value="10.94.24.104" />
        <remotePort value="12201" />
        <bufferSize value="200" />
        <threads value="2" />
        <layout type="Gelf4Net.Layout.GelfLayout, Gelf4Net">
            <param name="AdditionalFields" value="App:Web,Version:2.0,Level:%level,Exception:%exception" />
            <param name="Facility" value="OurApp" />
            <param name="IncludeLocationInformation" value="false" />
            <param name="SendTimeStampAsString" value="false"/>
            <param name="LogStackTraceFromMessage" value="true" />

        </layout>
</appender>

Is there anything wrong on our configuration that cause thread block? Any idea or suggestion would be appreciated.

Thank you

jjchiw commented 5 years ago

Hi

Can you test with GelfUdpAppender?

I'm going to create a load tester with siege during the weekend and try to find the issue

surachai-tep commented 5 years ago

Hi jjchiew, We try GelfUdpAppender today and found it's working fine with no hang for 8 hours. We'll keep monitoring for week to make sure that it'll not get the same issue like the async one.

Thanks.

jjchiw commented 5 years ago

Great!

I'll take a look to all the Async*Appenders