pieceofsummer / Hangfire.Console

Job console extension for Hangfire
MIT License
436 stars 80 forks source link

Hangfire.Console unhandled exception crashes BackgroundJobServer #54

Open bbollard opened 6 years ago

bbollard commented 6 years ago

Hi, I'm evaluating Hangfire and your cool extension, Hangfire.Console. I am experiencing what appears to be, based on the stack trace, SQL based unhandled exceptions within the Hangfire.Console.Storage.ConsoleStorage.AddLine method is taking down the Hangfire BackgroundJobServer. I don't know if there is a certain design pattern for how you use Hangfire.Console or how your run the BackgroundJobServer to make it more resilient to these types of issues.

Your feedback is appreciated.

pieceofsummer commented 6 years ago

This is pretty strange, as any unhandled exceptions inside job method should take down the job, but not the entire BackgroundJobServer. Can you please provide some stack trace for further investigation?

bbollard commented 6 years ago

System.Transactions Critical: 0 : http://msdn.microsoft.com/TraceCodes/System/ActivityTracing/2004/07/Reliability/Exception/UnhandledUnhandled exceptionHangfireConsoleServer.exeSystem.InvalidOperationException, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached. at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal&amp;amp; connection) at System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource1 retry, DbConnectionOptions userOptions) at System.Data.ProviderBase.DbConnectionClosed.TryOpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource1 retry, DbConnectionOptions userOptions) at System.Data.SqlClient.SqlConnection.TryOpenInner(TaskCompletionSource1 retry) at System.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource1 retry) at System.Data.SqlClient.SqlConnection.Open() at Hangfire.SqlServer.SqlServerStorage.CreateAndOpenConnection() at Hangfire.SqlServer.SqlServerStorage.UseConnection[T](DbConnection dedicatedConnection, Func2 func) at Hangfire.SqlServer.SqlServerStorage.UseTransaction[T](DbConnection dedicatedConnection, Func3 func, Nullable1 isolationLevel) at Hangfire.SqlServer.SqlServerStorage.UseTransaction(DbConnection dedicatedConnection, Action2 action) at Hangfire.SqlServer.SqlServerWriteOnlyTransaction.Commit() at Hangfire.Console.Storage.ConsoleStorage.AddLine(ConsoleId consoleId, ConsoleLine line) at Hangfire.Console.Server.ConsoleContext.AddLine(ConsoleLine line) at Hangfire.Console.Server.ConsoleContext.WriteLine(String value, ConsoleTextColor color) at Hangfire.Console.ConsoleExtensions.WriteLine(PerformContext context, String value) at HangFileEval.MyWorkerClass.WorkerThread(Object ctx) in C:\Users\xxx\Documents\Visual Studio 2017\Projects\HangFileEval\HangFileEval\MyWorkerClass.cs:line 190 at System.Threading.ThreadHelper.ThreadStart_Context(Object state) at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx) at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx) at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state) at System.Threading.ThreadHelper.ThreadStart(Object obj)</StackTrace><ExceptionString>System.InvalidOperationException: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached. at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal&amp; connection) at System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource1 retry, DbConnectionOptions userOptions) at System.Data.ProviderBase.DbConnectionClosed.TryOpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource1 retry, DbConnectionOptions userOptions) at System.Data.SqlClient.SqlConnection.TryOpenInner(TaskCompletionSource1 retry) at System.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource1 retry) at System.Data.SqlClient.SqlConnection.Open() at Hangfire.SqlServer.SqlServerStorage.CreateAndOpenConnection() at Hangfire.SqlServer.SqlServerStorage.UseConnection[T](DbConnection dedicatedConnection, Func2 func) at Hangfire.SqlServer.SqlServerStorage.UseTransaction[T](DbConnection dedicatedConnection, Func3 func, Nullable1 isolationLevel) at Hangfire.SqlServer.SqlServerStorage.UseTransaction(DbConnection dedicatedConnection, Action2 action) at Hangfire.SqlServer.SqlServerWriteOnlyTransaction.Commit() at Hangfire.Console.Storage.ConsoleStorage.AddLine(ConsoleId consoleId, ConsoleLine line) at Hangfire.Console.Server.ConsoleContext.AddLine(ConsoleLine line) at Hangfire.Console.Server.ConsoleContext.WriteLine(String value, ConsoleTextColor color) at Hangfire.Console.ConsoleExtensions.WriteLine(PerformContext context, String value) at HangFileEval.MyWorkerClass.WorkerThread(Object ctx) in C:\Users\xxx\Documents\Visual Studio 2017\Projects\HangFileEval\HangFileEval\MyWorkerClass.cs:line 190 at System.Threading.ThreadHelper.ThreadStart_Context(Object state) at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx) at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx) at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state) at System.Threading.ThreadHelper.ThreadStart(Object obj) An unhandled exception of type 'System.InvalidOperationException' occurred in System.Data.dll

I am using SQL Server 2016 and my connection string is not defining any connection pool settings.

Regards, Brian

bbollard commented 6 years ago

I am not sure if I am testing in a reliable way -- I'll change up how I'm testing but there's the stack trace -- it's taking out the Hangfire job itself because my code doesn't have any exception handling in it, either.

pieceofsummer commented 6 years ago

As I see it, you're having troubles getting a connection from connection pool, which is pretty much a stopper for Hangfire. There's a high probability it won't commit any further state changes even without Console present. Are you sure you don't have any connection limit set for your instance of SQL Server or anything?

You also need to check Hangfire log if it is really stopping all the processing because Hangfire workers are designed to retry even if there was an exception. It might be just the exception is logged, but other things keep processing.

teodimache commented 6 years ago

Hello,

We're experiencing a similar issue where a timeout occurring inside a call to ConsoleContext.AddLine takes down the entire BackgroundJobServer (look for FATAL log entries bellow)

This is strange as you can also see WARN log entries where the AutomaticRetryAttribute kicks in for timeous that occur in ConsoleStorage.InitConsole, which is expected. So for some reason only the ConsoleContext.AddLine takes down the server.

Note that the Hangfire BackgroundJobServer is running as a windows service with a Topshelf wrapper.

Log with stack trace: 2018-07-09 12:38:43.9156|WARN|Hangfire.AutomaticRetryAttribute|Failed to process the job '7559167': an exception occurred. Retry attempt 1 of 3 will be performed in 00:00:30. System.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption) at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest) at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry) at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry) at System.Data.SqlClient.SqlCommand.ExecuteNonQuery() at Hangfire.SqlServer.SqlCommandBatch.ExecuteNonQuery() at Hangfire.SqlServer.SqlServerConnection.<>cDisplayClass14_0.b0(DbConnection connection, DbTransaction transaction) at Hangfire.SqlServer.SqlServerStorage.<>cDisplayClass26_0.b0(DbConnection connection, DbTransaction transaction) at Hangfire.SqlServer.SqlServerStorage.<>cDisplayClass27_01.<UseTransaction>b__0(DbConnection connection) at Hangfire.SqlServer.SqlServerStorage.UseConnection[T](DbConnection dedicatedConnection, Func2 func) at Hangfire.SqlServer.SqlServerStorage.UseTransaction[T](DbConnection dedicatedConnection, Func3 func, Nullable1 isolationLevel) at Hangfire.SqlServer.SqlServerStorage.UseTransaction(DbConnection dedicatedConnection, Action2 action) at Hangfire.SqlServer.SqlServerConnection.SetRangeInHash(String key, IEnumerable1 keyValuePairs) at Hangfire.Console.Storage.ConsoleStorage.InitConsole(ConsoleId consoleId) at Hangfire.Console.Server.ConsoleContext..ctor(ConsoleId consoleId, IConsoleStorage storage) at Hangfire.Console.Server.ConsoleServerFilter.OnPerforming(PerformingContext filterContext) at Hangfire.Server.BackgroundJobPerformer.InvokePerformFilter(IServerFilter filter, PerformingContext preContext, Func1 continuation) ClientConnectionId:0ede2fc6-d2fb-4da3-9cc1-90150249f240 Error Number:-2,State:0,Class:11 2018-07-09 12:38:51.0312|**WARN**|Hangfire.AutomaticRetryAttribute|Failed to process the job '7559172': an exception occurred. Retry attempt 1 of 3 will be performed in 00:00:38. System.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption) at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest) at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry) at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry) at System.Data.SqlClient.SqlCommand.ExecuteNonQuery() at Hangfire.SqlServer.SqlCommandBatch.ExecuteNonQuery() at Hangfire.SqlServer.SqlServerConnection.<>c__DisplayClass14_0.b0(DbConnection connection, DbTransaction transaction) at Hangfire.SqlServer.SqlServerStorage.<>cDisplayClass26_0.b0(DbConnection connection, DbTransaction transaction) at Hangfire.SqlServer.SqlServerStorage.<>cDisplayClass27_0`1.b0(DbConnection connection) at Hangfire.SqlServer.SqlServerStorage.UseConnection[T](DbConnection dedicatedConnection, Func2 func) at Hangfire.SqlServer.SqlServerStorage.UseTransaction[T](DbConnection dedicatedConnection, Func3 func, Nullable1 isolationLevel) at Hangfire.SqlServer.SqlServerStorage.UseTransaction(DbConnection dedicatedConnection, Action2 action) at Hangfire.SqlServer.SqlServerConnection.SetRangeInHash(String key, IEnumerable1 keyValuePairs) at Hangfire.Console.Storage.ConsoleStorage.InitConsole(ConsoleId consoleId) at Hangfire.Console.Server.ConsoleContext..ctor(ConsoleId consoleId, IConsoleStorage storage) at Hangfire.Console.Server.ConsoleServerFilter.OnPerforming(PerformingContext filterContext) at Hangfire.Server.BackgroundJobPerformer.InvokePerformFilter(IServerFilter filter, PerformingContext preContext, Func1 continuation) ClientConnectionId:956facb6-38ed-4d71-8737-8e18786af576 Error Number:-2,State:0,Class:11 2018-07-09 12:38:51.1852|FATAL|Topshelf.Runtime.Windows.WindowsServiceHost|The service threw an unhandled exception System.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption) at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest) at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry) at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry) at System.Data.SqlClient.SqlCommand.ExecuteNonQuery() at Hangfire.SqlServer.SqlCommandBatch.ExecuteNonQuery() at Hangfire.SqlServer.SqlServerWriteOnlyTransaction.b7_0(DbConnection connection, DbTransaction transaction) at Hangfire.SqlServer.SqlServerStorage.<>c__DisplayClass26_0.b0(DbConnection connection, DbTransaction transaction) at Hangfire.SqlServer.SqlServerStorage.<>cDisplayClass27_01.<UseTransaction>b__0(DbConnection connection) at Hangfire.SqlServer.SqlServerStorage.UseConnection[T](DbConnection dedicatedConnection, Func2 func) at Hangfire.SqlServer.SqlServerStorage.UseTransaction[T](DbConnection dedicatedConnection, Func3 func, Nullable1 isolationLevel) at Hangfire.SqlServer.SqlServerStorage.UseTransaction(DbConnection dedicatedConnection, Action`2 action) at Hangfire.SqlServer.SqlServerWriteOnlyTransaction.Commit() at Hangfire.Console.Storage.ConsoleStorage.AddLine(ConsoleId consoleId, ConsoleLine line) at Hangfire.Console.Server.ConsoleContext.AddLine(ConsoleLine line) at Hangfire.Console.Server.ConsoleContext.WriteLine(String value, ConsoleTextColor color) at Schlep.EtlRunner.<>c__DisplayClass1_0.b1(String stdout) at CliWrap.Services.BufferHandler.HandleStandardOutput(String line) at CliWrap.Cli.<>c__DisplayClass13_5.b__1(Object sender, DataReceivedEventArgs args) at System.Diagnostics.Process.OutputReadNotifyUser(String data) at System.Diagnostics.AsyncStreamReader.FlushMessageQueue() at System.Diagnostics.AsyncStreamReader.GetLinesFromStringBuilder() at System.Diagnostics.AsyncStreamReader.ReadBuffer(IAsyncResult ar) at System.IO.Stream.ReadWriteTask.InvokeAsyncCallback(Object completedTask) at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx) at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx) at System.IO.Stream.ReadWriteTask.System.Threading.Tasks.ITaskCompletionAction.Invoke(Task completingTask) at System.Threading.Tasks.Task.FinishContinuations() at System.Threading.Tasks.Task.FinishStageThree() at System.Threading.Tasks.Task.FinishStageTwo() at System.Threading.Tasks.Task.Finish(Boolean bUserDelegateExecuted) at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot) at System.Threading.Tasks.Task.ExecuteEntry(Boolean bPreventDoubleExecution) at System.Threading.Tasks.Task.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem() at System.Threading.ThreadPoolWorkQueue.Dispatch() at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback() ClientConnectionId:a3c45b8a-8640-44eb-9987-dcfedda1dbac Error Number:-2,State:0,Class:11 2018-07-09 12:38:51.1892|INFO|Schlep.Server.Hangfire.HangfireService|Stopping Service.. 2018-07-09 12:38:51.1892|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559171'. It will be re-queued. 2018-07-09 12:38:51.1892|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559166'. It will be re-queued. 2018-07-09 12:38:51.1892|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559164'. It will be re-queued. 2018-07-09 12:38:51.1892|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559165'. It will be re-queued. 2018-07-09 12:38:51.2042|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559026'. It will be re-queued. 2018-07-09 12:38:52.1333|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559174'. It will be re-queued. 2018-07-09 12:38:52.1453|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559176'. It will be re-queued. 2018-07-09 12:38:52.1573|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559175'. It will be re-queued. 2018-07-09 12:38:52.1573|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559178'. It will be re-queued. 2018-07-09 12:38:52.1763|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559177'. It will be re-queued. 2018-07-09 12:38:52.1763|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559180'. It will be re-queued. 2018-07-09 12:38:52.1943|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559179'. It will be re-queued. 2018-07-09 12:38:52.2043|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559181'. It will be re-queued. 2018-07-09 12:38:52.2043|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559182'. It will be re-queued. 2018-07-09 12:38:52.2273|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559183'. It will be re-queued. 2018-07-09 12:38:56.5276|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559185'. It will be re-queued. 2018-07-09 12:38:56.5326|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559187'. It will be re-queued. 2018-07-09 12:38:56.5326|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559188'. It will be re-queued. 2018-07-09 12:38:59.6148|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7558920'. It will be re-queued. 2018-07-09 12:38:59.6148|INFO|Hangfire.Server.Worker|Shutdown request requested while processing background job '7559162'. It will be re-queued. 2018-07-09 12:39:00.3299|INFO|Hangfire.BackgroundJobServer|Hangfire Server stopped.