HangfireIO / Hangfire

An easy way to perform background job processing in .NET and .NET Core applications. No Windows Service or separate process required
https://www.hangfire.io
Other
9.38k stars 1.7k forks source link

Jobs are en-queued but it's not processing #1218

Closed Damodark closed 2 years ago

Damodark commented 6 years ago

Hi Team,

I've the job which is reading some data from sql db and adding that in console. Seems like heartbeat is fine and jobs are enqueued but the processing part is not working. Have attached screen shots of server and jobs. Would you please let me know what am I missing.

hangfirequeue hangfireserver

Best Regards, Damodar

witskeeper commented 6 years ago

same problem

svachha commented 5 years ago

same problem. any help appreciated

sethsteenken commented 5 years ago

Also having the same issue. Restarting the windows service (which runs the Hangfire service) fixes the issue for a while, even a couple days. Then all of a sudden jobs are en-queued and not picked up for processing. Dashboard (on separate web app) says the Hangfire server is live and connected and good heartbeat, etc.

odinserj commented 5 years ago

Please use the STDump utility to obtain stack traces of your threads, when everything is blocked, and post them here.

svachha commented 5 years ago

I think its relates to azure servicebus. After job is processed if you re queue using dashboard it stays stuck

sethsteenken commented 5 years ago

Sure thing, thank you for the quick response. I currently have queued jobs that are not getting picked up. I also notice that now my server heartbeat is 7 hours. Here's the output of running stdump on the server process:

STDump v0.1 - Writes process managed stack traces
Copyright (C) 2016 Sergey Odinokov
Based on Microsoft.Diagnostics.Runtime (CLR MD). Copyright (C) Microsoft Corporation.

Writes managed stack trace of a running process, or from a minidump file to console.

Found CLR v4.7.3221.00

Following AppDomains found:

AppDomain #1
  Address:       21228336
  Name:          MY.Processing.Service.exe
  Configuration: D:\Applications\MY_APP\Processing\MY.Processing.Service.exe.Config
  Directory:     D:\Applications\MY_APP\Processing\

Following threads found:

Thread #1
  OS Thread ID:      3296
  AppDomain Address: 21228336
  Type:              Foreground

  Managed stack trace:
   - [InlinedCallFrame] at UNKNOWN
   - DomainBoundILStubClass.IL_STUB_PInvoke(IntPtr) at System.ServiceProcess
   - [InlinedCallFrame] at UNKNOWN
   - System.ServiceProcess.ServiceBase.Run(System.ServiceProcess.ServiceBase[]) at System.ServiceProcess
   - MY.Processing.Service.Program.Main() at MY.Processing.Service
   - [GCFrame] at UNKNOWN

Thread #2
  OS Thread ID:      3764
  AppDomain Address: 21228336
  Type:              Background
  Role:              Finalizer

  Managed stack trace:
   - [DebuggerU2MCatchHandlerFrame] at UNKNOWN

Thread #6
  OS Thread ID:      3088
  AppDomain Address: 21228336
  Type:              Background
  Role:              Threadpool Worker

  No managed stack trace found for this thread.

Thread #7
  OS Thread ID:      2904
  AppDomain Address: 21228336
  Type:              Background

  Managed stack trace:
   - [GCFrame] at UNKNOWN
   - [HelperMethodFrame_1OBJ] (System.Threading.Monitor.ObjWait) at mscorlib
   - System.Threading.Monitor.Wait(System.Object, Int32, Boolean) at mscorlib
   - System.Threading.Monitor.Wait(System.Object, Int32) at mscorlib
   - System.Threading.ManualResetEventSlim.Wait(Int32, System.Threading.CancellationToken) at mscorlib
   - System.Threading.Tasks.Task.WaitAllBlockingCore(System.Collections.Generic.List`1<System.Threading.Tasks.Task>, Int32, System.Threading.CancellationToken) at mscorlib
   - System.Threading.Tasks.Task.WaitAll(System.Threading.Tasks.Task[], Int32, System.Threading.CancellationToken) at mscorlib
   - Hangfire.Server.BackgroundProcessingServer.Hangfire.Server.IBackgroundProcess.Execute(Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.ServerProcessExtensions.Execute(Hangfire.Server.IServerProcess, Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.AutomaticRetryProcess.Execute(Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.ServerProcessExtensions.Execute(Hangfire.Server.IServerProcess, Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.InfiniteLoopProcess.Execute(Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.ServerProcessExtensions.Execute(Hangfire.Server.IServerProcess, Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.ServerProcessExtensions.RunProcess(Hangfire.Server.IServerProcess, Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.ServerProcessExtensions+<>c__DisplayClass1_0.<CreateTask>b__0() at Hangfire.Core
   - System.Threading.Tasks.Task.InnerInvoke() at mscorlib
   - System.Threading.Tasks.Task.Execute() at mscorlib
   - System.Threading.Tasks.Task.ExecutionContextCallback(System.Object) at mscorlib
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) at mscorlib
   - System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) at mscorlib
   - System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef) at mscorlib
   - System.Threading.Tasks.Task.ExecuteEntry(Boolean) at mscorlib
   - System.Threading.Tasks.ThreadPoolTaskScheduler.LongRunningThreadWork(System.Object) at mscorlib
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at mscorlib
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) at mscorlib
   - System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) at mscorlib
   - System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at mscorlib
   - System.Threading.ThreadHelper.ThreadStart(System.Object) at mscorlib
   - [GCFrame] at UNKNOWN
   - [DebuggerU2MCatchHandlerFrame] at UNKNOWN

Thread #10
  OS Thread ID:      724
  AppDomain Address: 21228336
  Type:              Background

  Managed stack trace:
   - [HelperMethodFrame_1OBJ] (System.Threading.WaitHandle.WaitOneNative) at mscorlib
   - System.Threading.WaitHandle.InternalWaitOne(System.Runtime.InteropServices.SafeHandle, Int64, Boolean, Boolean) at mscorlib
   - System.Threading.WaitHandle.WaitOne(System.TimeSpan, Boolean) at mscorlib
   - System.Threading.WaitHandle.WaitOne(System.TimeSpan) at mscorlib
   - Hangfire.SqlServer.ExpirationManager.Execute(System.Threading.CancellationToken) at Hangfire.SqlServer
   - Hangfire.Server.ServerProcessExtensions.Execute(Hangfire.Server.IServerProcess, Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.AutomaticRetryProcess.Execute(Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.ServerProcessExtensions.Execute(Hangfire.Server.IServerProcess, Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.InfiniteLoopProcess.Execute(Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.ServerProcessExtensions.Execute(Hangfire.Server.IServerProcess, Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.ServerProcessExtensions.RunProcess(Hangfire.Server.IServerProcess, Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.ServerProcessExtensions+<>c__DisplayClass1_0.<CreateTask>b__0() at Hangfire.Core
   - System.Threading.Tasks.Task.InnerInvoke() at mscorlib
   - System.Threading.Tasks.Task.Execute() at mscorlib
   - System.Threading.Tasks.Task.ExecutionContextCallback(System.Object) at mscorlib
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) at mscorlib
   - System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) at mscorlib
   - System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef) at mscorlib
   - System.Threading.Tasks.Task.ExecuteEntry(Boolean) at mscorlib
   - System.Threading.Tasks.ThreadPoolTaskScheduler.LongRunningThreadWork(System.Object) at mscorlib
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at mscorlib
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) at mscorlib
   - System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) at mscorlib
   - System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at mscorlib
   - System.Threading.ThreadHelper.ThreadStart(System.Object) at mscorlib
   - [GCFrame] at UNKNOWN
   - [DebuggerU2MCatchHandlerFrame] at UNKNOWN

Thread #23
  OS Thread ID:      5860
  AppDomain Address: 21228336
  Type:              Background

  Managed stack trace:
   - [HelperMethodFrame_1OBJ] (System.Threading.WaitHandle.WaitOneNative) at mscorlib
   - System.Threading.WaitHandle.InternalWaitOne(System.Runtime.InteropServices.SafeHandle, Int64, Boolean, Boolean) at mscorlib
   - System.Threading.WaitHandle.WaitOne(System.TimeSpan, Boolean) at mscorlib
   - System.Threading.WaitHandle.WaitOne(System.TimeSpan) at mscorlib
   - Hangfire.Server.EveryMinuteThrottler.WaitASecondOrThrowIfCanceled(System.Threading.CancellationToken) at Hangfire.Core
   - Hangfire.Server.EveryMinuteThrottler.Throttle(System.Threading.CancellationToken) at Hangfire.Core
   - Hangfire.Server.RecurringJobScheduler.Execute(Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.ServerProcessExtensions.Execute(Hangfire.Server.IServerProcess, Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.AutomaticRetryProcess.Execute(Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.ServerProcessExtensions.Execute(Hangfire.Server.IServerProcess, Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.InfiniteLoopProcess.Execute(Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.ServerProcessExtensions.Execute(Hangfire.Server.IServerProcess, Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.ServerProcessExtensions.RunProcess(Hangfire.Server.IServerProcess, Hangfire.Server.BackgroundProcessContext) at Hangfire.Core
   - Hangfire.Server.ServerProcessExtensions+<>c__DisplayClass1_0.<CreateTask>b__0() at Hangfire.Core
   - System.Threading.Tasks.Task.InnerInvoke() at mscorlib
   - System.Threading.Tasks.Task.Execute() at mscorlib
   - System.Threading.Tasks.Task.ExecutionContextCallback(System.Object) at mscorlib
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) at mscorlib
   - System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) at mscorlib
   - System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef) at mscorlib
   - System.Threading.Tasks.Task.ExecuteEntry(Boolean) at mscorlib
   - System.Threading.Tasks.ThreadPoolTaskScheduler.LongRunningThreadWork(System.Object) at mscorlib
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at mscorlib
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) at mscorlib
   - System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) at mscorlib
   - System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at mscorlib
   - System.Threading.ThreadHelper.ThreadStart(System.Object) at mscorlib
   - [GCFrame] at UNKNOWN
   - [DebuggerU2MCatchHandlerFrame] at UNKNOWN

Thread #53
  OS Thread ID:      6112
  AppDomain Address: 21228336
  Type:              Background
  Role:              Threadpool Worker

  No managed stack trace found for this thread.

Thread #54
  OS Thread ID:      3860
  AppDomain Address: 21228336
  Type:              Background
  Role:              Threadpool Worker

  No managed stack trace found for this thread.
odinserj commented 5 years ago

@sethsteenken, thanks for the stack trace. I don't see any worker threads, and without the logs it's hard to say what happened. This may be caused by user code that throws the ThreadAbortException or by something more serious. Try to collect some logging messages – they should show the exception and its stack trace.

mglukhov commented 5 years ago

Sergey, I currently have the same problem, although Hangfire user to work succressfully for 2 years and without any problems.

mglukhov commented 5 years ago

In our case the problem was that we updated Hangfire.SqlServer.dll to version 1.6.21 from version 1.6.6.0, but Hangfire.Core.dll remained of version 1.6.6.0.

This caused the jobs being queued but not processed, and the Hangfire portal showed no error messages whatsoever, which made quite difficult to understand the cause of this problem.

NeenuSunil commented 5 years ago

Is there any update on this issue?

odinserj commented 5 years ago

@NeenuSunil, please provide any logs with exception stack traces, dashboard screenshots, configuration logic and output of the STDump to diagnose what happens.

SebFerraro commented 5 years ago

We are having the same issue with Hangfire. The following versions are installed:

Hangfire.Core 1.7.6 Hangfire.Dashboard.Management 1.7.5 Hangfire.Dashboard.Authorization 2.0.0 Hangfire.DocumentDB 2.1.0

Would you like me try and collect any additional logs by some means?

We had almost 3200 jobs enqueued overnight.

NeenuSunil commented 5 years ago

We don't have any issues now. We resolved the problem by 1.Updated Hangfire to 1.7.3 2.Renamed the Hangfire schema from "REPORT-Hangfire" to "ReportHangfire" to remove hyphen as there's a bug when schema names include hyphens.

Gheri commented 5 years ago

@NeenuSunil Can you point me the documentation which says there is a bug with hangfire if prefix names include hypen. I m using hangfire with redis. I have faced the above issue with Hangfire.Core 1.6 as well as 1.7.6 but i have noticed that my prefix names have hypens. I wanted to know if we have known issues list of hangfire which says that prefix names whould have hypens

kendaleiv commented 5 years ago

Submitted what is hopefully a fix for issues with dashes in schema name: https://github.com/HangfireIO/Hangfire/pull/1531

kunaldhande commented 4 years ago

Any update on this issue?

We are having the same issue with Hangfire. The following versions are installed: Running on .net core 3.0 Hangfire.AspNetCore 1.7.7 Hangfire.Core 1.7.7 Hangfire.SqlServer 1.7.7

The problem is only when I deploy it on azure app service linux, not on my local machine

georgeemr commented 4 years ago

I'm having this issue, some jobs are not processing and have days in the queue, and these jobs last at least 15 seconds to complete.

image

Gheri commented 4 years ago

Which Hangfire Version are you using? Are you redis Or sql?? If you are using redis, is it cluster mode enabled??

On Fri, 29 May 2020, 22:22 George Universe, notifications@github.com wrote:

I'm having this issue, some jobs are not processing and have days in the queue, and these jobs last at least 15 seconds to complete.

[image: image] https://user-images.githubusercontent.com/3822009/83284813-7b117a80-a19a-11ea-92f5-b4ab35b5fe81.png

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/HangfireIO/Hangfire/issues/1218#issuecomment-636076336, or unsubscribe https://github.com/notifications/unsubscribe-auth/AES6XSMQEWVO7BWFFF63GNLRT7R6JANCNFSM4FITZ2UQ .

jenswachtel commented 4 years ago

@kunaldhande we are having the same issues.

We used to be on Windows App Service and no issues. We recently migrated to a linux VM in azure and since then sporadically it freezes, even after a job successfully finished, the queue does not continue. We have to restart the service.. It's a really critical error for us. We have no idea how to troubleshoot as we don't find anything in logs.

Version:

Screen Shot 2020-05-31 at 10 30 09 AM

Using PostgreSQL, but before Memory and both have the same issues, I wouldn't say its DB related

georgeemr commented 4 years ago

Which Hangfire Version are you using? Are you redis Or sql?? If you are using redis, is it cluster mode enabled?? On Fri, 29 May 2020, 22:22 George Universe, @.***> wrote: I'm having this issue, some jobs are not processing and have days in the queue, and these jobs last at least 15 seconds to complete. [image: image] https://user-images.githubusercontent.com/3822009/83284813-7b117a80-a19a-11ea-92f5-b4ab35b5fe81.png — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#1218 (comment)>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AES6XSMQEWVO7BWFFF63GNLRT7R6JANCNFSM4FITZ2UQ .

Currently using SQL Server on Windows Server 2019. Hangfire version: 1.7.11 image

AelithBlanchett commented 4 years ago

Same error, using Hangfire version 1.7.11 on Linux, .NET Core 2.2.403. It happens randomly.

Mittemi commented 4 years ago

We got the same problem using Hangfire 1.7.11 and Hangfire Pro 2.2.2 on Ubuntu 1804 with .NET Core 3.1.301.

Storage is Azure SQL (S1 Tier)

Most jobs are stuck in the enqueued state or fail to transition to successful state upon work completion.

Any support highly is appreciated

AelithBlanchett commented 4 years ago

The only workaround for it that I've found is to completely restart the dotnet process every 8 hours or so, which is a real PITA.

I quite like Hangfire and don't want to replace it, but having to restart the whole stack just for that won't be feasible in the long-term. I wonder if anyone else has found a better workaround, or even a fix?

odinserj commented 4 years ago

It's almost impossible to find out what's happened without additional information, please see the following link and describe everything, including "ProTips" section. There are a lot of reasons for this to happen, including different deadlocks in background job methods themselves.

https://github.com/HangfireIO/Hangfire/blob/master/CONTRIBUTING.md

AelithBlanchett commented 4 years ago

I had saved the logs back then, when I knew for sure that the process had stalled in the same hour. The UI was still working, the jobs were just stuck in the enqueued state. There's only one line that's odd there, about the "DelayedJobScheduler recovered from the Faulted state". I'll try to post another set of logs as soon as the problem reappears, and maybe get the stack dump too.

Hangfire v1.7.11 Hangfire.MySql.Core v2.2.5 Linux (Debian) .NET Core 2.2.403 MariaDB

DEBUG 2020-06-13 15:47:02,582 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:47:02,582 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:47:02,604 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=locks:schedulepoller
DEBUG 2020-06-13 15:47:03,675 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:47:03,675 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:47:03,704 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=recurring-jobs:lock
DEBUG 2020-06-13 15:47:06,629 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:47:06,629 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:47:07,770 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:47:07,770 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:47:36,695 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=recurring-jobs:lock
DEBUG 2020-06-13 15:47:37,660 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:47:37,660 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:47:40,744 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:47:40,745 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:47:40,771 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=recurring-jobs:lock
DEBUG 2020-06-13 15:47:44,796 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:47:44,796 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:47:44,821 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=recurring-jobs:lock
DEBUG 2020-06-13 15:47:48,845 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:47:48,879 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:47:55,625 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=recurring-jobs:lock
DEBUG 2020-06-13 15:47:59,739 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:47:59,739 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:00,036 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=recurring-jobs:lock
DEBUG 2020-06-13 15:48:04,107 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:04,107 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:04,157 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=recurring-jobs:lock
DEBUG 2020-06-13 15:48:07,077 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=locks:schedulepoller
DEBUG 2020-06-13 15:48:08,208 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:08,208 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:08,246 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=recurring-jobs:lock
INFO  2020-06-13 15:48:11,192 [er #1] Hangfire.Processing.BackgroundExecution  - Execution DelayedJobScheduler recovered from the Faulted state after 00:00:34.5317200 and is in the Running state now
DEBUG 2020-06-13 15:48:11,192 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:48:11,192 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:48:11,228 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=locks:schedulepoller
DEBUG 2020-06-13 15:48:12,271 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:12,271 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:12,329 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=recurring-jobs:lock
DEBUG 2020-06-13 15:48:12,271 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:12,271 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:12,329 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=recurring-jobs:lock
DEBUG 2020-06-13 15:48:15,253 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:48:15,253 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:48:15,519 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=locks:schedulepoller
DEBUG 2020-06-13 15:48:16,354 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:16,354 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:16,412 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=recurring-jobs:lock
DEBUG 2020-06-13 15:48:19,570 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:48:19,570 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:48:19,649 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=locks:schedulepoller
DEBUG 2020-06-13 15:48:20,437 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:20,437 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:20,483 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=recurring-jobs:lock
DEBUG 2020-06-13 15:48:23,674 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:48:23,674 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:48:23,699 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=locks:schedulepoller
DEBUG 2020-06-13 15:48:24,507 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:24,507 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:24,582 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=recurring-jobs:lock
DEBUG 2020-06-13 15:48:27,775 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:48:27,775 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=locks:schedulepoller, timeout=00:01:00
DEBUG 2020-06-13 15:48:27,914 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=locks:schedulepoller
DEBUG 2020-06-13 15:48:28,607 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - MySqlDistributedLock resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:28,607 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Acquire resource=recurring-jobs:lock, timeout=00:01:00
DEBUG 2020-06-13 15:48:28,631 [er #1] Hangfire.MySql.Core.MySqlDistributedLock - Release resource=recurring-jobs:lock
odinserj commented 4 years ago

I see you are using MySql as a job storage which is a community-based extension. Also try to include the DEBUG log level - I see there's "Execution DelayedJobScheduler recovered from the Faulted state" message, and there also should be prior messages with DEBUG level with exact exception.

AelithBlanchett commented 4 years ago

After some digging in the official doc, I saw there were multiple MySql connectors.

This one in particular caught my eye because it's apparently fixing some deadlocks.

https://github.com/MiloszKrajewski/Hangfire.Storage.MySql

I'll try it whenever I'll have time to. I think it's worth a shot. Thanks and sorry in advance for wasting your time if that was the cause, @odinserj !

AelithBlanchett commented 4 years ago

Changing the connector to this one https://github.com/MiloszKrajewski/Hangfire.Storage.MySql seems to have fixed my issue. Haven't had to restart the server in a week.

skabariya commented 3 years ago

I am also having same problem, Jobs are en-queued but it's not processing

Using following version Hangfire.AspNetCore v1.6.29 Hangfire.MySqlStorage v2.0.3

Mysql storage configuration

services.AddHangfire(config => { config.UseStorage( new MySqlStorage("***connection string***", new MySqlStorageOptions { PrepareSchemaIfNecessary = false })); });

app.UseHangfireServer( new BackgroundJobServerOptions { WorkerCount = 1 });

Here is the error log

{ "@t": "2021-03-29T12:56:43.0892847Z", "@mt": "{State:l}", "@r": [ "Error occurred during execution of 'Worker #8a90b7c0' process. Execution will be retried (attempt #23) in 00:05:00 seconds." ], "@l": "Error", "@x": "MySqlConnector.MySqlException (0x80004005): The Command Timeout expired before the operation completed. ---> MySqlConnector.MySqlException (0x80004005): The Command Timeout expired before the operation completed.\n at MySqlConnector.Protocol.Serialization.BufferedByteReader.ReadBytesAsync(IByteHandler byteHandler, ArraySegment1 buffer, Int32 totalBytesToRead, IOBehavior ioBehavior) in //src/MySqlConnector/Protocol/Serialization/BufferedByteReader.cs:line 36\n at MySqlConnector.Protocol.Serialization.ProtocolUtility.ReadPacketAsync(BufferedByteReader bufferedByteReader, IByteHandler byteHandler, Func`1 getNextSequenceNumber, ProtocolErrorBehavior protocolErrorBehavior, IOBehavior ioBehavior) in //src/MySqlConnector/Protocol/Serialization/ProtocolUtility.cs:line 408\n at MySqlConnector.Protocol.Serialization.ProtocolUtility.DoReadPayloadAsync(BufferedByteReader bufferedByteReader, IByteHandler byteHandler, Func1 getNextSequenceNumber, ArraySegmentHolder1 previousPayloads, ProtocolErrorBehavior protocolErrorBehavior, IOBehavior ioBehavior) in //src/MySqlConnector/Protocol/Serialization/ProtocolUtility.cs:line 466\n at MySqlConnector.Protocol.Serialization.StandardPayloadHandler.ReadPayloadAsync(ArraySegmentHolder`1 cache, ProtocolErrorBehavior protocolErrorBehavior, IOBehavior ioBehavior) in //src/MySqlConnector/Protocol/Serialization/StandardPayloadHandler.cs:line 42\n at MySqlConnector.Core.ServerSession.ReceiveReplyAsync(IOBehavior ioBehavior, CancellationToken cancellationToken) in //src/MySqlConnector/Core/ServerSession.cs:line 765\n--- End of stack trace from previous location where exception was thrown ---\n at MySqlConnector.Core.ServerSession.ReceiveReplyAsyncAwaited(ValueTask`1 task) in //src/MySqlConnector/Core/ServerSession.cs:line 793\n at MySqlConnector.Core.ResultSet.ReadResultSetHeaderAsync(IOBehavior ioBehavior) in //src/MySqlConnector/Core/ResultSet.cs:line 49\n at MySqlConnector.MySqlDataReader.ActivateResultSet() in //src/MySqlConnector/MySqlDataReader.cs:line 115\n at MySqlConnector.MySqlDataReader.CreateAsync(CommandListPosition commandListPosition, ICommandPayloadCreator payloadCreator, IDictionary2 cachedProcedures, IMySqlCommand command, CommandBehavior behavior, IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlDataReader.cs:line 422\n at MySqlConnector.Core.CommandExecutor.ExecuteReaderAsync(IReadOnlyList1 commands, ICommandPayloadCreator payloadCreator, CommandBehavior behavior, IOBehavior ioBehavior, CancellationToken cancellationToken) in //src/MySqlConnector/Core/CommandExecutor.cs:line 61\n at MySqlConnector.MySqlCommand.ExecuteNonQueryAsync(IOBehavior ioBehavior, CancellationToken cancellationToken) in //src/MySqlConnector/MySqlCommand.cs:line 263\n at MySqlConnector.MySqlCommand.ExecuteNonQuery() in /_/src/MySqlConnector/MySqlCommand.cs:line 100\n at Dapper.SqlMapper.ExecuteCommand(IDbConnection cnn, CommandDefinition& command, Action2 paramReader) in C:\\projects\\dapper\\Dapper\\SqlMapper.cs:line 2827\n at Dapper.SqlMapper.ExecuteImpl(IDbConnection cnn, CommandDefinition& command) in C:\\projects\\dapper\\Dapper\\SqlMapper.cs:line 570\n at Hangfire.MySql.JobQueue.MySqlJobQueue.Dequeue(String[] queues, CancellationToken cancellationToken)\n at Hangfire.MySql.MySqlStorageConnection.FetchNextJob(String[] queues, CancellationToken cancellationToken)\n at Hangfire.Server.Worker.Execute(BackgroundProcessContext context)\n at Hangfire.Server.AutomaticRetryProcess.Execute(BackgroundProcessContext context)", "State": "Error occurred during execution of 'Worker #8a90b7c0' process. Execution will be retried (attempt #23) in 00:05:00 seconds.", "SourceContext": "Hangfire.Server.Worker" } Screenshot from 2021-03-29 18-35-47

Any help appreciated

Gheri commented 3 years ago

It seems like it is fetching the job and enqueuing the job but Why then the job is moved to Failed queue if processing of job is failed. If its production environment and clients are waiting for enqueued job, you can try restart server and it might start processing the jobs but issue still has to be fixed.

skabariya commented 3 years ago

It seems like it is fetching the job and enqueuing the job but Why then the job is moved to Failed queue if processing of job is failed. If its production environment and clients are waiting for enqueued job, you can try restart server and it might start processing the jobs but issue still has to be fixed.

@Gheri Thanks for the replay, I already tried this and even after restarting server the same issue occurs.

Forreststruggle commented 3 years ago

I am also having the same problem, I already change to hangfire.storage.mysql. The problem still exist.

kostebudinoski commented 3 years ago

It's still happening for us, with Hangfire version 1.7.25 using redis storage with Hangfire Pro 2.8.10. Jobs got enqueued but never picked up to be processed.

Here's the output of running stdump on the server process:

Stack Trace Dump 0.0.9

Found CLR 5.0.421.11614

Following AppDomains found:

AppDomain #1
  Address:       1433603692432
  Name:          clrhost
  Configuration: n/a
  Directory:     n/a

Following threads found:

Thread #1
  OS Thread ID:      12956
  AppDomain Address: 1433603692432
  Type:              Foreground

  Managed stack trace:
   - [InlinedCallFrame] (Interop+Advapi32.StartServiceCtrlDispatcher) at System.ServiceProcess.ServiceController.dll
   - [InlinedCallFrame] (Interop+Advapi32.StartServiceCtrlDispatcher) at System.ServiceProcess.ServiceController.dll
   - ILStubClass.IL_STUB_PInvoke(IntPtr) at System.Console.dll
   - System.ServiceProcess.ServiceBase.Run(System.ServiceProcess.ServiceBase[]) at System.ServiceProcess.ServiceController.dll
   - System.ServiceProcess.ServiceBase.Run(System.ServiceProcess.ServiceBase) at System.ServiceProcess.ServiceController.dll
   - Topshelf.Runtime.Windows.WindowsServiceHost.Run() at Topshelf.dll
   - Topshelf.HostFactory.Run(System.Action`1<Topshelf.HostConfigurators.HostConfigurator>) at Topshelf.dll
   - App.Program.Main(System.String[]) at App.dll

Thread #2
  OS Thread ID:      6364
  AppDomain Address: 1433603692432
  Type:              Background
  Role:              Finalizer

  No managed stack trace found for this thread.

Thread #3
  OS Thread ID:      6872
  AppDomain Address: 1433603692432
  Type:              Background

  No managed stack trace found for this thread.

Thread #10
  OS Thread ID:      12324
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [HelperMethodFrame] at
   - System.Threading.WaitHandle.WaitOneNoCheck(Int32) at System.Private.CoreLib.dll
   - System.Threading.WaitHandle.WaitOne(System.TimeSpan) at System.Private.CoreLib.dll
   - Hangfire.Pro.Redis.CancellationTokenExtentions.Wait(System.Threading.CancellationToken, System.TimeSpan) at Hangfire.Pro.Redis.dll
   - Hangfire.Pro.Redis.RedisStorage.ExtendLocks() at Hangfire.Pro.Redis.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart() at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #11
  OS Thread ID:      10440
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [HelperMethodFrame] at
   - System.Threading.WaitHandle.WaitMultiple(System.ReadOnlySpan`1<System.Threading.WaitHandle>, Boolean, Int32) at System.Private.CoreLib.dll
   - System.Threading.WaitHandle.WaitMultiple(System.Threading.WaitHandle[], Boolean, Int32) at System.Private.CoreLib.dll
   - Hangfire.Pro.Redis.RedisStorage.ConnectLoop() at Hangfire.Pro.Redis.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart() at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #13
  OS Thread ID:      9596
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [HelperMethodFrame_1OBJ] at
   - StackExchange.Redis.SocketManager.WriteAllQueues() at Hangfire.Pro.Redis.dll
   - StackExchange.Redis.SocketManager+<>c.<.cctor>b__29_0(System.Object) at Hangfire.Pro.Redis.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart(System.Object) at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #22
  OS Thread ID:      10584
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [InlinedCallFrame] (Interop+Winsock.recv) at System.Net.Sockets.dll
   - [InlinedCallFrame] (Interop+Winsock.recv) at System.Net.Sockets.dll
   - ILStubClass.IL_STUB_PInvoke(System.Net.Sockets.SafeSocketHandle, Byte*, Int32, System.Net.Sockets.SocketFlags) at System.Console.dll
   - System.Net.Sockets.SocketPal.Receive(System.Net.Sockets.SafeSocketHandle, System.Span`1<Byte>, System.Net.Sockets.SocketFlags, Int32 ByRef) at System.Net.Sockets.dll
   - System.Net.Sockets.Socket.Receive(System.Span`1<Byte>, System.Net.Sockets.SocketFlags, System.Net.Sockets.SocketError ByRef) at System.Net.Sockets.dll
   - System.Net.Sockets.NetworkStream.Read(System.Span`1<Byte>) at System.Net.Sockets.dll
   - System.Net.Security.SslStream+<ReadAsyncInternal>d__181`1[[System.Net.Security.SyncReadWriteAdapter, System.Net.Security]].MoveNext() at System.Net.Security.dll
   - System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.Net.Security.SslStream+<ReadAsyncInternal>d__181`1[[System.Net.Security.SyncReadWriteAdapter, System.Net.Security]], System.Net.Security]](<ReadAsyncInternal>d__181`1<System.Net.Security.SyncReadWriteAdapter> ByRef) at System.Private.CoreLib.dll
   - System.Net.Security.SslStream.ReadAsyncInternal[[System.Net.Security.SyncReadWriteAdapter, System.Net.Security]](System.Net.Security.SyncReadWriteAdapter, System.Memory`1<Byte>) at System.Net.Security.dll
   - System.Net.Security.SslStream.Read(Byte[], Int32, Int32) at System.Net.Security.dll
   - StackExchange.Redis.PhysicalConnection.StackExchange.Redis.ISocketCallback.Read() at Hangfire.Pro.Redis.dll
   - StackExchange.Redis.SocketManager+<>c.<.cctor>b__29_2(System.Object) at Hangfire.Pro.Redis.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart(System.Object) at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #23
  OS Thread ID:      8328
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [InlinedCallFrame] (Interop+Winsock.recv) at System.Net.Sockets.dll
   - [InlinedCallFrame] (Interop+Winsock.recv) at System.Net.Sockets.dll
   - ILStubClass.IL_STUB_PInvoke(System.Net.Sockets.SafeSocketHandle, Byte*, Int32, System.Net.Sockets.SocketFlags) at System.Console.dll
   - System.Net.Sockets.SocketPal.Receive(System.Net.Sockets.SafeSocketHandle, System.Span`1<Byte>, System.Net.Sockets.SocketFlags, Int32 ByRef) at System.Net.Sockets.dll
   - System.Net.Sockets.Socket.Receive(System.Span`1<Byte>, System.Net.Sockets.SocketFlags, System.Net.Sockets.SocketError ByRef) at System.Net.Sockets.dll
   - System.Net.Sockets.NetworkStream.Read(System.Span`1<Byte>) at System.Net.Sockets.dll
   - System.Net.Security.SslStream+<ReadAsyncInternal>d__181`1[[System.Net.Security.SyncReadWriteAdapter, System.Net.Security]].MoveNext() at System.Net.Security.dll
   - System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.Net.Security.SslStream+<ReadAsyncInternal>d__181`1[[System.Net.Security.SyncReadWriteAdapter, System.Net.Security]], System.Net.Security]](<ReadAsyncInternal>d__181`1<System.Net.Security.SyncReadWriteAdapter> ByRef) at System.Private.CoreLib.dll
   - System.Net.Security.SslStream.ReadAsyncInternal[[System.Net.Security.SyncReadWriteAdapter, System.Net.Security]](System.Net.Security.SyncReadWriteAdapter, System.Memory`1<Byte>) at System.Net.Security.dll
   - System.Net.Security.SslStream.Read(Byte[], Int32, Int32) at System.Net.Security.dll
   - StackExchange.Redis.PhysicalConnection.StackExchange.Redis.ISocketCallback.Read() at Hangfire.Pro.Redis.dll
   - StackExchange.Redis.SocketManager+<>c.<.cctor>b__29_2(System.Object) at Hangfire.Pro.Redis.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart(System.Object) at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #25
  OS Thread ID:      9952
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [HelperMethodFrame_1OBJ] at
   - System.Threading.ManualResetEventSlim.Wait(Int32, System.Threading.CancellationToken) at System.Private.CoreLib.dll
   - System.Threading.Tasks.Task.WaitAllBlockingCore(System.Collections.Generic.List`1<System.Threading.Tasks.Task>, Int32, System.Threading.CancellationToken) at System.Private.CoreLib.dll
   - System.Threading.Tasks.Task.WaitAllCore(System.Threading.Tasks.Task[], Int32, System.Threading.CancellationToken) at System.Private.CoreLib.dll
   - Hangfire.Server.BackgroundServerProcess.WaitForDispatchers(Hangfire.Server.BackgroundServerContext, System.Collections.Generic.IReadOnlyList`1<Hangfire.Processing.IBackgroundDispatcher>) at Hangfire.Core.dll
   - Hangfire.Server.BackgroundServerProcess.Execute(System.Guid, Hangfire.Processing.BackgroundExecution, System.Threading.CancellationToken, System.Threading.CancellationToken, System.Threading.CancellationToken) at Hangfire.Core.dll
   - Hangfire.Server.BackgroundProcessingServer.RunServer(System.Guid, System.Object) at Hangfire.Core.dll
   - Hangfire.Processing.BackgroundExecution.Run(System.Action`2<System.Guid,System.Object>, System.Object) at Hangfire.Core.dll
   - Hangfire.Processing.BackgroundDispatcher.DispatchLoop() at Hangfire.Core.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart() at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #26
  OS Thread ID:      5460
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [HelperMethodFrame] at
   - System.Threading.WaitHandle.WaitOneNoCheck(Int32) at System.Private.CoreLib.dll
   - System.Threading.WaitHandle.WaitOne(System.TimeSpan) at System.Private.CoreLib.dll
   - Hangfire.Common.CancellationTokenExtentions.Wait(System.Threading.CancellationToken, System.TimeSpan) at Hangfire.Core.dll
   - Hangfire.Server.ServerHeartbeatProcess.Execute(Hangfire.Server.BackgroundProcessContext) at Hangfire.Core.dll
   - Hangfire.Server.BackgroundProcessDispatcherBuilder.ExecuteProcess(System.Guid, System.Object) at Hangfire.Core.dll
   - Hangfire.Processing.BackgroundExecution.Run(System.Action`2<System.Guid,System.Object>, System.Object) at Hangfire.Core.dll
   - Hangfire.Processing.BackgroundDispatcher.DispatchLoop() at Hangfire.Core.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart() at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #27
  OS Thread ID:      12292
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [HelperMethodFrame] at
   - System.Threading.WaitHandle.WaitOneNoCheck(Int32) at System.Private.CoreLib.dll
   - System.Threading.WaitHandle.WaitOne(System.TimeSpan) at System.Private.CoreLib.dll
   - Hangfire.Common.CancellationTokenExtentions.Wait(System.Threading.CancellationToken, System.TimeSpan) at Hangfire.Core.dll
   - Hangfire.Server.ServerWatchdog.Execute(Hangfire.Server.BackgroundProcessContext) at Hangfire.Core.dll
   - Hangfire.Server.BackgroundProcessDispatcherBuilder.ExecuteProcess(System.Guid, System.Object) at Hangfire.Core.dll
   - Hangfire.Processing.BackgroundExecution.Run(System.Action`2<System.Guid,System.Object>, System.Object) at Hangfire.Core.dll
   - Hangfire.Processing.BackgroundDispatcher.DispatchLoop() at Hangfire.Core.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart() at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #28
  OS Thread ID:      2032
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [HelperMethodFrame] at
   - System.Threading.WaitHandle.WaitOneNoCheck(Int32) at System.Private.CoreLib.dll
   - System.Threading.WaitHandle.WaitOne(System.TimeSpan) at System.Private.CoreLib.dll
   - Hangfire.Common.CancellationTokenExtentions.Wait(System.Threading.CancellationToken, System.TimeSpan) at Hangfire.Core.dll
   - Hangfire.Server.ServerJobCancellationWatcher.Execute(Hangfire.Server.BackgroundProcessContext) at Hangfire.Core.dll
   - Hangfire.Server.BackgroundProcessDispatcherBuilder.ExecuteProcess(System.Guid, System.Object) at Hangfire.Core.dll
   - Hangfire.Processing.BackgroundExecution.Run(System.Action`2<System.Guid,System.Object>, System.Object) at Hangfire.Core.dll
   - Hangfire.Processing.BackgroundDispatcher.DispatchLoop() at Hangfire.Core.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart() at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #29
  OS Thread ID:      5080
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [HelperMethodFrame] at
   - System.Threading.WaitHandle.WaitOneNoCheck(Int32) at System.Private.CoreLib.dll
   - System.Threading.WaitHandle.WaitOne(System.TimeSpan) at System.Private.CoreLib.dll
   - Hangfire.Pro.Redis.CancellationTokenExtentions.Wait(System.Threading.CancellationToken, System.TimeSpan) at Hangfire.Pro.Redis.dll
   - Hangfire.Pro.Redis.FetchedJobsWatcher.Execute(System.Threading.CancellationToken) at Hangfire.Pro.Redis.dll
   - Hangfire.Processing.BackgroundExecution.Run(System.Action`2<System.Guid,System.Object>, System.Object) at Hangfire.Core.dll
   - Hangfire.Processing.BackgroundDispatcher.DispatchLoop() at Hangfire.Core.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart() at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #30
  OS Thread ID:      10204
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [HelperMethodFrame_1OBJ] at
   - System.Threading.SemaphoreSlim.WaitUntilCountOrTimeout(Int32, UInt32, System.Threading.CancellationToken) at System.Private.CoreLib.dll
   - System.Threading.SemaphoreSlim.Wait(Int32, System.Threading.CancellationToken) at System.Private.CoreLib.dll
   - Hangfire.Pro.Redis.RedisConnection.FetchNextJob(System.String[], System.Threading.CancellationToken) at Hangfire.Pro.Redis.dll
   - Hangfire.Server.Worker.Execute(Hangfire.Server.BackgroundProcessContext) at Hangfire.Core.dll
   - Hangfire.Server.BackgroundProcessDispatcherBuilder.ExecuteProcess(System.Guid, System.Object) at Hangfire.Core.dll
   - Hangfire.Processing.BackgroundExecution.Run(System.Action`2<System.Guid,System.Object>, System.Object) at Hangfire.Core.dll
   - Hangfire.Processing.BackgroundDispatcher.DispatchLoop() at Hangfire.Core.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart() at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #31
  OS Thread ID:      11088
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [HelperMethodFrame] at
   - System.Threading.WaitHandle.WaitMultiple(System.ReadOnlySpan`1<System.Threading.WaitHandle>, Boolean, Int32) at System.Private.CoreLib.dll
   - System.Threading.WaitHandle.WaitMultiple(System.Threading.WaitHandle[], Boolean, Int32) at System.Private.CoreLib.dll
   - Hangfire.Pro.Redis.RedisConnection.TrySubscriptionBasedFetch(System.String[], System.Threading.CancellationToken) at Hangfire.Pro.Redis.dll
   - Hangfire.Pro.Redis.RedisConnection.FetchNextJob(System.String[], System.Threading.CancellationToken) at Hangfire.Pro.Redis.dll
   - Hangfire.Server.Worker.Execute(Hangfire.Server.BackgroundProcessContext) at Hangfire.Core.dll
   - Hangfire.Server.BackgroundProcessDispatcherBuilder.ExecuteProcess(System.Guid, System.Object) at Hangfire.Core.dll
   - Hangfire.Processing.BackgroundExecution.Run(System.Action`2<System.Guid,System.Object>, System.Object) at Hangfire.Core.dll
   - Hangfire.Processing.BackgroundDispatcher.DispatchLoop() at Hangfire.Core.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart() at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #32
  OS Thread ID:      12256
  AppDomain Address: 1433603692432
  Type:              Background

  No managed stack trace found for this thread.

Thread #34
  OS Thread ID:      9048
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [InlinedCallFrame] at
   - [InlinedCallFrame] at
   - ILStubClass.IL_STUB_PInvoke(Grpc.Core.Internal.CompletionQueueSafeHandle) at System.Console.dll
   - Grpc.Core.Internal.GrpcThreadPool.RunHandlerLoop(Grpc.Core.Internal.CompletionQueueSafeHandle, Grpc.Core.Profiling.IProfiler) at Grpc.Core.dll
   - Grpc.Core.Internal.GrpcThreadPool+<>c__DisplayClass22_0.<CreateAndStartThread>b__0() at Grpc.Core.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart() at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #35
  OS Thread ID:      9976
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [InlinedCallFrame] at
   - [InlinedCallFrame] at
   - ILStubClass.IL_STUB_PInvoke(Grpc.Core.Internal.CompletionQueueSafeHandle) at System.Console.dll
   - Grpc.Core.Internal.GrpcThreadPool.RunHandlerLoop(Grpc.Core.Internal.CompletionQueueSafeHandle, Grpc.Core.Profiling.IProfiler) at Grpc.Core.dll
   - Grpc.Core.Internal.GrpcThreadPool+<>c__DisplayClass22_0.<CreateAndStartThread>b__0() at Grpc.Core.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart() at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #36
  OS Thread ID:      9652
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [InlinedCallFrame] at
   - [InlinedCallFrame] at
   - ILStubClass.IL_STUB_PInvoke(Grpc.Core.Internal.CompletionQueueSafeHandle) at System.Console.dll
   - Grpc.Core.Internal.GrpcThreadPool.RunHandlerLoop(Grpc.Core.Internal.CompletionQueueSafeHandle, Grpc.Core.Profiling.IProfiler) at Grpc.Core.dll
   - Grpc.Core.Internal.GrpcThreadPool+<>c__DisplayClass22_0.<CreateAndStartThread>b__0() at Grpc.Core.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart() at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #37
  OS Thread ID:      6092
  AppDomain Address: 1433603692432
  Type:              Background

  Managed stack trace:
   - [InlinedCallFrame] at
   - [InlinedCallFrame] at
   - ILStubClass.IL_STUB_PInvoke(Grpc.Core.Internal.CompletionQueueSafeHandle) at System.Console.dll
   - Grpc.Core.Internal.GrpcThreadPool.RunHandlerLoop(Grpc.Core.Internal.CompletionQueueSafeHandle, Grpc.Core.Profiling.IProfiler) at Grpc.Core.dll
   - Grpc.Core.Internal.GrpcThreadPool+<>c__DisplayClass22_0.<CreateAndStartThread>b__0() at Grpc.Core.dll
   - System.Threading.ThreadHelper.ThreadStart_Context(System.Object) at System.Private.CoreLib.dll
   - System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Private.CoreLib.dll
   - System.Threading.ThreadHelper.ThreadStart() at System.Private.CoreLib.dll
   - [DebuggerU2MCatchHandlerFrame] at

Thread #39
  OS Thread ID:      848
  AppDomain Address: 1433603692432
  Type:              Background

  No managed stack trace found for this thread.

Thread #14
  OS Thread ID:      4916
  AppDomain Address: 1433603692432
  Type:              Background

  No managed stack trace found for this thread.

Thread #62
  OS Thread ID:      5384
  AppDomain Address: 1433603692432
  Type:              Background

  No managed stack trace found for this thread.

Thread #61
  OS Thread ID:      3680
  AppDomain Address: 1433603692432
  Type:              Background

  No managed stack trace found for this thread.

Thread #70
  OS Thread ID:      9728
  AppDomain Address: 1433603692432
  Type:              Background

  No managed stack trace found for this thread.

Thread #57
  OS Thread ID:      8980
  AppDomain Address: 1433603692432
  Type:              Background

  No managed stack trace found for this thread.

Thread #56
  OS Thread ID:      7872
  AppDomain Address: 1433603692432
  Type:              Background

  No managed stack trace found for this thread.

image

EIDT: it's strange, if I click on the requeue button, the job will be processed, but all future enqueue jobs are pending again in this state.

kostebudinoski commented 3 years ago

Any support would be highly appreciated @odinserj

odinserj commented 3 years ago

Hm, looks very strange, especially when everything is restored after Requeue button is pushed. I need to understand what happens – it's either publish event is not recognised or queue doesn't contain the background job identifier. Is it possible to run the following command (where default is the queue name and hangfire: is the configured prefix)?

lrange hangfire:queue:default 0 -1

If it returns an empty list then it's either key name is misspelled or background job identifier is not in the queue. If not, then something strange happens with event publication.

Please also send me your configuration code related to Hangfire and the output of the INFO command issued to Redis to also understand your Redis configuration.

kostebudinoski commented 3 years ago

Thanks for your reply!

Here are for example two enqueued jobs in the dashboard:

image

and I can see them in the Redis storage with the proper *queue key:

image

and here is an example, that the jobs get processed as soon as I click on the Requeue button (at this point both pending jobs were processed, although I clicked Requeue on just one of the pending one):

image

Will send you next, during the day, the configuration code related to Hangfire. Please share any input you have so far.

kostebudinoski commented 3 years ago

Here is the configuration code related to hangfire (we use Autofac DI container):

  builder.Register(x =>
  {
      var configuration = x.Resolve<IRedisConfiguration>();

      return (JobStorage)new RedisStorage(configuration.ConfigurationOptionsString, new RedisStorageOptions
      {
          Prefix = "OurPrefixJobs",
          InvisibilityTimeout = configuration.InvisibilityTimeout
      });
  })
  .As<JobStorage>()
  .SingleInstance();

  builder.RegisterType<HangfireJobActivator>().As<JobActivator>().SingleInstance();

  builder.Register(
          x =>
          {
              var configuration = x.Resolve<IBackgroundProcessingConfiguration>();

              return new WorkersOnlyBackgroundJobServer(
                  new BackgroundJobServerOptions
                  {
                      Queues = new[] { QueueNames.OurQueueName },
                      WorkerCount = configuration.WorkerCount,
                      ServerTimeout = TimeSpan.FromHours(1.0 * 24.0),
                      Activator = x.Resolve<JobActivator>()
                  },
                  x.Resolve<JobStorage>());
          })
      .As<WorkersOnlyBackgroundJobServer>()
      .SingleInstance();

Here is the HanfgireJobActivator used in the DI registrations:

public class HangfireJobActivator : JobActivator
{
    private readonly ILifetimeScope _lifetimeScope;

    public HangfireJobActivator(ILifetimeScope lifetimeScope)
    {
        _lifetimeScope = lifetimeScope;
    }

    public override object ActivateJob(Type type)
    {
        return _lifetimeScope.Resolve(type);
    }

    public override JobActivatorScope BeginScope(JobActivatorContext context)
    {
        return new HangfireJobActivatorScope(_lifetimeScope.BeginLifetimeScope());
    }
}

public class HangfireJobActivatorScope : JobActivatorScope
{
    private readonly ILifetimeScope _scope;

    public HangfireJobActivatorScope(ILifetimeScope scope)
    {
        _scope = scope;
    }

    public override object Resolve(Type type)
    {
        return _scope.Resolve(type);
    }

    public override void DisposeScope()
    {
        _scope.Dispose();
    }
}

Here is the WorkersOnlyBackgroundJobServer used in the DI registrations:

 public class WorkersOnlyBackgroundJobServer : IDisposable
  {
      private static readonly ILog Logger = LogProvider.For<BackgroundJobServer>();
      private readonly BackgroundProcessingServer _processingServer;

      public WorkersOnlyBackgroundJobServer(BackgroundJobServerOptions options, JobStorage storage)
      {
          Logger.Info("Starting Hangfire Server");
          Logger.Info($"Using job storage: '{storage}'");
          storage.WriteOptionsToLog(Logger);
          Logger.Info("Using the following options for Hangfire Server:");
          Logger.Info($"    Worker count: {options.WorkerCount}");
          Logger.Info($"    Listening queues: {string.Join(", ", options.Queues.Select(x => "'" + x + "'"))}");
          Logger.Info($"    Shutdown timeout: {options.ShutdownTimeout}");
          Logger.Info($"    Schedule polling interval: {options.SchedulePollingInterval}");

          _processingServer = new BackgroundProcessingServer(
              storage,
              GetRequiredProcesses(options),
              new Dictionary<string, object>
              {
                  {"Queues", options.Queues},
                  {"WorkerCount", options.WorkerCount}
              },
              new BackgroundProcessingServerOptions
              {
                  ShutdownTimeout = options.ShutdownTimeout,
                  HeartbeatInterval = options.HeartbeatInterval,
                  ServerCheckInterval = options.ServerCheckInterval,
                  ServerTimeout = options.ServerTimeout,
                  ServerName = options.ServerName
              });
      }

      public void SendStop()
      {
          Logger.Debug("Hangfire Server is stopping...");

          _processingServer.SendStop();
      }

      public void Dispose()
      {
          _processingServer.Dispose();

          Logger.Info("Hangfire Server stopped.");
      }

      private static IEnumerable<IBackgroundProcess> GetRequiredProcesses(BackgroundJobServerOptions options)
      {
          var backgroundProcessList = new List<IBackgroundProcess>();
          var filterProvider = options.FilterProvider ?? JobFilterProviders.Providers;

          var backgroundJobPerformer = new BackgroundJobPerformer(filterProvider, options.Activator ?? JobActivator.Current);
          var backgroundJobStateChanger = new BackgroundJobStateChanger(filterProvider);

          for (var index = 0; index < options.WorkerCount; ++index)
          {
              backgroundProcessList.Add(new Worker(options.Queues, backgroundJobPerformer, backgroundJobStateChanger));
          }

          return backgroundProcessList;
      }
  }

Later, the client just enqueues the job to the given queue:

_backgroundJobClient.Enqueue<IOurParsingJob>(x => x.Process(new OurJobContext
  {
      // props here
  }));

Where BackgroundJobClient is single instance and yes the job interface is the same (reused via common package).

We use single Redis instance (no cluster).

Used packages in the processing worker:

 <PackageReference Include="Hangfire.Core" Version="1.7.25" />
 <PackageReference Include="Hangfire.Pro.Redis" Version="2.8.10" />
kostebudinoski commented 3 years ago

@odinserj safe yourself some time reading all this :)

I checked the client that enqueues the jobs now and figured it out where the problem is:

<PackageReference Include="Hangfire.Redis.StackExchange" Version="1.8.4" />

just updated to

<PackageReference Include="Hangfire.Pro.Redis" Version="2.8.10" />

and the issue is gone :)

Thanks for your time and the great package! And sorry for the confusion.

odinserj commented 3 years ago

Oh, Hangfire.Redis.StackExchange and Hangfire.Pro.Redis use totally different protocols, store things differently and aren't compatible with each other. So you are right, that's the actual reason, thanks for sharing!

brunoravera commented 2 years ago

Just in case some still facing this, I had a similar issue but my problem was that I didn't defined my queues names in the startup. Adding this solved the problem

app.UseHangfireServer(new BackgroundJobServerOptions() { Queues = new[] { "emails_queue" }, WorkerCount = 20 });

odinserj commented 2 years ago

I'm closing and locking this issue now, because almost any kind of problem in background processing will lead to the symptom "Jobs are enqueued but not processing", and more specified details required.