Open onetsmr opened 3 years ago
Thanks for the screenshots and profiling. Please note that Hangfire.Pro.Redis doesn't use Pipelines.Sockets.Unofficial, because based on the internalised custom branch of StackExchange.Redis 1.2.6 (so it doesn't matter what version is specified in the project itself). Also, CancellationTokenExtensions.Wait
method is based on WaitHandle.WaitOne
method that doesn't consume the CPU itself (despite shown in the profiler).
I'm more interested in those Thread Pool threads as they show a lot of activities. Is it still possible to examine their top methods and send some screenshots?
Also could you please show inner details for the following call?
We downloaded and used Hagfire.Pro.Redis, not Hangfire.Pro.Redis.SEv2. I will try SEv2.
Thread Pools has 77% native code
RunWorkLoop also has calling ntdll
This is IIS 10.0 running on 64bit Windows Server 2016. Web service .net framework 4.6
No, please don't use SEv2 – it's based on StackExchange.Redis 2.X that may be the reason of the issue with high CPU. Hangfire.Pro.Redis is based on the previous version of Redis client that doesn't use Sockets.Pipelines.Unifficial so at least we can isolate things.
Could you tell me what does your recurring job do – I assume it talks with Redis, right?
Recurring job runs a few child jobs (10-100) based on tenants from the sql database. Each child job goes to the sql database, takes 10 records, processes them and re run itself while database has rows for processing. Both types of jobs uses RedLockNet for business locks. And hangfire and RedLockNet goes to the same redis cluster (3 nodes), but uses different keys.
As I understand, I should try to remove things one by one and watch how the situation has changed?
Clean recurring job - cpu 1-2% Clean recurring job with redlock - cpu 1-2%
Looks like the reason is not in Hangfire. Sorry.
What if you add some delay after acquiring and before releasing it? Please note it's better to use async Task.Delay or sync Thread.Sleep, depending on the code that's being replaced.
Will appreciate if you can share details for those native calls from profiler (that unfolded fat arrow in the bottom tree on the screenshot related to native calls in thread pool threads looks very interesting). Let's investigate this issue, it looks really strange, since all you have is SQL calls – by the way do you mean SQL Server or MySQL?
What if you add some delay after acquiring and before releasing it?
I've tried use BackgroundJob.Schedule(..., TimeSpan.FromSecond(1)) - didn't help
MySQL, okay, I will investigate something myself and return back with questions or answers.
Thank you.
The problem was in our code - there is a trotting module with semaphores and they are not disposed correctly.
Hi,
We use Hangfire 1.7.27 + Hagfire.Pro 2.3.0.0 + Hagfire.Pro.Redis 2.8.12.0
We are facing 100% CPU usage while hangfire is not processing any jobs after completing a recurring job
How does this happen:
I've collected trace at this moment on the server using dotTrace RunWorkLoop in DedicatedThreadPoolPipeScheduler takes 50% time Wait in CancellationTokenExtentions takes 20% time
This is RunWorkLoop code https://github.com/mgravell/Pipelines.Sockets.Unofficial/blob/19b141f30fe2485dd79993be0ac87a9fc6b34b74/src/Pipelines.Sockets.Unofficial/DedicatedThreadPoolPipeScheduler.cs#L166