getsentry / sentry-dotnet

Sentry SDK for .NET
https://docs.sentry.io/platforms/dotnet
MIT License

Crons: Sentry.Hangfire not reporting finished jobs and reporting jobs that were not started #3436

Open jhartmann123 opened 1 week ago

jhartmann123 commented 1 week ago

### Package

Other

### .NET Flavor

.NET Core

### .NET Version

8.0.5

### OS

Any (not platform specific)

### SDK Version

4.7.0

### Self-Hosted Sentry Version

No response

### Steps to Reproduce

  1. Create a Cron called "its-a-test", set the timeout to 1 minute
  2. Repro:

    SentrySdk.Init(options =>
    {
        options.Dsn = "<DSN>";
        options.Environment = "development";
    });

    GlobalConfiguration.Configuration
        .UseSentry()
        .UseInMemoryStorage()
        .UseColouredConsoleLogProvider();

    var server = new BackgroundJobServer();
    await Task.Delay(100); // Otherwise the job does not get enqueued

    // Enqueue the same job twice; the second one cannot acquire the lock
    BackgroundJob.Enqueue<FancyTest>(t => t.Test());
    BackgroundJob.Enqueue<FancyTest>(t => t.Test());

    await Task.Delay(TimeSpan.FromMinutes(1));

    server.Dispose();

    public class FancyTest
    {
        [DisableConcurrentExecution(timeoutInSeconds: 0)]
        [AutomaticRetry(Attempts = 0, OnAttemptsExceeded = AttemptsExceededAction.Delete)]
        [SentryMonitorSlug("its-a-test")]
        public Task Test()
        {
            Console.WriteLine($"{DateTime.UtcNow} I'm a test");
            return Task.Delay(TimeSpan.FromSeconds(15));
        }
    }

Note that the second enqueued job does not start because of the `DisableConcurrentExecution` attribute; a `DistributedLockTimeoutException` is thrown instead. It also does not appear as failed in Hangfire because of `AttemptsExceededAction.Delete`.

### Expected Result

- The first Hangfire job should be reported as "OK" in Crons
- The second Hangfire job should not be reported at all, or, alternatively, I should be able to configure some kind of Exception filter, where I can deem the `DistributedLockTimeoutException` as OK.

Background:  
We have multiple recurring jobs that run every 15 minutes but can take longer than that to finish. They should not run in parallel (which is handled with the `DisableConcurrentExecution` attribute), and it's fine if one execution is missed (-> `AttemptsExceededAction.Delete`). We only want to be alerted if a running job fails, runs for too long, or does not start at all.
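
Until the integration handles this case, one possible stopgap is to drop `[SentryMonitorSlug]` and send the check-ins by hand from inside the job method. The following is only a sketch and has not been confirmed by the maintainers. It assumes that `DisableConcurrentExecution` acquires its lock before the job method is invoked, so a run that is skipped because of the lock never reaches the method body and sends no check-in. The class name `FancyTestWithManualCheckIns` is made up for illustration; the slug is the one from the repro.

```csharp
using System;
using System.Threading.Tasks;
using Hangfire;
using Sentry;

public class FancyTestWithManualCheckIns // hypothetical name, for illustration only
{
    // No [SentryMonitorSlug] here: check-ins are sent manually instead.
    [DisableConcurrentExecution(timeoutInSeconds: 0)]
    [AutomaticRetry(Attempts = 0, OnAttemptsExceeded = AttemptsExceededAction.Delete)]
    public async Task Test()
    {
        // Assumption: this line is reached only when the lock was acquired,
        // so a run skipped due to the lock sends no check-in at all.
        var checkInId = SentrySdk.CaptureCheckIn("its-a-test", CheckInStatus.InProgress);
        try
        {
            Console.WriteLine($"{DateTime.UtcNow} I'm a test");
            await Task.Delay(TimeSpan.FromSeconds(15));

            // Close the same check-in as successful.
            SentrySdk.CaptureCheckIn("its-a-test", CheckInStatus.Ok, checkInId);
        }
        catch
        {
            // Close the same check-in as failed, then let Hangfire see the exception.
            SentrySdk.CaptureCheckIn("its-a-test", CheckInStatus.Error, checkInId);
            throw;
        }
    }
}
```

With this shape, a skipped execution produces no "In Progress" entry that could later time out. Depending on the monitor's schedule and margins, Sentry may still flag the skipped slot as a missed check-in.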

### Actual Result

Both background jobs get reported as "In Progress" and eventually time out. Neither of them goes to "Error" or "Ok".

This issue is probably related to https://github.com/getsentry/sentry-dotnet/issues/3262

bitsandfoxes commented 1 week ago

Thanks for the detailed repro! It'll help out a bunch fixing this!