marcoCasamento / Hangfire.Redis.StackExchange

HangFire Redis storage based on original (and now unsupported) Hangfire.Redis but using lovely StackExchange.Redis client
Other
448 stars 108 forks source link

ObjectDisposedException is thrown after restarting Hangfire #112

Closed hankovich closed 4 months ago

hankovich commented 2 years ago

I see a lot of ObjectDisposedException in my logs after restarting Hangfire server like

await BackgroundJobServerHostedService.StopAsync();
await BackgroundJobServerHostedService.StartAsync();

While they do not block the execution of scheduled jobs, they increase latency dramatically (from a couple of milliseconds to 3-5 seconds).

dbug: Hangfire.Processing.BackgroundExecution[0]
      Execution loop Worker:08c21f76 caught an exception and will be retried in 00:05:00
      System.ObjectDisposedException: Safe handle has been closed.
      Object name: 'SafeHandle'.
         at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
         at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
         at Interop.Kernel32.ResetEvent(SafeWaitHandle handle)
         at System.Threading.EventWaitHandle.Reset()
         at Hangfire.Redis.RedisSubscription.WaitForJob(TimeSpan timeout, CancellationToken cancellationToken)
         at Hangfire.Redis.RedisConnection.FetchNextJob(String[] queues, CancellationToken cancellationToken)
         at Hangfire.Server.Worker.Execute(BackgroundProcessContext context)
         at Hangfire.Server.BackgroundProcessDispatcherBuilder.ExecuteProcess(Guid executionId, Object state)
         at Hangfire.Processing.BackgroundExecution.Run(Action`2 callback, Object state)
seinti commented 2 years ago

Hello @hankovich. Were you able to resolve this issue?

hankovich commented 2 years ago

Nope. Moreover, after updating to .NET 6, we found an insane increase in latency (up to 1 minute and even more).

Don't know if it's a problem with the Hangfire itself, with StackExchange.Redis or with Hangfire.Redis.StackExchange

marcoCasamento commented 2 years ago

The increase in latency is interesting, I suspect the exact latency is 3 minutes, which is the allowed max time to wait for a new notification from Redis pub/sub to arrive. This library uses redis pub/sub mechanism to be notified when a new job get created, so to schedule it for execution as soon as possible. It seems that in your environment the subscription handler somehow get disposed, and everything fallback to a very slow polling with 3 minutes interval. Do you have a repro ? Besides that, which version of redis are you running ?

hankovich commented 2 years ago

Well, the problem with insane increase of latency which I mentioned here (https://github.com/marcoCasamento/Hangfire.Redis.StackExchange/issues/112#issuecomment-1095164688) was not caused by Hangfire.Redis.StackExchange. It was our change, we decided to add prefixes to all redis keys and implemented a decorator on top of IConnectionMultiplexer which simply calls underlying IConnectionMultiplexer, but implements GetDatabase like

public IDatabase GetDatabase(int db = -1, object? asyncState = null) =>
    _connectionMultiplexer
        .GetDatabase(db, asyncState)
        .WithKeyPrefix(_prefix); // build-in in StackEcxhange.Redis

And it appeared that a wrapped instance from WithKeyPrefix is much slower rather than an original instance from GetDatabase. We tried to cache them, but the problem is with the DatabaseWrapper itself.

But the original problem still exists. After restaring Hangfire server latency increases dramatically. I'll prepare a repro for that case

marcoCasamento commented 2 years ago

I don't know "WithKeyPrefix" but I assume it put a prefix on each key it uses, probably to keep all your keys in a single node of a redis cluster, but have you tried Prefixof RedisStorageOptions ? It serves this exact purpose.

wjl476146955 commented 1 year ago

hi,I met the same error too. and found the RedisSubscription's execute method is the issue point. Because the hangfire storage is a single pattern, and RedisSubscription is only init once in RedisStorage ctor. so when restart the hangfire server maunal or the code can't connect to the redis till the error time exceed the BackgroundJobServerOptions'ServerTimeout the hangfireserver will auto restart,then the heartcheckprocess will send the CancellationRequested so the mre will be disposed ,but the Storage will not be reinited because it is a single pattern.you can found this code in HangfireServiceCollectionExtensions