ravendb / quartznet-RavenDB

RavenDB JobStore support for Quartz.NET scheduler.
Apache License 2.0
27 stars 19 forks source link

Jobs don't always start at the right time #22

Closed cadilhac closed 2 years ago

cadilhac commented 2 years ago

I recently noticed that when creating triggers with a StartNow() or StartAt() option, then, more often than not, the job is not started at the right time. For instance, I have a "send email" job configured to start in 2 seconds. Checking in the debugger, I see that it starts 25 seconds after the call. Now, if I use Quartz in memory instead of using a ravendb store, then jobs perfectly start at the right time.

Any idea about this mystery? @ayende ?

ayende commented 2 years ago

Maybe there is some delay in how often it checks for tasks? The refresh rate?

cadilhac commented 2 years ago

You mean quartz.net? Unsure, because sometimes a StartNow is done instantly, sometimes it waits far too many seconds...

ayende commented 2 years ago

Let's start from the basics. If you do this test with the traffic watch active, what do you observe?

cadilhac commented 2 years ago

Here is a screeshot of the traffic log. The first 8 rows appear when the job is created (this is a job that is set to start in 2 seconds). The next rows appear 25 seconds later.

timing

The job is PUT at 11:34:48. The time in the next trigger query is set to 30 seconds later at 11:35:18. Why? I don't know. Then at 11:35:13, i.e. 25 seconds later, it goes live again and the job is run. Does it make sense to you?

cadilhac commented 2 years ago

I made further tests and something is beyond my understanding. In RavenJobStore.AcquireNextTriggers, there is:

var triggersQuery = await session
    .Query<Trigger>()
    .Where(t =>
        t.State == InternalTriggerState.Waiting &&
        t.NextFireTimeUtc <= (noLaterThan + timeWindow).UtcDateTime)
    .OrderBy(t => t.NextFireTimeTicks)
    .ThenByDescending(t => t.Priority)
    .ToListAsync(cancellationToken);

I noticed that it was returning empty even if there was a trigger with a NextFireTimeUtc lesser than the (noLaterThan + timeWindow).UtcDateTime. So I added a test query like this:

var testQuery = (await session.Query<Trigger>().ToListAsync())
    .Where(t => (t.State == InternalTriggerState.Waiting) && (t.NextFireTimeUtc <= maxTime))
    .ToList();

This one returns correctly my trigger. So something bad happens with the comparison of dates on the ravendb server...

Many seconds later, when the method is fired again, then the dates comparison works and the trigger is returned, but late.

cadilhac commented 2 years ago

I think I get it. After the trigger is initially created and saved, the automatic index has not the time to update before the query is made. I found this out by adding an await Task.Delay(2000); before the query is made. Therefore I added some session.Advanced.WaitForIndexesAfterSaveChanges calls in each StoreJob/Trigger method and it fixed the issue.

ayende commented 2 years ago

Okay, that make sense, then. Thanks for figuring it out.