kelektiv / node-cron

Cron for NodeJS.
MIT License
8.3k stars 617 forks source link

"Error: WARNING: Date in past. Will never be fired." crash inside callback wrapper #813

Open Cyberbeni opened 6 months ago

Cyberbeni commented 6 months ago

Description

I was using 2.2.0 but cross-checking the code with 3.1.6, this issue still exists.

I have a daily schedule to set up triggers for the day: https://github.com/Cyberbeni/node-red-contrib-cb-suncron/blob/3b3441c95a0fb4ab2356e906c810d3ef0e8de80f/src/SuncronLocation.ts#L29-L34

Setting up the jobs looks like this: https://github.com/Cyberbeni/node-red-contrib-cb-suncron/blob/3b3441c95a0fb4ab2356e906c810d3ef0e8de80f/src/Suncron.ts#L44-L59

I had this Node-RED crash today at the time that the cron job should have fired (and probably had multiple similar ones in the past as I had some skipped schedules):

9 Dec 15:49:24 - [red] Uncaught Exception:
9 Dec 15:49:24 - [error] Error: WARNING: Date in past. Will never be fired.
    at CT.sendAt (/data/node_modules/cron/lib/time.js:177:12)
    at CT.getTimeout (/data/node_modules/cron/lib/time.js:202:29)
    at Timeout.callbackWrapper [as _onTimeout] (/data/node_modules/cron/lib/job.js:141:36)
    at listOnTimeout (node:internal/timers:559:17)
    at processTimers (node:internal/timers:502:7)

Expected Behavior

If diff is more than 0 but newTimeout would be negative (because slow hardware or whatever), we shouldn't throw an error that can't be handled.

Actual Behavior

Occasional crash with the included call stack.

Possible Fix

Treat this.cronTime.getTimeout() throwing an error as if diff was not more than 0.

Steps to Reproduce

Remove if (diff > 0) check and insert a 1 second sleep before this line to imitate slow hardware:

https://github.com/kelektiv/node-cron/blob/d6143937d8774d8e4f5dcf3fcb86b296e2811c29/src/job.ts#L235

Then just setup a job with a single Date and an onTick callback.

Context

see description

Your Environment

nguyenvyy commented 5 months ago

I face this issue too, with the difference between luxon.DateTime.local() and the date being 1 millisecond. cron version: 2.3.1 image

sheerlox commented 5 months ago

What would be the best behavior in your opinion?

If your software is running on under-sized hardware / having bottlenecks in the implementation, and the planned execution date is missed, should we try to catch back (i.e. execute the tick ASAP)?

Some use cases might not go well with this, and having this error thrown indicates that to maintain consistency in your schedule, you should either upgrade the hardware you're running on or address potential bottlenecks in your software.

Let's get the discussion going on this, please let us know your thoughts.

Cyberbeni commented 5 months ago

My hardware is at 4% CPU usage at all times. From the mdn web docs, I see that the delay passed to setTimeout is a minimum, so if I set a date less than 24 hours away then diff should never be positive, yet it is.

Executing instantly instead of throwing an uncatchable error would be preferred in my case.

Cyberbeni commented 5 months ago

Is there a reason why we are calculating diff instead of just checking remaining > 0 in the first place?

sheerlox commented 5 months ago

Is there a reason why we are calculating diff instead of just checking remaining > 0 in the first place?

From https://github.com/kelektiv/node-cron/pull/259:

This patch prevents onTick() to be called twice if setTimeout callback was called earlier than expected due to time adjustment. It just measures how much time passed since we set timeout and if it's less than it should it reschedules onTick call.


Executing instantly instead of throwing an uncatchable error would be preferred in my case.

A solution I might think of would be to add a new parameter tolerableDelay (in ms) defaulting to 0 (no breaking change), which would be used somewhat like this:

if (this.realDate) { 
  if (DateTime.local() > date) { 
    if (DateTime.local() > date + tolerableDelay) { 
      throw new CronError('WARNING: Date in past. Will never be fired. Try adjusting the tolerableDelay parameter.'); 
    }

    return date + tolerableDelay;
  }

  return date;
}

This way you could control how much you want to allow executing jobs that missed the execution date.

Does that sound like an acceptable solution to your issue?

Implementation note: we would also need to apply that parameter in other places so it applies to regular cron jobs as well.

Cyberbeni commented 5 months ago

In case of tolerableDelay = Infinity, that results in getTimeout also returning Infinity.

It would fix my issue but it wouldn't fix people randomly running into this issue.

If setTimeout fires after the target date then that already allows infinite delays. It's only in the case of the second measurement inside the same runloop run being negative, while the first being positive when a crash happens instead of just a couple millisecond delay. If I understand how JS+this npm package works currently, let's say we set a cron job to trigger every second and the code being executed by the trigger takes 10 seconds then we constantly get 9-10 second delays and a lot of triggers are getting skipped.

So I think executing instantly in this case is still the best option. (And also most consistent with how things currently work)