kelektiv / node-cron

Cron for NodeJS.
MIT License
8.41k stars, 621 forks

cron job stops after certain hours. #232

Closed mek-omkar closed 6 years ago

mek-omkar commented 8 years ago

var CronJob = require('cron').CronJob;

var job = new CronJob({
  cronTime: '*/1 * * * * *',
  onTick: function() { console.log(new Date(), 'tick triggered'); },
  onComplete: function() {},
  start: true,
  runOnInit: true
});
job.start();

This job hung, and I see no logs after some hours of execution. I am using Node.js 6.1.0 and cron 1.1.0.

Does cron support every-second jobs? And if we don't specify a time zone, which time zone does the job use?

Thanks

frabbit commented 8 years ago

I can confirm this bug. It was a pain to debug, because I thought the problem was on my side.

TJNevis commented 8 years ago

I had the same issue...for me, it runs for a few days just fine - I have an hourly cron - and 2 days ago it stopped with no errors in the node log.

KKKSzili commented 8 years ago

I can confirm as well. I have a cron job running every second, but it hangs after a while. The issue is critical for me because I use it to switch off a water pump (when I do not have enough water in my well). I'm running it on a Raspberry Pi with Node v4.2.1 and cron 1.1.0.

//Implementation
new CronJob('* * * * * *', function() {
     //My code here
},null,true,'Europe/Budapest',null,true);

Any help would be appreciated. Thank you!

jeanmatthieud commented 8 years ago

Same problem here. I did not find any reason for that to happen in my code (every error is logged with Sentry, and nothing in particular happened).

colmaengus commented 8 years ago

I have seen this problem also. We have an every-minute cron job that runs for days and days and then just stops. I'm tracking the duration between ticks, and it's normally 60 seconds. The last tick before it stopped came after 147 seconds. Maybe this has something to do with the root cause?
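
The tick-gap tracking described above could look roughly like the following. This is a minimal sketch, not colmaengus's actual code: `createTickMonitor`, its parameters, and the 10-second tolerance are hypothetical names and values chosen for illustration.

```javascript
// Hypothetical helper: remembers when the previous tick happened and reports
// gaps that exceed the expected interval by more than a tolerance.
function createTickMonitor(expectedMs, toleranceMs) {
  let lastTick = null;
  return function recordTick(now = Date.now()) {
    // gapMs is null on the first tick because there is nothing to compare to.
    const gapMs = lastTick === null ? null : now - lastTick;
    lastTick = now;
    const late = gapMs !== null && gapMs > expectedMs + toleranceMs;
    return { gapMs, late };
  };
}

// Example with explicit timestamps (in ms) for an every-minute job.
const recordTick = createTickMonitor(60 * 1000, 10 * 1000);
recordTick(0);                     // first tick, no gap yet
const normal = recordTick(60000);  // 60 s after the first tick
const slow = recordTick(207000);   // 147 s after the previous tick
console.log(normal, slow);
```

Calling `recordTick()` at the top of `onTick` and logging whenever `late` is true would show whether an unusually long gap precedes each stall.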

mek-omkar commented 8 years ago

Yes, I suspect that too, @colmaengus.

Hithim commented 8 years ago

Hello guys. I've also hit this bug. It's really hard to reproduce; for me it happens under intense CPU load and memory allocation. So far I've figured out that the task is stopped here: https://github.com/ncb000gt/node-cron/blob/master/lib/cron.js#L453 . For some reason the timeout becomes a negative value. I tested with the */1 * * * * * pattern.

colmaengus commented 8 years ago

We are seeing this quite a lot now. After adding some extra logs, it looks like this happens when Node is running so slowly that the next tick time comes around while it is still working out what that next time should be:

start of getTimeout generated "now" as 12:45:01 at 12:45:08.329
exit of _getNextDateFrom logged next tick to be 12:46:00 at 12:45:52.372
exit of sendAt logged next tick to be 12:46:00 at 12:46:00.399

Apart from the discrepancy of some seconds in the logs, it would appear that _getNextDateFrom took so long to run that the next tick time had already come around. The cron job was set to run every minute, so there should have been at most 60 iterations through the while loop. I don't know what would cause such an extreme slowdown, apart from the Node.js process in general being starved of CPU cycles.

[2016-08-18 12:45:08.329] [INFO] scheduler - getTimeout: 2016-08-18T12:45:01+01:00
[2016-08-18 12:45:52.372] [INFO] scheduler - _getNextDateFrom: 2016-08-18T12:46:00+01:00
[2016-08-18 12:46:00.399] [INFO] scheduler - sendAt: 2016-08-18T12:46:00+01:00
[2016-08-18 12:46:01.032] [INFO] scheduler - timeout:-1
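
The arithmetic behind the last log line can be reproduced in isolation. The sketch below assumes the timeout is simply the scheduled tick time minus the current time, clamped at -1. The clamp is an assumption inferred from the logged `timeout:-1`, not taken from the library source, and `timeoutUntil` is a hypothetical name.

```javascript
// Hypothetical helper: milliseconds left until nextTick, clamped at -1.
// ASSUMPTION: the clamp mirrors the "timeout:-1" seen in the log above.
function timeoutUntil(nextTick, now) {
  return Math.max(-1, nextTick.getTime() - now.getTime());
}

const nextTick = new Date('2016-08-18T12:46:00+01:00');

// Computed promptly, the delay until the next tick is positive.
const onTime = timeoutUntil(nextTick, new Date('2016-08-18T12:45:08.329+01:00'));

// Computed after the tick time has already passed, it goes negative.
const tooLate = timeoutUntil(nextTick, new Date('2016-08-18T12:46:01.032+01:00'));

console.log(onTime, tooLate);
```

If the process stalls between choosing the next tick time and computing the delay, the second case applies and the job has nothing valid to schedule.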

akhare22sandeep commented 7 years ago

We are also facing the same issue. Our job runs every second, and it works perfectly for a day or at most two, but after that we don't see any logs, and connections to the DB are also lost. It just hangs. Restarting the job makes it work again. Until now I thought it was an issue in our code, but other people are facing the same problem. Please let us know if there is a workaround for this; otherwise we will have to change the module. Help is appreciated.

ncb000gt commented 7 years ago

This issue is the same as #231 - I'm looking into this now.

Sorry for the delayed response. I've had no time to look into my open source projects. Thanks for digging into it - I'll share anything I find here.

anthonywebb commented 7 years ago

Keep us posted, thanks!

soundslocke commented 7 years ago

Glad to see some more info being uncovered. This has been going on a while, see #141. A fix was attempted with #147.

medisoft commented 6 years ago

Is this fixed in 1.3? With 1.1 I'm still having the problem.

Shayko94 commented 6 years ago

Unfortunately the error is still there. Cron just stops after some hours without any error message :(

colmaengus commented 6 years ago

As a workaround I'm using the onComplete event and if it is meant to be still running I call start again.
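
A rough sketch of that workaround follows. It assumes, as this thread describes, that the job calls its `onComplete` callback when it stops itself. `keepAlive` and `createJob` are hypothetical names, and the job only needs `start()` and `stop()` methods.

```javascript
// Hypothetical wrapper: restarts a job whenever it completes while it is
// still meant to be running.
function keepAlive(createJob) {
  let shouldRun = false;
  const job = createJob(function onComplete() {
    // The job stopped, but nobody asked it to: start it again.
    if (shouldRun) job.start();
  });
  return {
    start() { shouldRun = true; job.start(); },
    stop() { shouldRun = false; job.stop(); },
  };
}

// With the cron package this might be wired up as follows (an untested
// assumption about how the constructor arguments line up):
//   const { CronJob } = require('cron');
//   const guarded = keepAlive((onComplete) =>
//     new CronJob('0 * * * * *', () => console.log('tick'), onComplete));
//   guarded.start();
```

If the job can stop synchronously inside `start()`, this restart would recurse, so deferring the `job.start()` call with `setImmediate` may be the safer choice.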

gonscenna commented 6 years ago

Error is still there on version 1.3

Alessy commented 6 years ago

I really don't understand why, in the start function, stop is called if timeout < 0.

ncb000gt commented 6 years ago

I've merged in a few prs that should help with this. Please let me know if you're still having the issue.

@Alessy negative timeouts aren't valid. I could change the behavior to just keep going, but it would likely cause a skip. Really this should either send a warning to the console or throw. Which behavior would you prefer to see?

ncb000gt commented 6 years ago

Closing for now. If this is still an issue we can take it as a new issue.

hebo-hebo commented 5 years ago

Anyone knows which commit is for this fix?

hebo-hebo commented 5 years ago

@gonscenna

Error is still there on version 1.3

Does 1.4.1 make a difference in your case? Thanks

dptole commented 5 years ago

@hebo-hebo Same problem here.

The cronjob._timeout property is very similar to the return value of the setTimeout function. Sometimes it contains properties that indicate a dead cronjob._timeout that will never run. Two such properties are:

cronjob._timeout._idlePrev = null
cronjob._timeout._idleTimeout = -1

In my experience this problem seems to happen when there are CPU-intensive operations happening. node-cron tries to create a new setTimeout but fails and can't recover. That's my hypothesis at least.

My temporary solution was to create another cron job that resurrects dead cron jobs. This is how I try to find dead cron jobs:

// FILE: cronjob-a.js
let cronjob = new CronJob(CRON_TIME, CRON_JOB_FUNCTION);

// FILE: ressurect-cronjobs.js
for(const cronjob of GET_ALL_CRONJOBS()) {
  if(cronjob._timeout._idleTimeout < 0)
    RESSURECT_FUNCTION(cronjob);
}

ncb000gt commented 5 years ago

@dptole Interesting. It would be great to have something like this built into the library, but if it can't get the timeout then it likely won't succeed at getting a new one a few ticks later, and it doesn't make sense to use a timeout to try to get a new timeout.

For now it makes sense to me to use your approach, and I'll mull this over a bit.

colmaengus commented 5 years ago

How about the following ?

  1. Add a check to determine if a cron job is periodic or one-shot
  2. Add an internal _restart function that does stop/start without calling onComplete
  3. If periodic cron and timeout goes below 0 then call _restart() instead of stop()

I'm doing pretty much this in an external wrapper.
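
The three steps above might reduce to a decision like the one sketched here. This is illustrative only and is not the library's code: `actionForTimeout` and its return values are hypothetical.

```javascript
// Hypothetical decision function for the proposal above.
// Returns what a job could do once its next timeout has been computed.
function actionForTimeout(timeoutMs, isPeriodic) {
  if (timeoutMs >= 0) return 'schedule';   // normal case: wait for the next tick
  // Negative timeout: the tick time has already passed.
  // A periodic job restarts (stop/start without onComplete);
  // a one-shot job stops as before.
  return isPeriodic ? 'restart' : 'stop';
}

console.log(actionForTimeout(-1, true), actionForTimeout(-1, false));
```

Keeping the one-shot case on the existing stop path means only periodic jobs change behavior under this proposal.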

dptole commented 5 years ago

@colmaengus correct me if I'm wrong but what I grasped from what @ncb000gt said was:

Unless you are doing these checks without the setTimeout/setInterval functions (although I think it is being done internally).

In my opinion, if the setTimeout/setInterval failures are core bugs, the ultimate solution would require spawning a new process to keep an eye on that. But that would require a lot more work to fix an issue that happens in very specific use cases.

Maybe updating the README.md to warn people about this issue would help... maybe an issue should be created on the nodejs core library... I don't know. But a final solution, I think, should come from v8/nodejs releases.

ncb000gt commented 5 years ago

@colmaengus @dptole I'd be very skeptical that timers and intervals were somehow not working in node code. I would definitely assume it's the library before suspecting the node implementation.

@colmaengus I think that sounds reasonable. It should already check to see if the job is a one shot or periodic job and there is some checking to determine if the timeout is too large. Clearly, we need the opposite.

hebo-hebo commented 5 years ago

I saw it happen three times on my two different servers. Is there a utility like pstack or jstack that can peek into the Node.js application to see where it got stuck? Or would it be possible to emit log messages when it runs into this situation (no more cron jobs launched), so we can confirm the root cause for sure?

zhangxiang958 commented 5 years ago

Hello guys, is this bug still not fixed in 2.0.3? Or does anyone know which commit has the fix?

I'm using version 1.1.0, and found that the cron job stops when it is set to run every second:

(new CronJob('* * * * * *', () => {}, null, true, 'Asia/Shanghai')).start()

Does anyone know why this bug happens? It is really hard to reproduce. I tried creating 5000 processes on my server to keep the CPU and memory busy, but still could not reproduce it. Does anyone know how to reproduce this bug?

ncb000gt commented 5 years ago

@zhangxiang958 The module is at 1.6.0. I'd recommend trying that version.

As far as reproduction, that's part of the problem. People are/were hitting it, but it's not easy to recreate in a test.

I made some changes to the module with the latest version in the middle of December. Try that version and let me know if you're still running into this.

hebo-hebo commented 5 years ago

My two servers are using the old cron version 1.1.0. There are 4 tasks configured, and each runs every four minutes. Cron stops firing jobs after 25 days. This has already happened 3 times on my two servers. I'm curious to know whether there was a known bug like that.

ncb000gt commented 5 years ago

@hebo-hebo There have been reports like that in the past. The causes have varied, from timezones to a couple of other possibilities. So far, we haven't seen reports of this in the latest version, so presumably the issue is resolved. Please let us know if you see this behavior on the latest version. Thanks!

eranbetzalel commented 5 years ago

Our production tests show that a job failed to execute because a node-cron job did not 'tick'. The cron, defined as "14 /1 *", executed successfully every minute but stopped at 4am for some reason. I had to fall back to setInterval, as I can't trust node-cron to be resilient enough.

ncb000gt commented 5 years ago

@eranbetzalel You can do what you feel is best obviously. Sorry that the job didn't execute. Which version of the module were you using? What were the conditions on the system, high load and what kind of processing was the ontick handling for you?

eranbetzalel commented 5 years ago

You're right, forgot to mention that.

Version 1.6.0

I didn't see any CPU high-load in Google's CPU graphs.

The onTick ran a lambda expression that runs some job execution...

ncb000gt commented 5 years ago

@eranbetzalel The latest version is 1.7.0 to fix an issue related to DST found in GH-408.

Given the time frame I suspect this may be what happened. Would you be interested in confirming?

eranbetzalel commented 5 years ago

I'll look into it whenever I have some free time, probably not in the near future.

ncb000gt commented 5 years ago

@eranbetzalel ok. regardless, thanks for letting me know you ran into an issue.

aaxc commented 3 years ago

Problem is still there, just to let you all know

abrar71 commented 3 years ago

I can also confirm the issue still exists

ChrisvanChip commented 3 years ago

Just experienced this issue, still needed to fix.

Sir-hennihau commented 3 years ago

Another one to confirm this exists.

I thought my server had a memory leak and I was investigating why it crashed. Took me weeks to fix. In the end I replaced node-cron and all my problems disappeared. Sorry to say that, but that's a really bad bug, especially if no error is thrown when the server crashes.

LucCADORET commented 2 years ago

I think I experienced this bug today also. Hard to know since I couldn't set a breakpoint and debug, but everything matches: 1 second cron job, no error thrown.

iamkhalidbashir commented 2 years ago

Same problem for me on tiny AWS instances with low CPU and RAM.

spandey1296 commented 1 year ago

I have scheduled a cron job on the server, but it stops automatically after 1-2 days, or sometimes later. @ncb000gt please help look into it. I'm using "cron": "1.7.1", "cron-parser": "^3.5.0".

spandey1296 commented 1 year ago

@eranbetzalel You can do what you feel is best obviously. Sorry that the job didn't execute. Which version of the module were you using? What were the conditions on the system, high load and what kind of processing was the ontick handling for you?

TinyDinosaur commented 1 year ago

Hey guys, this is still happening on version 3.0.0. Same symptoms: no exceptions, no warnings, nothing; it simply stops executing the cron job. This usually happens within 4-6 hours for me.

intcreator commented 1 year ago

that's crazy seeing as we're only on version 2.3.1. what do you have installed in your package.json?

TinyDinosaur commented 1 year ago

Omg, I have to say I was wrong. This was happening with the node-cron library; I had two tabs open and commented on the wrong one. I'm so sorry for the trouble. All is well, this is the one that actually works. Again, sorry.

intcreator commented 1 year ago

no worries haha. feel free to switch if you want