lykmapipo / kue-scheduler

A job scheduler utility for kue, backed by redis and built for node.js

Job 'xxxx' does not exist #22

Closed KeKs0r closed 8 years ago

KeKs0r commented 8 years ago

I am experiencing the following error message:

    "data": { "name": "Error", "message": "job \"176728\" doesnt exist" }

I was creating the job:

            const job = queue.create('updateAnswerJob', {event: event})
                .priority('medium')
                .ttl(1000 * 20)
                .removeOnComplete(true)
                .unique('updateAnswerJob_' + event.game);
            queue.now(job);

When I changed the job creation to normal kue, it worked again:

            const job = queue.create('updateAnswerJob', {event: event})
                .priority('medium')
                .ttl(1000 * 20)
                .removeOnComplete(true);
            job.save(function(err, res) {
            });

I just tried to reproduce this locally and in my test environment, but I couldn't. So I can't even figure out what the stack trace was.

Did anyone else experience something similar?

lykmapipo commented 8 years ago

@KeKs0r

So far I have not, but I will try to reproduce it.

IvanMMM commented 8 years ago

It happened to me too when the UI was open in the browser.

lykmapipo commented 8 years ago

@KeKs0r

Try not to auto-remove the job, i.e. remove `removeOnComplete(true)` from the job creation, and instead use queue events to remove it after it is complete.
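For reference, a minimal sketch of that event-based cleanup, following the `job complete` pattern documented in kue's own README (the job id comes from the queue event):

    var kue = require('kue');
    var queue = kue.createQueue();

    // Remove each completed job explicitly instead of relying on removeOnComplete(true).
    queue.on('job complete', function(id, result) {
      kue.Job.get(id, function(err, job) {
        if (err) return;
        job.remove(function(err) {
          if (err) throw err;
          console.log('removed completed job #%d', job.id);
        });
      });
    });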

KeKs0r commented 8 years ago

@IvanMMM Good idea, I will change it back and then monitor it for a day to see whether the error still occurs or only occurs when I have the UI open. Although it did not happen when I was only using kue without kue-scheduler.

_Update_ It seems that I also got those errors overnight, when I was not using the admin UI.

@lykmapipo: I will try changing it, since the error does not seem to be related to the admin UI. The problem is that I don't have a stack trace, so I don't know whether the error comes from an admin request or from somewhere during job processing.

KeKs0r commented 8 years ago

After modifying my logging library I finally managed to get the stack trace for this:

Error: job \"217859\" doesnt exist
 at /srv/www/node_modules/kue/lib/queue/job.js:169:17
 at try_callback (/srv/www/node_modules/kue/node_modules/redis/index.js:592:9)
 at RedisClient.return_reply (/srv/www/node_modules/kue/node_modules/redis/index.js:685:13)
 at ReplyParser.<anonymous> (/srv/www/node_modules/kue/node_modules/redis/index.js:321:14)
 at emitOne (events.js:77:13)
 at ReplyParser.emit (events.js:169:7)
 at ReplyParser.send_reply (/srv/www/node_modules/kue/node_modules/redis/lib/parser/javascript.js:300:10)
 at ReplyParser.execute (/srv/www/node_modules/kue/node_modules/redis/lib/parser/javascript.js:211:22)
 at RedisClient.on_data (/srv/www/node_modules/kue/node_modules/redis/index.js:547:27)
 at Socket.<anonymous> (/srv/www/node_modules/kue/node_modules/redis/index.js:102:14)
 at emitOne (events.js:77:13)
 at Socket.emit (events.js:169:7)
 at readableAddChunk (_stream_readable.js:146:16)
 at Socket.Readable.push (_stream_readable.js:110:10)
 at TCP.onread (net.js:523:20)

Unfortunately I only know that it happens during a `Job.get` call; I am trying to figure out where that is called from.

lykmapipo commented 8 years ago

@KeKs0r

Is it thrown from the UI or the scheduler?

KeKs0r commented 8 years ago

Okay, so I changed where the error is actually created (before the callback) in order to trace where the `get` call is made:

Error: job \"585\" doesnt exist(Custom)
 at Function.exports.get (/srv/www/node_modules/kue/lib/queue/job.js:174:25)
 at Job.tryGetExistingOrSaveJob (/srv/www/node_modules/kue-scheduler/node_modules/kue-unique/index.js:194:25)
 at fn (/srv/www/node_modules/kue-scheduler/node_modules/async/lib/async.js:741:34)
 at /srv/www/node_modules/kue-scheduler/node_modules/async/lib/async.js:1208:16
 at /srv/www/node_modules/kue-scheduler/node_modules/async/lib/async.js:166:37
 at /srv/www/node_modules/kue-scheduler/node_modules/async/lib/async.js:701:43
 at /srv/www/node_modules/kue-scheduler/node_modules/async/lib/async.js:167:37
 at /srv/www/node_modules/kue-scheduler/node_modules/async/lib/async.js:1204:30
 at /srv/www/node_modules/kue-scheduler/node_modules/async/lib/async.js:52:16
 at Immediate._onImmediate (/srv/www/node_modules/kue-scheduler/node_modules/async/lib/async.js:1201:34)

@lykmapipo: I think it is thrown from the scheduler.

lykmapipo commented 8 years ago

@KeKs0r

I will work on it. Thanks

KeKs0r commented 8 years ago

I guess it is somehow caused here:

https://github.com/lykmapipo/kue-unique/blob/master/index.js#L201

and then, within async, the "not found" error, which is expected behaviour there, bubbles up.

_Update_ I changed one of my jobs from my manual "rescheduling" (every time the job runs, it schedules the next one) to kue-scheduler, but then this issue arose a lot and apparently the job did not run at all. So this seems to be a bigger problem than just cleaning up the job too early. A sketch of both patterns follows.
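For context, here is a sketch of the two patterns: the manual rescheduling described above in plain kue, and the kue-scheduler equivalent. The one-minute interval and the `doWork` helper are hypothetical:

    // Manual rescheduling: each run schedules its own successor via plain kue.
    queue.process('updateAnswerJob', function(job, done) {
      doWork(job.data, function(err) {             // doWork: hypothetical worker
        queue.create('updateAnswerJob', job.data)  // schedule the next run
          .delay(60 * 1000)
          .save();
        done(err);
      });
    });

    // kue-scheduler equivalent: let the scheduler re-fire the job instead.
    queue.every('1 minute', queue.create('updateAnswerJob', {event: event}));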

KeKs0r commented 8 years ago

@lykmapipo Did you find anything?

lykmapipo commented 8 years ago

@KeKs0r

I have been on a tight schedule. Can you help me with the use case that causes the problem? I would appreciate a failing spec so that I can work from there.

lykmapipo commented 8 years ago

@KeKs0r & @IvanMMM

I made a follow-up based on the stack trace provided by @KeKs0r.

It turns out that if you have deleted the job using kue-specific functionality, especially by clicking delete in kue-ui, you will end up with errors. This happens for unique jobs: kue-unique will tell the scheduler that there is an existing unique job, and when kue-scheduler tries to fetch that existing unique job and fire it for execution, you end up with the same error as in the trace provided by @KeKs0r.

To fix it: do not delete unique jobs through kue-specific functionality (for example the delete button in kue-ui); if you want to delete a unique job and its schedule, use kue-scheduler's own removal so both are cleaned up.

Cheers.

KeKs0r commented 8 years ago

@lykmapipo I am not deleting anything manually. I am using `removeOnComplete` for some jobs. If that is causing the issue, I think we should ensure that `removeOnComplete` also works with kue-scheduler.

lykmapipo commented 8 years ago

@KeKs0r

That is what actually led you to the problem. Removing a job on complete does not guarantee that its schedule is also deleted.

If you want to delete jobs on completion and recreate new ones, please do not flag them as unique, because unique jobs maintain a single job instance, and if the schedule fails to find that job it will throw an error.

If you want to delete a job and its schedule, please use kue-scheduler's own job removal; that will guarantee safety.
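A sketch of that removal; `queue.remove` here reflects my reading of the kue-scheduler README, and the criteria shape is an assumption, so verify it against the version you have installed:

    // Remove the job together with its schedule through kue-scheduler
    // (assumed API; the accepted criteria may differ between versions).
    queue.remove({
      unique: 'updateAnswerJob_' + event.game  // unique name used at creation
    }, function(error, response) {
      if (error) { console.error('failed to remove job', error); }
    });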

Hope it helps.

KeKs0r commented 8 years ago

Okay, so is `removeOnComplete` incompatible with scheduled jobs or with unique jobs?

I have a job that is unique and runs every minute. I just don't want to keep the logs if the job was successful.

Would you be in favor of rewriting the functionality of `removeOnComplete` to properly delete the job after completion, e.g. by just using the job removal functionality to remove the job by id?

lykmapipo commented 8 years ago

`removeOnComplete` is incompatible with unique jobs. The purpose of a unique job is to keep a single job instance and reuse it on every run.

Based on your scenario, I think there is no need to rewrite `removeOnComplete`; all you have to do is decide whether to use a unique job or not. Use a unique job if you want to maintain a single job instance across every run.
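In code, the decision looks roughly like this (a sketch reusing the job definition from the top of this thread; the one-minute interval is illustrative):

    // Unique: one job instance per game, reused on every run.
    // Do NOT combine this with removeOnComplete(true).
    const recurring = queue.create('updateAnswerJob', {event: event})
        .unique('updateAnswerJob_' + event.game);
    queue.every('1 minute', recurring);

    // Non-unique: a fresh job per run, safe to auto-remove once complete.
    const disposable = queue.create('updateAnswerJob', {event: event})
        .removeOnComplete(true);
    queue.every('1 minute', disposable);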

Hope it helps.

KeKs0r commented 8 years ago

Okay, if it is a single job instance, what happens to the logs, and what happens if the job fails? I just want to get rid of all information in Redis if the job is successful. If the same instance is reused somehow and there is no "log" fingerprint or anything left in Redis, I am completely fine with it.

lykmapipo commented 8 years ago

If you return either an error or success details, as specified in kue job processing, then kue will save them accordingly.

If success details are passed, you can access them on `job.result`; if it is an error, it gets displayed in kue-ui. If you do not need either, just invoke `done()` with no parameters, though I think you will then lose track of what is going on.
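Concretely, in a kue process handler the three outcomes look like this (a sketch; `doWork` is a hypothetical helper):

    queue.process('updateAnswerJob', function(job, done) {
      doWork(job.data, function(err, result) {
        if (err) return done(err);  // error: details get displayed in kue-ui
        // done(null, result);      // success details: available on job.result
        done();                     // no params: completes without keeping details
      });
    });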