Open lightpriest opened 7 years ago
Hi, thanks for reporting this behavior. When running in the pure node.js environment, meteor-job is completely dependent on the node-ddp package for communication with the server, and to the extent that it doesn't handle the connection state well, meteor-job (or any other package built on top of it) will suffer.
Unfortunately, in my experience, there is no one organization or author willing or able to step-up and actively maintain that package. I have the permissions to commit and publish updates to that package on github and npm, and I occasionally will respond to high priority requests, particularly when they come in the form of a good quality pull-request. But unfortunately, as seems to be the case with all of the other contributors over there, I don't have time to go through that package and do everything that really should eventually be done.
If you can identify a focused set of changes to the ddp npm package that will help this issue (ideally as a PR) I would be happy to help integrate and test them and publish the results to npm for everyone's benefit.
TL;DR: I'm using
Job.processJobs
from an external node application (worker), with the DDP/ddp-login client. It works as expected, but if the DDP server is unavailable for longer than the getWork interval, it stops processing new jobs.It took me a while to figure out what's going on, but I think I got it. I first noticed that while developing a Meteor application. When it restarted for code changes, the worker sometimes stopped processing new jobs. When this happens, issuing a "soft" shutdown never seems to end, although
jobQueue.length
,jobQueue.running
andjobQueue.idle
are all 0.I tracked it down to the point where meteor-job calls the DDP method "getWork". If the DDP server is down at this point, the callback is never fired and "getWork" enters a state where it cannot run again (@_getWorkOutstanding is not false'd).
This seems like a larger issue with node-ddp-client method calls when the server is unavailable. The following snippet would not fire the "test callback" when the server is down.
One workaround suggestion is to pause all job queues when ddp is disconnected, and resume them when it reconnected.
Come to think of it, this is more of an issue in how node-ddp-client behaves when the DDP server is down, because meteor-job handles errors in the method callback properly.