timgit / pg-boss

Queueing jobs in Postgres from Node.js like a boss

Pulling jobs on demand #6

Closed (jtlapp closed 8 years ago)

jtlapp commented 8 years ago

This proposal addresses issues #4 and #5, while still allowing all existing functionality.

Right now subscribers are rate-limited by the pre-configured newJobCheckInterval, but server capacity varies over time. At present, a server that finds itself able to handle multiple jobs at once only has the option of setting a teamSize on subscription. However, the teamSize persists regardless of the server's current capacity, so the server can't use this option to vary its workload.

If pg-boss provided an API for pulling a single job, the caller could decide dynamically how many simultaneous jobs it can handle and how often it pulls them. A caller that wants to pull one job per interval of time, as pg-boss currently requires, could still do that with a pull-based API. If it is important for the queue interface itself to provide this functionality, it could offer either an additional method or a wrapper object to do the job.
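
To make the idea concrete, here is a rough sketch of what I have in mind. It uses an in-memory stand-in rather than Postgres, and every name in it (`InMemoryQueue`, `fetch`, `pullUpTo`) is hypothetical, not part of pg-boss. A database-backed version would presumably return promises; this one is synchronous only to keep the sketch short.

```typescript
// Hypothetical job shape and queue for illustration; not pg-boss's API.
interface Job<T> {
  id: number;
  name: string;
  data: T;
}

class InMemoryQueue<T> {
  private jobs: Job<T>[] = [];
  private nextId = 1;

  // Add a job of the given type and return its id.
  publish(name: string, data: T): number {
    const id = this.nextId++;
    this.jobs.push({ id, name, data });
    return id;
  }

  // Pull a single job of the given type, or null if none is waiting.
  fetch(name: string): Job<T> | null {
    const index = this.jobs.findIndex((job) => job.name === name);
    if (index === -1) return null;
    const [job] = this.jobs.splice(index, 1);
    return job ?? null;
  }
}

// The caller, not the queue, decides how many jobs to take right now.
function pullUpTo<T>(
  queue: InMemoryQueue<T>,
  name: string,
  capacity: number,
): Job<T>[] {
  const taken: Job<T>[] = [];
  while (taken.length < capacity) {
    const job = queue.fetch(name);
    if (job === null) break; // queue is empty for this job type
    taken.push(job);
  }
  return taken;
}
```

A server would call `pullUpTo(queue, "email", n)` with whatever `n` it can afford at that moment, so the concurrency follows current capacity instead of a fixed teamSize.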

The only potential issue I can see is that this interface would not be compatible with receiving asynchronous notification of new jobs from the database. (I understand that PG provides a NOTIFY feature, but I don't know how it's used or whether it can reach client computers.) It would be nice to have an API that could handle this as well, even if it were a separate feature.

jtlapp commented 8 years ago

If we want to help protect calling applications from poor queue use, so that an app could safely sit in an infinite loop pulling jobs without restraint, we could institute a pre-configured delay that only kicks in when the prior call for a particular job type yielded no jobs. The pull API would still be non-blocking; it would just postpone servicing the first pull attempt following an empty pull.

(Actually, I think you'd have to implement a timer upon the first empty pull and hold off servicing until timeout. You wouldn't want to initiate the timer at the next pull, in case the application is already incorporating a delay between pulls.)
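
A minimal sketch of that timer logic, assuming a per-job-type deadline that is set at the moment of the empty pull. The class name `EmptyPullGate`, its methods, and the injectable clock are all my own inventions for illustration; nothing here exists in pg-boss.

```typescript
// Hypothetical helper: tracks, per job type, how long the next pull
// should be held back after an empty pull.
class EmptyPullGate {
  private blockedUntil = new Map<string, number>();
  private delayMs: number;
  private now: () => number;

  // `now` is injectable so the logic can be exercised without real timers.
  constructor(delayMs: number, now: () => number = Date.now) {
    this.delayMs = delayMs;
    this.now = now;
  }

  // Call after every pull. An empty pull starts the timer immediately,
  // not at the next pull attempt.
  record(name: string, gotJob: boolean): void {
    if (gotJob) {
      this.blockedUntil.delete(name);
    } else {
      this.blockedUntil.set(name, this.now() + this.delayMs);
    }
  }

  // Milliseconds the next pull for this job type should be postponed.
  // Returns 0 if the application has already waited long enough.
  waitMs(name: string): number {
    const until = this.blockedUntil.get(name);
    if (until === undefined) return 0;
    return Math.max(0, until - this.now());
  }
}
```

Because the deadline is fixed at the empty pull, an application that already sleeps between pulls for at least `delayMs` would see `waitMs` return 0 and pay no extra delay.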

timgit commented 8 years ago

Joe, I'm not following your proposal here. Jobs are currently fetched from the db using an interval and by definition are currently "pulled". teamSize offers a level of concurrency for deciding how many jobs can be worked within the check interval.

jtlapp commented 8 years ago

Closing until I've made a better study of the code. Thanks for your help!