timgit / pg-boss

Queueing jobs in Postgres from Node.js like a boss
MIT License

Provide a way to exit pg-boss when all jobs are done #190

Closed apiel closed 3 years ago

apiel commented 4 years ago

We are using pg-boss to run multiple jobs in parallel. Once the work is done, we would like to exit the process, but right now that is difficult to achieve.

Would it be possible to have a feature that tells us whether any jobs are currently processing, and a way to be notified when all jobs have been completed?

Right now I am doing it like this, but it is really cumbersome:

    const boss = new PgBoss({...});

    let jobsProcessing = false;
    // Resolves once a completion event arrives and the queue is empty.
    const onAllJobsDone = new Promise<void>((resolve) => {
        boss.onComplete('*', async () => {
            const queueSize = await queueManager.getQueueSize(QueueName.crawl);
            if (queueSize === 0) resolve();
        });
    });

    await boss.subscribe('hello', {},
        async ({ data: { url } }: Job<{ url: string }>) => {
            jobsProcessing = true;
            // Await the work so the job is not marked complete too early.
            await processUrl(url);
        },
    );

    await boss.publishOnce('hello', { url: 'http://localhost/1' });
    await boss.publishOnce('hello', { url: 'http://localhost/2' });

    if (jobsProcessing) await onAllJobsDone;
    console.log('All jobs are done.');
    await boss.stop();
    console.log('Queue is empty, prepare to exit process.');

apiel commented 4 years ago

Another workaround would be:

    const boss = new PgBoss({...});

    let jobsProcessing = 0;

    await boss.subscribe('hello', {},
        async ({ data: { url } }: Job<{ url: string }>) => {
            jobsProcessing++;
            try {
                await processUrl(url);
            } finally {
                // Decrement even if processUrl throws.
                jobsProcessing--;
            }
        },
    );

    await boss.publishOnce('hello', { url: 'http://localhost/1' });
    await boss.publishOnce('hello', { url: 'http://localhost/2' });

    // Poll until no handler is running.
    while (jobsProcessing > 0) {
        await delay(1);
    }
    console.log('All jobs are done.');
    await boss.stop();
    console.log('Queue is empty, prepare to exit process.');

But this makes it even more difficult if I have to track multiple queues...

timgit commented 4 years ago

If you know how many jobs you are publishing, you could use the counter approach to compare how many have completed against that number. getQueueSize() ignores completion jobs. Keep that in mind since you may exit too soon if you need to use onComplete() subscriptions. This assumes you're running a single instance in memory with concurrency set to 1, too.
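
A minimal sketch of the counter approach described above, assuming a single process publishes and handles all the jobs. `CompletionTracker` and its methods are hypothetical names for illustration and are not part of pg-boss. It compares the number of published jobs against the number of finished handlers, and only reports "done" after `seal()` has been called, so an empty tracker is not mistaken for finished work.

```typescript
// Hypothetical helper, not part of pg-boss: compares how many jobs were
// published against how many handlers have finished.
class CompletionTracker {
  private published = 0;
  private finished = 0;
  private sealed = false;
  private waiters: Array<() => void> = [];

  // Call once for every job that was actually published.
  jobPublished(): void {
    this.published++;
  }

  // Call when a handler has finished, whether it succeeded or failed.
  jobFinished(): void {
    this.finished++;
    this.notifyIfDone();
  }

  // Call after the last publish; before that, "0 pending" does not mean done.
  seal(): void {
    this.sealed = true;
    this.notifyIfDone();
  }

  // Number of published jobs whose handler has not finished yet.
  get pending(): number {
    return this.published - this.finished;
  }

  get isDone(): boolean {
    return this.sealed && this.pending === 0;
  }

  // Resolves once seal() was called and every published job has finished.
  whenDone(): Promise<void> {
    if (this.isDone) return Promise.resolve();
    return new Promise<void>((resolve) => {
      this.waiters.push(() => resolve());
    });
  }

  private notifyIfDone(): void {
    if (!this.isDone) return;
    const waiters = this.waiters;
    this.waiters = [];
    waiters.forEach((wake) => wake());
  }
}
```

To wire this up, one option is to call `jobPublished()` after each successful `publishOnce()`, call `jobFinished()` in a `finally` block inside the `subscribe()` handler, call `seal()` after the last publish, and `await tracker.whenDone()` before `boss.stop()`. Because the counts live in memory, this only sees jobs handled by the same process.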

apiel commented 4 years ago

@timgit this is actually the problem: counting running jobs within the Node.js process doesn't work well with concurrency, and one of the main reasons to use a job queue is concurrency. I hope we can one day have a feature like getRunningJobs() integrated in pg-boss.
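
Until something like `getRunningJobs()` exists, one way to avoid in-process counters is to derive the answer from the stored job records, which also covers multiple queues and multiple workers. The sketch below is only the decision logic, assuming you can fetch one row per job with its queue name and state (for example with your own SQL query). The `JobRow` shape, the helper names, and the state names `created`, `retry`, and `active` are assumptions to check against your pg-boss version.

```typescript
// Assumed row shape: one entry per stored job. Not a pg-boss type.
interface JobRow {
  name: string;  // queue name
  state: string; // job state as stored by the queue
}

// Assumed names of the states that mean "not finished yet".
const UNFINISHED_STATES = new Set<string>(["created", "retry", "active"]);

// Counts unfinished jobs per queue; queues with none are absent from the map.
function countUnfinished(rows: JobRow[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const row of rows) {
    if (!UNFINISHED_STATES.has(row.state)) continue;
    counts.set(row.name, (counts.get(row.name) ?? 0) + 1);
  }
  return counts;
}

// True when none of the given queues has a waiting or running job.
function allQueuesIdle(rows: JobRow[], queues: string[]): boolean {
  const counts = countUnfinished(rows);
  return queues.every((queue) => (counts.get(queue) ?? 0) === 0);
}
```

A process could poll such a check at an interval and call `boss.stop()` once every queue it cares about is idle. As noted earlier in the thread, completion jobs are tracked separately, so a check like this may need to account for them if `onComplete()` subscriptions are in use.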