timgit / pg-boss

Queueing jobs in Postgres from Node.js like a boss
MIT License

Job stuck in "created" state when server resets #469

Closed bmorgan-aligntech closed 2 months ago

bmorgan-aligntech commented 3 months ago

Hey, we are attempting to use pg-boss as a solution for delaying some tasks on our backend, but we are having issues when the backend 'restarts'. We have this script that we have been using for testing:

registerJobHandler(jobName: string, handlerFunc: any) { 
    this.pgBoss.start().then(() =>
      // use work to process job with passed in handlerFunc
      this.pgBoss
        .work(jobName, async (job) => {
          await handlerFunc(job.data);
        })
        .catch((err) =>
          this.logger.log(`Error executing: ${jobName}... ${err}`),
        ),
    );
  }

sendJob(
    jobName: string,
    data: any,
    delayInHours: number,
    retryLimit: number,
  ) {
    // convert delay to milliseconds
    const startAfter = new Date(Date.now() + delayInHours * 60 * 60 * 1000);

    return this.pgBoss.start().then(() =>
      this.pgBoss.send({
        name: jobName,
        data: data,
        options: {
          retryLimit: retryLimit,
          startAfter: startAfter,
        },
      }),
    );
  }

// registers the job handler and sends the job
createJob(jobInfo: CreateJobDto) {
    this.registerJobHandler(jobInfo.jobName, jobInfo.handlerFunc);
    return this.sendJob(
      jobInfo.jobName,
      jobInfo.data,
      jobInfo.delayInHours,
      jobInfo.retryLimit,
    );
  }

Then we use it like this:

// creates pg-boss job with some test data, delay for 9 minutes
testEndpoint(requestor: string) {
    return from(
      this.pgBossService.createJob({
        jobName: 'test-job',
        data: {
          num: 5,
        },
        handlerFunc: this.testHandler,
        retryLimit: 1,
        delayInHours: 0.15, // 9 minutes
      }),
    ).pipe(
      // ...other tasks irrelevant to pg-boss
    );
  }

// simply prints out the job data given and returns it
testHandler(jobData: any) {
    console.log('jobData: ', jobData.num);
    return jobData.num;
  }

When we run our application locally and call testEndpoint, the job is created in our database with the correct data, delayed start, and retry limit. However, if the backend gets 'restarted' while we are waiting for testHandler to execute (due to updating files with Nest.js's --watch flag), testHandler does not execute after the 9 minute delay, and the job remains in the created state in our database until its keepuntil timestamp.

If we do not 'restart' our backend while testing, then the testHandler will execute as expected.

This is not great for us, as the jobs we expect to delay will be delayed for 1-2 hours, meaning any updates or production promotions during that timeframe will seemingly 'erase' those jobs.

We wanted to reach out to see if there is a workaround for this issue. Is this expected behavior? Is there something about how we are configuring the jobs that is incorrect and leading to this issue?

Like I mentioned above, we are using Nest.js as our backend framework.

timgit commented 3 months ago

I can't tell from your code what the issue is, but it doesn't tend to match how I normally structure my applications. I'd recommend isolating the pg-boss instance behind a service you can import elsewhere, which would have its own bootstrapping routine that happens at startup, including registering handlers. I don't normally wait until I create a job to register a handler, for example. That means the handler would not be registered on restart until you send another job?

bmorgan-aligntech commented 2 months ago

Thanks for the reply. We moved our handler registration and queue creation to our service's constructor so it will register and create on startup, rather than when a job is created, and that seemed to do the trick!
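
For anyone landing here later, the fix described above (register handlers once at startup instead of inside createJob) might look roughly like the sketch below. This is not the poster's actual code. The names JobService, QueueClient, register, bootstrap and sendDelayed are hypothetical, and QueueClient is a hand-written stand-in that only mirrors the start/work/send calls used in this thread, so the sketch runs without a database. It also assumes the handler receives a single job with a data property, as in the original snippet, and it leaves out the queue creation mentioned in the reply.

```typescript
// Hypothetical stand-in for the parts of the queue client used in this thread.
// A real pg-boss instance is assumed to be passed in its place.
interface QueueJob<T> {
  data: T;
}

interface QueueClient {
  start(): Promise<unknown>;
  work(
    name: string,
    handler: (job: QueueJob<any>) => Promise<unknown>,
  ): Promise<unknown>;
  send(
    name: string,
    data: object,
    options?: { startAfter?: Date; retryLimit?: number },
  ): Promise<string | null>;
}

type JobHandler = (data: any) => Promise<unknown> | unknown;

class JobService {
  private readonly boss: QueueClient;
  private readonly handlers = new Map<string, JobHandler>();
  private ready: Promise<void> | undefined;

  constructor(boss: QueueClient) {
    this.boss = boss;
  }

  // Declare every handler up front, before bootstrap() is called.
  // Handlers added after bootstrap() are not attached by this sketch.
  register(name: string, handler: JobHandler): this {
    this.handlers.set(name, handler);
    return this;
  }

  // Call once at application startup. After a restart this re-attaches the
  // workers, so jobs still waiting in the "created" state get picked up.
  bootstrap(): Promise<void> {
    if (!this.ready) {
      this.ready = this.startAndRegister();
    }
    return this.ready;
  }

  private async startAndRegister(): Promise<void> {
    await this.boss.start();
    for (const [name, handler] of this.handlers) {
      await this.boss.work(name, async (job) => handler(job.data));
    }
  }

  // Sending only enqueues; it no longer registers a handler as a side effect.
  async sendDelayed(
    name: string,
    data: object,
    delayInHours: number,
    retryLimit: number,
  ): Promise<string | null> {
    await this.bootstrap();
    const startAfter = new Date(Date.now() + delayInHours * 60 * 60 * 1000);
    return this.boss.send(name, data, { startAfter, retryLimit });
  }
}
```

In a Nest.js app, register and bootstrap would presumably be called from the service's constructor or a startup hook, so the worker for 'test-job' exists again after every restart, whether or not a new job is sent.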