timgit / pg-boss

Queueing jobs in Postgres from Node.js like a boss

Easier observability for expired jobs #423

Closed peterwooden closed 3 months ago

peterwooden commented 1 year ago

Hello,

We want to be able to log all job state changes (except to archive), and we don't have a good way to log when a particular job expires. These logs need to contain job metadata (like id, name, etc), so we can't rely on aggregate metrics.

We could enable onComplete for all jobs and then, in the completion handler, log jobs with the expired state. But that would double the size of our jobs table, which we'd rather avoid, since we don't otherwise use completion jobs.

Our current solution is to patch pg-boss so that a completion job is always inserted when a job expires:

--- a/src/plans.js
+++ b/src/plans.js
@@ -500,7 +500,8 @@ function expire (schema) {
     FROM results
     WHERE state = '${states.expired}'
       AND NOT name LIKE '${COMPLETION_JOB_PREFIX}%'
-      AND on_complete
+      -- Run completion jobs for all expired jobs so we can track them
+      -- AND on_complete
   `
 }

... and then to log the job metadata in a centralized onComplete handler:

// Simplified
await pgBoss.onComplete("*", { batchSize: 100 }, (jobs) => {
  for (const job of jobs) {
    if (job.data.state === "expired") {
      this.logger.error({ job: job.data }, "JobExpired");
    }
  }
});
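
The filtering step in that handler could also be pulled out into a small pure function, which makes it easy to test without a database. This is only a rough sketch: `collectExpired` is a hypothetical helper name, and the `request` field on the completion job payload is an assumption about its shape, not something confirmed here.

```javascript
// Hypothetical helper: picks the expired entries out of a batch of completion jobs.
// Assumes each completion job reports the original job's outcome in `data.state`,
// as in the handler above. The `request` field is an assumed payload property.
function collectExpired(jobs) {
  return jobs
    .filter((job) => job.data && job.data.state === "expired")
    .map((job) => ({
      completionJobId: job.id,
      state: job.data.state,
      // Fall back to null when the assumed `request` field is absent.
      request: job.data.request ?? null,
    }));
}

// Made-up sample batch, for illustration only.
const batch = [
  { id: "c1", data: { state: "completed", request: { id: "j1", name: "send-email" } } },
  { id: "c2", data: { state: "expired", request: { id: "j2", name: "resize-image" } } },
];

const expired = collectExpired(batch);
for (const entry of expired) {
  console.log("JobExpired", JSON.stringify(entry));
}
```

Inside the onComplete handler, the loop body would then reduce to logging each entry returned by `collectExpired(jobs)`.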

It would be great if there were some way to log expired jobs without needing either to patch pg-boss or to set on_complete for all jobs before they run.

Thanks!

timgit commented 1 year ago

This should hopefully be resolved in the next semver major (v10), which will kill off completion jobs in favor of dead letter queues.