timgit / pg-boss

Queueing jobs in Postgres from Node.js like a boss
MIT License

Production only error: Job gets created, but does not get picked up from queue, stays in created state forever #322

Closed sathwik77 closed 2 years ago

sathwik77 commented 2 years ago

I'm creating a job that should run 2 minutes after creation, but it never gets picked up from the queue. It stays in the created state until it expires. This works perfectly fine locally, but fails on our staging server. I checked the DB roles and permissions: since the application can create the schema and add jobs to the job table in the staging DB, I don't think permissions are what prevents the jobs from being picked up and executed.

Here is my configuration:

async function scheduler(doc_url, doc_name, orderId, data, email, name) {
  const PgBoss = require('pg-boss');
  const { username, password, host, port, database } = config;

  // The PgBoss constructor takes a single argument (a connection string or an
  // options object), so all options are passed together in one object.
  const boss = new PgBoss({
    connectionString: `postgres://${username}:${password}@${host}:${port}/${database}`,
    monitorStateIntervalSeconds: 60,
    deleteAfterDays: 1
  });

  boss.on('error', error => {
    console.error(error);
    sendLoggerData(orderId, `pg Boss setup error: ${error}`);
  });

  await boss.start();

  // Each order gets its own queue, named after the order ID.
  const queue = orderId;

  const params = { doc_url, doc_name, orderId, data, email, name };

  // startAfter: 120 delays the job by 120 seconds.
  const jobId = await boss.send(queue, params, { startAfter: 120, retryLimit: 2, onComplete: false, retentionMinutes: 16 });
  sendLoggerData(orderId, `created job in queue ${queue}: ${jobId}`);

  // Register the worker that handles jobs from this queue.
  await boss.work(queue, asyncJobHandler);
  return true;
}
sathwik77 commented 2 years ago

Update: I ran the Docker image locally against the staging DB, and it works. The same Docker image connected to the same staging DB from EKS does not work.

timgit commented 2 years ago

scheduler() appears to own the PgBoss instance, so if you let this function return, the created object will be garbage collected. Consider storing the boss instance in a higher-level variable outside of this function, and think of it more like a connection pool that should be long-lived.
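
One way to read the suggestion above is to create the instance once per process and reuse it on every call. The following is only a minimal sketch under assumptions: createBossProvider, getBoss, and the DATABASE_URL environment variable are hypothetical names that do not appear in the original code, and the handler is passed in rather than taken from the poster's asyncJobHandler.

```javascript
// Hypothetical helper: memoizes a single started instance for the whole process.
// createBoss is any function that returns an object with on() and start().
function createBossProvider(createBoss) {
  let bossPromise = null;
  return function getBoss() {
    if (!bossPromise) {
      bossPromise = (async () => {
        const boss = createBoss();
        boss.on('error', error => console.error(error));
        await boss.start();
        return boss;
      })();
    }
    // Concurrent callers share the same promise, so start() runs only once.
    return bossPromise;
  };
}

// pg-boss is required lazily so the module loads even before the first job.
// DATABASE_URL is an assumed environment variable, not from the thread.
const getBoss = createBossProvider(() => {
  const PgBoss = require('pg-boss');
  return new PgBoss(process.env.DATABASE_URL);
});

// scheduler() now borrows the long-lived instance instead of creating one.
async function scheduler(orderId, params, handler) {
  const boss = await getBoss();
  const jobId = await boss.send(orderId, params, { startAfter: 120, retryLimit: 2 });
  await boss.work(orderId, handler);
  return jobId;
}
```

In this sketch the instance lives in a module-level closure, so it outlives any single scheduler() call. A failed start() stays cached as a rejected promise; a real application would probably want to reset it and retry.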

sathwik77 commented 2 years ago

I moved the pgBoss instance out of scheduler() into a higher-level variable and tried both returning and not returning from the scheduler function, but it still doesn't work. It remains a mystery to me that this works with the local DB and also the RDS staging DB even without these changes. It also works with a Docker image built locally; it only fails when that same locally built image is deployed to EKS.

sathwik77 commented 2 years ago

Finally found the problem. There is nothing wrong with the way I implemented scheduler(); the problem is with PM2 and my log file management. I update a log file just before creating a job with pgBoss and again when the job becomes active. These log file changes were making PM2 restart the process, which kept pgBoss from picking up jobs as expected. I had to add "ignore_watch":["logs"] (my logs directory) to my PM2 processes.json.
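
For anyone hitting the same thing, here is a rough sketch of what such a PM2 configuration might look like, written as a JavaScript ecosystem file rather than the processes.json mentioned above. The app name and script path are placeholders, and watch: true is an assumption based on the restarts described; only the ignore_watch entry for the logs directory comes from this thread.

```javascript
// ecosystem.config.js (sketch; name and script are hypothetical placeholders)
const pm2Config = {
  apps: [
    {
      name: 'scheduler-app',   // placeholder app name
      script: './index.js',    // placeholder entry point
      watch: true,             // assumed: file watching is what triggered the restarts
      ignore_watch: ['logs'],  // do not restart when files under logs/ change
    },
  ],
};

module.exports = pm2Config;
```

With watching enabled and the logs directory inside the watched tree, every log write counts as a file change and triggers a restart, which would explain why delayed jobs never found a running worker.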