timgit / pg-boss

Queueing jobs in Node.js using PostgreSQL like a boss
MIT License

PgBoss does not stop maintenance operations if active workers have not completed by the time the graceful timeout is up #361

Closed stephhuynh18 closed 1 year ago

stephhuynh18 commented 1 year ago

I have been getting the following error when running my test suites. It appears in an unrelated test suite that does not use pgBoss but runs after the test suite that does:

    Cannot use a pool after calling end on the pool

      at BoundPool.connect (../../node_modules/pg-pool/index.js:168:19)
      at BoundPool.query (../../node_modules/pg-pool/index.js:392:10)
      at PgWrapper.executeSql (src/state-management/job-queue.ts:1407:20)
      at Boss.getMaintenanceTime (../../node_modules/pg-boss/src/boss.js:225:38)
      at Timeout._onTimeout (../../node_modules/pg-boss/src/boss.js:81:41)

It seems that when I call stop with a graceful timeout and the timeout is reached before the workers have finished their current tasks, some of the internal workers used by PgBoss are not stopped (e.g. the maintenance worker). I can see this behavior in the code below, found in this repo's index.js. I have added a comment to illustrate this.

    setImmediate(async () => {
      let closing = false

      try {
        while (Date.now() - this.stoppingOn < timeout) {
          if (this.manager.getWipData({ includeInternal: closing }).length === 0) {
            if (closing) {
              break
            }

            closing = true

            await this.boss.stop()
          }

          // this.boss.stop() has not been called here if there are active jobs still running.

          await delay(1000)
        }

        await shutdown()
      } catch (err) {
        this.emit(events.error, err)
      }
    })
  }

I expected that pg-boss would still stop its maintenance operations once the timeout is reached, but this is not the case. Is this the desired behavior?
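
To make the expectation concrete, here is a minimal sketch of the behavior I had in mind. It is not pg-boss's code and not necessarily how a fix would look: gracefulStop, getActiveJobCount, stopMaintenance, shutdown and pollInterval are hypothetical names, and the loop is simplified from the excerpt above (it does not wait for internal jobs to drain).

```javascript
// Hypothetical stand-ins only: none of these names are pg-boss APIs.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// Waits up to `timeout` ms for active jobs to drain, then always stops the
// maintenance worker before the connection pool is shut down.
async function gracefulStop ({ getActiveJobCount, stopMaintenance, shutdown, timeout, pollInterval = 10 }) {
  const stoppingOn = Date.now()
  let maintenanceStopped = false

  while (Date.now() - stoppingOn < timeout) {
    if (getActiveJobCount() === 0) {
      await stopMaintenance()
      maintenanceStopped = true
      break
    }

    await delay(pollInterval)
  }

  // The timeout expired while jobs were still running: stop maintenance anyway,
  // so that no leftover timer queries the pool after shutdown() has closed it.
  if (!maintenanceStopped) {
    await stopMaintenance()
  }

  await shutdown()
}

// Usage: simulate a job that never finishes within the timeout.
async function demo () {
  const calls = []

  await gracefulStop({
    getActiveJobCount: () => 1,
    stopMaintenance: async () => { calls.push('stopMaintenance') },
    shutdown: async () => { calls.push('shutdown') },
    timeout: 50
  })

  return calls
}

demo().then((calls) => console.log(calls.join(' -> ')))
```

In this sketch the maintenance stop runs before shutdown even though the simulated job never completes, which is the ordering that would avoid the "Cannot use a pool after calling end on the pool" error above.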

timgit commented 1 year ago

Good catch! This was resolved in 8.3.1