breejs / bree

Bree is a Node.js and JavaScript job task scheduler with worker threads, cron, Date, and human syntax. Built for @ladjs, @forwardemail, @spamscanner, @cabinjs.
https://jobscheduler.net
MIT License

[fix] Jobs starting slowly #184

Closed maxpain closed 2 years ago

maxpain commented 2 years ago

Describe the bug

Node.js version: 18.4.0

OS version: Container-Optimized OS with containerd (cos_containerd)

Description: It took ~30 seconds to start a Job

Actual behavior

The thread/worker goes online fast, but actual code execution starts after ~30 seconds.


Expected behavior

Code execution to start within a second.

Code to reproduce

/* eslint-disable import/first */
console.log('[start-scheduled-stages] Import modules')

import { startScheduledStages } from '@fastcup/backend-common/tournaments/stages/handle/schedule-stages'
import logger from '@fastcup/backend-common/log'
import { isTournamentAutomationEnabled } from '../utils/index.js'
import { initSentry } from '../utils/sentry.js'

initSentry()

logger.info('[start-scheduled-stages] Start')
if (await isTournamentAutomationEnabled()) {
    await startScheduledStages()
} else {
    logger.warn('Tournament automation is disabled')
}

logger.info('[start-scheduled-stages] done')


shadowgate15 commented 2 years ago

What is the job configuration?

maxpain commented 2 years ago

@shadowgate15

import path from 'node:path'
import { fileURLToPath } from 'node:url'
import Bree from 'bree'
// `breeLogger`, `logger`, and `sentry` are the application's own modules

const jobs = [
    { name: 'sync-tournament-streams', cron: '* * * * *' },
    { name: 'start-scheduled-matches', cron: '* * * * *' },
    { name: 'start-scheduled-rounds', cron: '* * * * *' },
    { name: 'start-scheduled-stages', cron: '* * * * *' },
    { name: 'start-scheduled-tournaments', cron: '* * * * *' },
    { name: 'start-scheduled-stage-check-ins', cron: '* * * * *' },
    { name: 'auto-accept-tournament-matches', cron: '* * * * *' },
    { name: 'cleanup', cron: '* * * * *' },
    { name: 'update-streams', cron: '*/5 * * * *' },
    { name: 'update-top-users', cron: '*/10 * * * *' },
    { name: 'update-friends-online', cron: '* * * * *' },
    { name: 'update-statistic-counters', cron: '* * * * *' },
    { name: 'statistic-minutely', cron: '* * * * *' },
    { name: 'statistic-hourly', cron: '0 * * * *' },
    { name: 'statistic-daily', cron: '0 0 * * *' },
    { name: 'statistic-monthly', cron: '0 0 1 * *' },
    { name: 'statistic-yearly', cron: '0 0 1 1 *' },
    { name: 'gamemoney-checkouts', cron: '* * * * *' },
    { name: 'pay-ladder-prizes', cron: '* * * * *' },
    { name: 'sync-workshop-maps', cron: '* * * * *' },
]

const bree = new Bree({
    root: path.join(path.dirname(fileURLToPath(import.meta.url)), 'jobs'),
    jobs,
    logger: breeLogger,
    closeWorkerAfterMs: 10 * 60 * 1000,
    errorHandler(error) {
        logger.error(error)
        sentry.captureException(error)
    },
})

bree.start()
shadowgate15 commented 2 years ago

I wonder if that is due to the number of workers being created at once; loading all of those files at the same time could cause the delay.

Does it continue to have the delay after the first time the job runs? Also, do all of the jobs have that delay?

maxpain commented 2 years ago

Does it continue to have the delay after the first time the job runs?

Yes, I see consistent delays (15 to 30 seconds) on every launch of every job.

shadowgate15 commented 2 years ago

Does your logger use console? If so, there is a known Node issue with that. Try adding a timestamp to the logger string to check. It could be that the awaits really are taking that long, and because of the issue above the logs are getting backed up and reporting inaccurate times.
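A minimal way to check this, assuming a console-based logger (the `logPrefix` helper below is hypothetical, not from the issue):

```javascript
// Hypothetical helper: prefix each log line with a wall-clock timestamp so a
// real ~30s startup delay can be told apart from console output that is
// merely buffered and flushed late.
function logPrefix(message) {
  return `[${new Date().toISOString()}] ${message}`
}

console.log(logPrefix('[start-scheduled-stages] Import modules'))
```

If the timestamps inside the messages are close together but the lines appear late, the delay is in console output, not in the job itself.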

maxpain commented 2 years ago

It seems the problem is that I have a lot of jobs, each with a lot of imports, and at startup they consume a lot of CPU. The job code itself doesn't consume much CPU, but the Node imports do.


Should I stop using Bree / worker_threads and run all my jobs in the same process?

shadowgate15 commented 2 years ago

Hmm, interesting. One solution could be to use longer-running jobs. Another might be to reduce the size of the imports so they load faster.

maxpain commented 2 years ago

I already use ES Modules, and I don't know how to make my modules load faster. In any case, we depend on npm packages, which can be CommonJS modules and hurt import performance.

maxpain commented 2 years ago

Is there a way to cache imports in worker_threads?

titanism commented 2 years ago

Jobs start slowly due to CPU and/or memory limitations. You may benefit from rewriting your jobs, or from having one long-running job plus a separate job on a schedule (e.g. `* * * * *`) that sends the parent a message saying it's time for another run; the parent listens for this message and then tells the long-running job that it's time to do XYZ again.
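The pattern described above can be sketched as follows. The job names `trigger` and `long-running` are illustrative, and the wiring comments assume a recent Bree where workers are plain `worker_threads` workers kept in the `bree.workers` Map:

```javascript
// Pure routing rule for the parent process: only a tick from the scheduled
// trigger job should wake the long-running worker. Factored out as a plain
// function so the rule is easy to test on its own.
function shouldWakeLongRunning(workerName, message) {
  return workerName === 'trigger' && message === 'tick'
}

// Wiring sketch (illustrative, not run here):
// - trigger job (cron '* * * * *'):
//     import { parentPort } from 'node:worker_threads'
//     parentPort.postMessage('tick')
// - parent process, when it observes a message from a worker:
//     if (shouldWakeLongRunning(name, message)) {
//       bree.workers.get('long-running').postMessage('run')
//     }
// - long-running job: listen on parentPort for 'run' and do another pass,
//   paying the heavy import cost only once at startup.
```

This trades per-run worker startup (and its module-loading cost) for a single long-lived worker that is nudged on a schedule.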

You may benefit from increasing the swap on your server if it is a memory issue. See https://www.digitalocean.com/community/tutorials/how-to-add-swap-space-on-ubuntu-20-04 for more information.

If you are loading large files, e.g. huge JSON objects that you `require` or `import`, you may benefit from having the parent send that payload to children through `workerData`.
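A sketch of that, assuming Bree's per-job `worker` options are forwarded to the `Worker` constructor (the job name reuses one from the thread; the payload shape is illustrative):

```javascript
// Parse the large payload once in the parent, then hand the already-parsed
// object to the worker via workerData instead of re-importing it per run.
const hugeConfig = { tournaments: [] } // stand-in for a large parsed JSON file

const jobs = [
  {
    name: 'start-scheduled-stages',
    cron: '* * * * *',
    // Bree forwards `worker` options to the Worker constructor, so the
    // payload arrives in the job thread as `workerData.hugeConfig`.
    worker: { workerData: { hugeConfig } },
  },
]

// Inside the job file, read it back instead of importing the JSON:
//   import { workerData } from 'node:worker_threads'
//   const { hugeConfig } = workerData
```

Note that `workerData` is structured-cloned into each worker, so this saves parse/import time rather than memory.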

Without seeing the source code of your actual jobs, we're unable to help further.

You can also benefit from using console timers to debug what is causing such CPU- and memory-intensive operations. From your notes it seems you think it is the Node import/require calls. See https://developer.mozilla.org/en-US/docs/Web/API/console/time.
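For instance, a job could bracket its import phase like this (labels are arbitrary; the hrtime measurement is an extra so the elapsed time can also go through the app's own logger):

```javascript
// Time the import phase explicitly. console.time/timeEnd print the elapsed
// time to stderr; hrtime gives a number you can log or act on yourself.
const importStart = process.hrtime.bigint()
console.time('imports')
// ...the job's heavy import()/require() calls would go here...
console.timeEnd('imports')
const importMs = Number(process.hrtime.bigint() - importStart) / 1e6

console.log(`imports took ${importMs.toFixed(1)} ms`)

console.time('job-body')
// ...the job's own logic...
console.timeEnd('job-body')
```

If `imports` accounts for most of the ~30 seconds, that confirms module loading, not the job logic, is the bottleneck.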