mjphaynes / php-resque

php-resque is a Redis-backed PHP library for creating background jobs, placing them on multiple queues, and processing them later.
MIT License

Properly Terminating a Background Job #119

Open ronzyfonzy opened 11 months ago

ronzyfonzy commented 11 months ago

I have a worker that processes multiple queues. Below is a simplified version of my code:

class JobTest extends BgJob
{

    public function perform($args)
    {
        parent::parseArgs($args);
        $this->_failLongRunningJobs("common-queue");
        // execute my job code
    }

    /** @param string $queue */
    private function _failLongRunningJobs($queue): void
    {
        $jobs_max_running_times = [
            "default"            => 7200,
            JobDeviceSync::class => 300,
            JobTest::class       => 60,
        ];

        /** @var \Resque\Job[] $running_jobs */
        $running_jobs = $this->_getQueueJobList($queue, "running");

        foreach ($running_jobs as $job) {
            $job_data = $job->getData();
            $started  = to_int(array_get($job_data, "started", 0));
            if ($started <= 0) {
                continue; // no recorded start time, nothing to measure
            }

            // Use the class-specific limit and fall back to the "default" entry,
            // which the earlier array_key_exists() guard never reached.
            $time_limit     = $jobs_max_running_times[$job->getClass()] ?? $jobs_max_running_times["default"];
            $execution_time = time() - $started;

            if ($execution_time > $time_limit) {
                $e = new Exception(
                    "Detected long running job {$job->getId()}. Running for {$execution_time} seconds."
                );
                // Only marks the job as failed in Redis; the worker's child process keeps running.
                $job->fail($e);
            }
        }
    }
}

I've implemented a method, _failLongRunningJobs(), to detect and fail long-running jobs, but it only updates the job's data in Redis; the worker's child process keeps executing the job.

Is there a recommended way to safely interrupt and terminate a job that's currently being executed?

Thank you so much for the time invested in this project - it is much appreciated!

xelan commented 11 months ago

Hi @ronzyfonzy, thanks for asking your question here. I hope I understood it correctly. Generally, a job may throw a Resque\Exception\Cancel exception to signal that it was deliberately interrupted or cancelled, or, as in your use case, it can be terminated externally. You may need to listen for the Event::JOB_FAILURE event in your job to ensure it stops when $job->fail($e); is called. Watching for long-running jobs from inside a job, as you currently do, doesn't look right to me: I'd say the class that enqueues or manages the jobs should perform these checks.
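To illustrate the first option (the job cancelling itself), here is a minimal sketch of a cooperative time limit. It is deliberately independent of php-resque: JobCancelled is a hypothetical stand-in for Resque\Exception\Cancel, and Deadline and processItems() are names made up for this example.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for Resque\Exception\Cancel so the sketch runs without the library.
class JobCancelled extends RuntimeException
{
}

// Hypothetical helper: holds a time budget and throws once it is used up.
final class Deadline
{
    private float $expiresAt;

    public function __construct(float $seconds)
    {
        $this->expiresAt = microtime(true) + $seconds;
    }

    public function check(): void
    {
        if (microtime(true) > $this->expiresAt) {
            throw new JobCancelled('Time limit exceeded, cancelling job');
        }
    }
}

// Process the work in small steps and check the deadline between steps,
// so the job only ever stops at a point it considers safe.
function processItems(array $items, Deadline $deadline, callable $handle): int
{
    $done = 0;
    foreach ($items as $item) {
        $deadline->check();
        $handle($item);
        $done++;
    }
    return $done;
}
```

Inside perform(), the job would create the Deadline once and throw the library's cancel exception instead of the stand-in. The limit is only enforced between steps, so a single step that blocks for a long time would still need its own timeout.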

How to "safely" interrupt a job depends on the tasks it is performing. Since those can be many different things (e.g. filesystem and network operations), handling a possible interruption is IMHO the responsibility of the job and out of scope for php-resque. Depending on the task, cleanups or other steps may be necessary to stop your job properly.
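As one example of such a cleanup, a job that writes a file could use try/finally so that an interruption never leaves a half-written file behind. This is only a sketch under assumptions: JobInterrupted, exportRows() and the $shouldStop callback are hypothetical names and not part of php-resque.

```php
<?php
declare(strict_types=1);

// Hypothetical exception used by this sketch to signal an interruption.
class JobInterrupted extends RuntimeException
{
}

/**
 * Write rows to $path, one per line. $shouldStop is a hypothetical callback
 * that returns true once the job has been asked to stop.
 */
function exportRows(string $path, array $rows, callable $shouldStop): void
{
    $handle = fopen($path, 'w');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path} for writing");
    }

    $completed = false;
    try {
        foreach ($rows as $row) {
            if ($shouldStop()) {
                throw new JobInterrupted('Stop requested, aborting export');
            }
            fwrite($handle, $row . "\n");
        }
        $completed = true;
    } finally {
        // Runs on success and on interruption alike.
        fclose($handle);
        if (!$completed) {
            unlink($path); // do not leave a half-written file behind
        }
    }
}
```

What the cleanup consists of depends on the task: for network operations it might mean closing connections or rolling back a transaction instead of deleting a file.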

Hope this helps 😊

Best regards, Andreas

ronzyfonzy commented 11 months ago

@xelan thank you for your response. This helps a lot.

Just one additional clarification. Let's say a worker (workerId=1) is performing a job (jobId=1), and another worker (workerId=2) starts executing a second job (jobId=2) that cancels the first one (jobId=1). Can I create a listener so that jobId=1 receives the cancellation? Or should workerId=1 have that implementation?

Thank you in advance, Robert
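
One possible shape for the two-worker scenario, offered as a sketch and not as the library's mechanism: callbacks registered in one PHP process live in that process's memory, so a listener in workerId=1 would not normally be triggered by code running in workerId=2. A common alternative is a cancellation flag in storage that both workers can reach, which jobId=1 polls between steps. CancelStore, ArrayCancelStore and runSteps() below are hypothetical names; the in-memory store only stands in for shared storage such as a Redis key per job.

```php
<?php
declare(strict_types=1);

// Hypothetical abstraction over a store that every worker can reach.
interface CancelStore
{
    public function requestCancel(string $jobId): void;

    public function isCancelRequested(string $jobId): bool;
}

// In-memory stand-in so the sketch runs on its own. Across real worker
// processes this would need shared storage, for example a Redis key per job.
final class ArrayCancelStore implements CancelStore
{
    /** @var array<string, bool> */
    private array $flags = [];

    public function requestCancel(string $jobId): void
    {
        $this->flags[$jobId] = true;
    }

    public function isCancelRequested(string $jobId): bool
    {
        return $this->flags[$jobId] ?? false;
    }
}

// Job 1: does its work step by step and polls the flag between steps.
// Returns the number of steps completed before it finished or was cancelled.
function runSteps(string $jobId, array $steps, CancelStore $store): int
{
    $done = 0;
    foreach ($steps as $step) {
        if ($store->isCancelRequested($jobId)) {
            break; // stop at a safe point; a real job might throw a cancel exception here
        }
        $step();
        $done++;
    }
    return $done;
}
```

In this arrangement jobId=2 only calls requestCancel('job-1'); the decision about when and how to stop stays with jobId=1, which matches the point above that handling an interruption is the job's own responsibility.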