humanmade / Cavalcade

A better wp-cron. Horizontally scalable, works perfectly with multisite.
https://engineering.hmn.md/projects/cavalcade/
Other
517 stars 46 forks source link

Add `stdout` & `stderr` logging for jobs #24

Closed dd32 closed 3 years ago

dd32 commented 7 years ago

Currently the only way to get job failure data is to look at the main cavalcade output logs, this is useful when Cavalcade is only running on a single server, or a bunch of servers with a remote syslog, but when you need to find the failure data for a job that was run on one of many different servers it requires digging through multiple server logs to find the actual data.

It'd be ideal if instead of/in addition to logging Failed to shutdown worker. that the stdout and stderr streams were logged to the database.

WordPress.org currently runs Cavalcade with a modified Runner::check_workers() so that we log the server on which the job ran, we could add this functionality ourselves but Job->output and Job->error_output are both protected fields which prevents this. Unfortunately there isn't an easy way to override the Worker class so we can't simply extend it as we've done with the Runner (And if we wanted to put stderr/stdout into their own field in the table, rather than a blob of data into content we'd also need to override Logger which again, isn't easily possible).

rmccue commented 7 years ago

We should make the relevant pieces overridable here, and ensure extensibility is supported in a core way. I'm not sure if it makes sense to log the full stdout/stderr however; the Cavalcade logs table is already pretty large. Potentially if we only store it for errors, it may be more feasible.

dd32 commented 7 years ago

On WordPress.org, the relevant section of the runner now looks like this:

if ( ! $worker->shutdown() ) {
    $worker->job->mark_failed();
    $logger->log_job_failed( $worker->job, json_encode( array_filter( array(
        'server'   => php_uname('n'),
        'error'    => 'Failed to shutdown worker.',
        'stdout'   => $worker->output,
        'stderr'   => $worker->error_output
    ) ) ) );
} else {
    $worker->job->mark_completed();
    $logger->log_job_completed( $worker->job, json_encode( array_filter( array(
        'server'   => php_uname('n'),
        'stdout'   => $worker->output,
        'stderr'   => $worker->error_output
    ) ) ) );
}

In order to do that, I added a custom alteration to Job to alter the visibility of the output/error_output properties to public.

You're right though, this table can grow in size very quickly, and we're likely going to be adding a job to purge older log data.

vishaldodiya commented 7 years ago

@dd32 When do we get 'Failed to shutdown worker' error?

I have very long mysql updation process. which continuously update data in data base ( it is just one cron from which updation process are lopped ) . Is there any time limit for scheduled process. It got failed half an hour later, when the process started.