Open roborourke opened 1 year ago
Suggestion from @kovshenin
It is a good compromise, though I'd be careful returning true for all events. Instead maybe have an explicit list of events that you expect to be scheduled at all times, and let the checks for those only run within wp-admin.
This is also a good opportunity to improve our scheduled tasks docs, specifically suggest ways how to add single and recurring tasks, how and when to check wp_next_scheduled
and why it's better done during altis.migrate
instead of init
.
By default in altis/cloud we can short-circuit the core https check event, the privacy events, etc. that we know are most likely to already be scheduled. We can also try and move them to altis.migrate
if that makes more sense.
Implementing this filter from application code as described here doesn't seem to have any effect on whether Cavalcade queries are executed on the frontend, for me. In fact, returning either true
or false
seems instead to guarantee that there will also be a call to Job->save()
which triggers additional queries.
It seems like possibly Cavalcade runs so early, that I am not able to set filters from an mu-plugin which actually run during the cavalcade job lookup process?
My assumption was incorrect, but returning true
causes an error because cron.php
checks $returned_event->timestampif it is not null or
false`, so the only way I've reliably disabled an event looks like this:
/**
* Skip cavalcade scheduling on high-traffic frontend pages.
*
* @param null|false|object $pre Value to return instead. Default null to continue retrieving the event.
* @param string $hook Action hook of the event.
* @param array $args Array containing each separate argument to pass to the hook's callback function.
* @return null|false|object Returning a non-null value will short-circuit the normal process.
*/
function mytheme_skip_cavalcade_scheduled_event_on_high_traffic_pages( $pre, $hook, $args ) {
if ( is_admin() ) {
// Do not alter processing of events within the admin.
return $pre;
}
// Most of the events we want to skip run before any globals are set, so
// it is not straightforward to skip these only on is_single() or is_home().
// Skip relevant jobs on all frontend pages, for now.
$is_skippable_hook = in_array(
$hook,
[
'wp_https_detection',
'extendable-aggregator-sync-cron',
'wp_site_health_scheduled_check',
'wp_privacy_delete_old_export_files',
],
true
);
if ( $is_skippable_hook ) {
$fake_scheduled_event = new stdClass();
// Perpetually one hour in the future to defer any queries.
$fake_scheduled_event->timestamp = time() + HOUR_IN_SECONDS;
return $fake_scheduled_event;
}
return $pre;
}
add_filter( 'pre_get_scheduled_event', 'mytheme_skip_cavalcade_scheduled_event_on_high_traffic_pages', 10, 3 );
init
cannot be used as the sole restriction because the health check initiator runs before init.
Can we just return false?
I this is the only way would you mind updating the PR #713 ?
@roborourke if we return false
, Cavalcade treats the job as unscheduled, and does us the "favor" of querying to set it. That pumps additional jobs into the table and causes an INSERT on every page view.
If $pre
is not null then something else has already modified it, likely at an earlier priority and I think we should respect that and return early, I was thinking if ( is_admin() || ! is_null( $pre ) ) {
Can we just return false?
Returning false will also short-circuit the filter because there is an explicit check for null:
@kovshenin See above—it short-circuits the filter, but then the plugin treats that as an opportunity to "fix" the situation by setting the missing job. That results in more queries, which is not the goal.
Shortcircuiting is the point though? Cavalcade should be respecting any previous modifications to the value on that filter.
Sorry I should have been a bit more clear. Short-circuit with a false at the end tells Cavalcade that the none of the jobs exist, which is what we don't want, as demonstrated by @kadamwhite's screenshot. We only want to short-circuit for the jobs that we know, i.e. $is_skippable_hook
, and allow Cavalcade to proceed with the checks to fetch any remaining jobs from the DB, and add them to the schedule if absent.
@kovshenin The change I made which resulted in additional queries was to return false
where I am otherwise returning the fake scheduled notice, above.
if ( $is_skippable_hook ) {
return false;
}
I wasn't short-circuiting at the end, only for the jobs that we know. However, returning false
causes WP to treat the job as not scheduled — which, as I've attempted to illustrate, means that the health check then schedules the event, unnecessarily.
if ( ! wp_next_scheduled( 'wp_site_health_scheduled_check' ) && ! wp_installing() ) {
wp_schedule_event( time() + DAY_IN_SECONDS, 'weekly', 'wp_site_health_scheduled_check' );
}
Returning false
doesn't skip processing of that hook, it instead notifies that the event must be scheduled. That results in more queries overall. The only way to skip both looking up, and scheduling an event, is to return an object with a non-falsy timestamp
property so that each of the functions which check for a scheduled event will decide no further action is needed.
Ah, apologies, I thought @roborourke's comment was referring to that last return $pre;
. Returning an object with a timestamp sounds correct.
Per this issue on the Cavalcade repo when db latency is a factor the common pattern of checking for and scheduling cron jobs is often done on
init
. WP core does this as well many 3rd party plugins etc...In theory we could avoid this altogether using the preflight filter to limit these checks to the admin only, provided they are added during the
init
hook:Events will still be scheduled but on admin requests only.
This would cause issues if plugins are adding events on the frontend using the
init
hook based on user interactions and$_GET
or$_POST
parameters, although that would be far more likely dealt with not on theinit
hook.This approach seems safe to me, especially if we also check that
$_POST
is empty. Would be great to get some additional input and feedback though.