humanmade / Cavalcade

A better wp-cron. Horizontally scalable, works perfectly with multisite.
https://engineering.hmn.md/projects/cavalcade/

Why does Cavalcade do so much on every page load? #116

Closed by kadamwhite 1 year ago

kadamwhite commented 1 year ago

My understanding of Cavalcade's architectural model is that it was intended to remove the dependency on page-view-triggered behavior within WP-Cron. However, I'm seeing that a significant percentage of the processing needed to render any page view on an Altis site is spent in Cavalcade-related functions, DB calls, and cache interactions. Why does Cavalcade do so much on every page view?

This is a typical backtrace from within the wp-redis wp_cache_set function when Cavalcade is firing:

 file                                                       function
---------                                                   --------
 vendor/humanmade/cavalcade/inc/class-job.php#131           wp_cache_set( "job::933984", HM\Cavalcade\Plugin\Job, "cavalcade-jobs" )
                                                            HM\Cavalcade\Plugin\Job->to_instance( stdClass )
 vendor/humanmade/cavalcade/inc/class-job.php#142           array_map( [string, string], [object] )
 vendor/humanmade/cavalcade/inc/class-job.php#377           HM\Cavalcade\Plugin\Job->to_instances( [object] )
 vendor/humanmade/cavalcade/inc/connector/namespace.php#320 HM\Cavalcade\Plugin\Job->get_jobs_by_query( [NULL, string, array, integer, ...] )
 wordpress/wp-includes/class-wp-hook.php#303                HM\Cavalcade\Plugin\Connector\pre_get_scheduled_event( null, "altis.analytics.long...", [], null )
 wordpress/wp-includes/plugin.php#189                       WP_Hook->apply_filters( null, [NULL, string, array, NULL] )
 wordpress/wp-includes/cron.php#748                         apply_filters( "pre_get_scheduled_ev...", null, "altis.analytics.long...", [], null )
 wordpress/wp-includes/cron.php#809                         wp_get_scheduled_event( "altis.analytics.long...", [] )
 vendor/altis/aws-analytics/inc/namespace.php#75            wp_next_scheduled( "altis.analytics.long..." )
 wordpress/wp-includes/class-wp-hook.php#303                Altis\Analytics\schedule_events( "" )
 wordpress/wp-includes/class-wp-hook.php#327                WP_Hook->apply_filters( null, [string] )
 wordpress/wp-includes/plugin.php#470                       WP_Hook->do_action( [string] )
 wordpress/wp-settings.php#578                              do_action( "init" )
 wp-config.php#152                                          require_once( "/usr/src/app/wordpre..." )
 wordpress/wp-load.php#55                                   require_once( "/usr/src/app/wp-conf..." )
 wordpress/wp-blog-header.php#13                            require_once( "/usr/src/app/wordpre..." )
 index.php#21                                               require( "/usr/src/app/wordpre..." )

In my local environment, 19 wp_cache_set calls are triggered on every page load. My mental model of how Cavalcade works may well be flawed, but querying for and then re-setting all of this data on every single request feels inefficient.

I have not fully reproduced this in a deployed environment, but can do so if needed.

kadamwhite commented 1 year ago

@kovshenin has noted that the cavalcade-jobs group should be non-persistent in Redis (https://github.com/humanmade/Cavalcade/blob/master/inc/namespace.php#L31), so theoretically these set commands won't make it through to Redis and will only be held in memory. So this isn't triggering as much cache writing as I expected. My question remains, however: why are we querying for and processing all of this information in every request? Why does every request on the frontend need to have Cavalcade data loaded?
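To make the non-persistent behaviour concrete, here is a rough Python model of the idea. It is not wp-redis or Cavalcade code: the `ObjectCache` class, its fields, and the dict standing in for Redis are all invented for illustration, and the only details taken from this thread are the `cavalcade-jobs` group name and the count of 19 sets.

```python
class ObjectCache:
    """Toy object cache with non-persistent groups (hypothetical model)."""

    def __init__(self, backend, non_persistent_groups=()):
        self.backend = backend          # stands in for Redis; a plain dict here
        self.local = {}                 # in-memory store for the current request
        self.non_persistent = set(non_persistent_groups)
        self.backend_writes = 0         # how many sets reached the backend

    def set(self, key, value, group="default"):
        self.local[(group, key)] = value
        if group in self.non_persistent:
            return                      # stays in this process only
        self.backend[(group, key)] = value
        self.backend_writes += 1

    def get(self, key, group="default"):
        if (group, key) in self.local:
            return self.local[(group, key)]
        if group in self.non_persistent:
            return None                 # never look in the backend for these
        return self.backend.get((group, key))


redis = {}
cache = ObjectCache(redis, non_persistent_groups=["cavalcade-jobs"])

# Simulate the 19 cache sets observed during one page load.
for job_id in range(19):
    cache.set(f"job::{job_id}", {"id": job_id}, group="cavalcade-jobs")

print(cache.backend_writes)  # sets that reached the stand-in for Redis
```

Under this model the sets cost only local memory, but a fresh `ObjectCache` for the next request starts empty for that group, so the jobs still have to be queried again, which is the part the question above is about.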

kovshenin commented 1 year ago

> Why does every request on the frontend need to have Cavalcade data loaded?

I agree. However, I think this happens because altis/aws-analytics is perhaps trying to make sure its events exist on front-facing requests, rather than in wp-admin, on plugin activation, or on another low-frequency request. Cavalcade simply obeys by querying its jobs to check whether the event is scheduled:

https://github.com/humanmade/aws-analytics/blob/aa5763c1df8e13ecb457b9db9a8deddded8ccd63/inc/namespace.php#L81-L89
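As a rough illustration of why the trigger matters, the following Python sketch counts scheduler lookups under two policies. Everything in it is hypothetical: `Scheduler`, `ensure_scheduled`, `simulate`, the `example.hook` name, and the traffic mix are made up for the example and do not reflect the actual aws-analytics or Cavalcade code.

```python
class Scheduler:
    """Toy stand-in for the cron backend that counts lookups (hypothetical)."""

    def __init__(self):
        self.events = set()
        self.lookups = 0

    def next_scheduled(self, hook):
        self.lookups += 1               # each check costs one backend query
        return hook in self.events

    def schedule(self, hook):
        self.events.add(hook)


def ensure_scheduled(scheduler, hook):
    """Schedule the hook only if it is not already scheduled."""
    if not scheduler.next_scheduled(hook):
        scheduler.schedule(hook)


def simulate(requests, check_on):
    """Return the number of lookups when only `check_on` request kinds check."""
    scheduler = Scheduler()
    for kind in requests:
        if kind in check_on:
            ensure_scheduled(scheduler, "example.hook")
    return scheduler.lookups


# Assumed traffic mix: mostly front-end requests, a few admin requests.
traffic = ["front"] * 100 + ["admin"] * 2

every_request = simulate(traffic, check_on={"front", "admin"})
admin_only = simulate(traffic, check_on={"admin"})
print(every_request, admin_only)
```

In both cases the event ends up scheduled after the first check; the difference is only how many requests pay for the lookup.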

kadamwhite commented 1 year ago

😬 @kovshenin Thank you, I had entirely overlooked that entry in the trace. I have opened humanmade/aws-analytics#490 and will move the discussion over there.