WordPress / performance

Performance plugin from the WordPress Performance Group, which is a collection of standalone performance modules.
https://wordpress.org/plugins/performance-lab/
GNU General Public License v2.0
354 stars 95 forks source link

Internal state of the page (e.g. queried posts or queried object) should factor into whether a stored URL metric is fresh #1466

Open westonruter opened 1 month ago

westonruter commented 1 month ago

This was originally brought up in https://github.com/WordPress/performance/pull/878#discussion_r1393132794.

There is a TODO currently in od_get_normalized_query_vars():

https://github.com/WordPress/performance/blob/73f223598df758e507dcfd14e09681e156a850d2/plugins/optimization-detective/storage/data.php#L59

Consider a homepage that shows 10 posts. None of the posts have featured images. URL metrics have been fully populated for the page in this state. Then someone publishes a new post that includes a featured image. All of a sudden there is a new image on the page which may be LCP, and yet the URL metrics have no record of this element. Nevertheless, by default the URL metrics are considered fresh by default for 1 day, which means it could take a day for that new image to start getting optimized.

One way to deal with this would be to include the IDs of the posts in the query among the normalized query vars. This could similarly handle the case where the static front page is changed. For example, in od_get_normalized_query_vars():

if ( is_singular() ) {
    $normalized_query_vars['queried_object_id'] = get_queried_object_id();
} else {
    $normalized_query_vars['queried_post_ids'] = join( ',', wp_list_pick( $GLOBALS['wp_query']->posts, 'ID' ) );
}

When the query vars change, this results in an entirely separate od_url_metrics post being used, as the normalized query vars are passed into od_get_url_metrics_slug() to obtain a MD5 hash which is used as the post_name.

Nevertheless, this seems somewhat of an abuse of the normalized query vars here, since these new "query vars" aren't actually part of the query at all. In reality, a new URL metrics post shouldn't be created, as it should instead update the existing post. If it creates a new post, then the old one will stick around until it is garbage-collected after a month.

Instead of creating a new URL metrics post entirely, perhaps there should be some kind of "ETag" which is stored with each URL metric. This also is proposed in #1424 where the list of registered tag visitors should factor into the composition of the ETag for a URL metric so that when a new tag visitor is registered (e.g. after a plugin is activated), then all URL metrics would immediately be considered stale so that new ones can be collected.

This is also related to another TODO:

https://github.com/WordPress/performance/blob/73f223598df758e507dcfd14e09681e156a850d2/plugins/optimization-detective/optimization.php#L244

Instead of having to manually collect the various parts of page state that may affect the elements stored in URL metrics, we could instead know by time we finish iterating over the document whether the elements we saw correspond to the elements which were stored in the most recent URL metric. If not, then the URL metric should be considered stale and new URL metrics should be collected. But there is a bit of a chicken-and-egg problem here because the REST API endpoint will reject any requests when the URL metrics are fresh. Since the REST API endpoint doesn't know the result of iterating over the page during optimization, it doesn't know that new URL metrics should in fact be allowed. Also, if the URL metrics were all fresh then the requisite data-od-xpath attributes aren't added to the output for detect.js to use when collecting the URL metrics. So once it gets to the end of iterating over all the tags, it would have to go back over them from the beginning and add these attributes. Or it could add them unconditionally on the first pass, and then strip them all out when not needed during a second pass (but this seems not ideal to have to do with every response). This is opposed to the current approach to conditionally add them:

https://github.com/WordPress/performance/blob/73f223598df758e507dcfd14e09681e156a850d2/plugins/optimization-detective/optimization.php#L229-L231

Several things to ruminate on here.

westonruter commented 1 month ago

The current template that was selected should also be considered as part of the ETag. If a theme all of a sudden is updated so that there is a category.php template file that overrides the generic category.php then likely this should result in the URL metrics being considered stale.