WordPress / performance

Performance plugin from the WordPress Performance Group, which is a collection of standalone performance modules.
https://wordpress.org/plugins/performance-lab/
GNU General Public License v2.0
346 stars 94 forks source link

Unclear purpose for tag visitors returning true or false in Optimization Detective #1342

Closed westonruter closed 1 week ago

westonruter commented 1 month ago

When registering a tag visitor in Optimization Detective, the callable must return a boolean. When the boolean returns true, then this is an indication that the tag was indeed "visited" or rather that the visitor is relevant to the current tag. For example:

add_action( 'od_register_tag_visitors', static function ( OD_Tag_Visitor_Registry $registry ) {
    $registry->register( 'async_images', static function ( OD_Tag_Visitor_Context $context ) {
        if ( $context->processor->get_tag() !== 'IMG' ) {
            return false;
        }
        if ( $context->processor->get_attribute( 'decoding' ) !== 'async' ) {
            $context->processor->set_attribute( 'decoding', 'async' );
        }
        return true;
    } );
} );

Clearly this visitor should return false if the current tag is not an IMG. But should it still also return false if it already has a decoding=async attribute?

In reality, when a visitor returns true this is a signal for whether or not the tag should be captured to be stored in the URL Metrics:

https://github.com/WordPress/performance/blob/4ff08f98a6e709e03068e20c5560599e98cf4afd/plugins/optimization-detective/optimization.php#L180-L192

So when a visitor returns true then this means that $did_visit will be true and if the URL Metrics for current URL are not complete (if $needs_detection is false), then the effect is that it will add the data-od-xpath attribute to the tag. What does this do? It means that when detect.js runs, it will include the element among all of the elements for which it captures metrics, including via IntersectionObserver and web-vitals.js:

https://github.com/WordPress/performance/blob/726f10c652628a0c0e577ffd2a39395913047c20/plugins/optimization-detective/detect.js#L242

https://github.com/WordPress/performance/blob/726f10c652628a0c0e577ffd2a39395913047c20/plugins/optimization-detective/detect.js#L285-L287

However, in the case above for a tag visitor that just makes sure that all img tags have decoding=async, there is no need for this tag to ever be captured in the URL metrics because the URL metrics do not impact how the applied optimization in any way. This is in contrast with what Image Prioritizer does with applying lazy-loading and fetchpriority=high, where it depends on the URL Metrics to know what optimization to perform. Embed Optimizer also depends on URL Metrics to know whether an embed should be lazy-loaded or instead should have preconnect links added. Conversely, the auto-sizes integration in #1322 doesn't depend on the URL Metrics and so at present its tag visitor could always return false.

In short, I think the explanation of what the boolean means being returned from a tag visitor. It's not whether the tag was visited, but rather whether or not the tag needs to be tracked in the URL Metrics.

It could also be possible to automatically detect whether or not a tag visitor depends on URL metrics by detecting whether or not it accessed the $context->url_metrics_group_collection. If so, then this would be a signal that indeed the tag should be tracked in the URL metrics, and the return value of the tag visitor could be eliminated entirely. However, this could be a bit too sophisticated.

adamsilverstein commented 1 month ago

Clearly this visitor should return false if the current tag is not an IMG. But should it still also return false if it already has a decoding=async attribute?

All core output images will have decoding=async by default.

In short, I think the explanation of what the boolean means being returned from a tag visitor. It's not whether the tag was visited, but rather whether or not the tag needs to be tracked in the URL Metrics.

That makes sense.

westonruter commented 1 month ago

All core output images will have decoding=async by default.

Yes, this is just an example.

joemcgill commented 1 month ago

In short, I think the explanation of what the boolean means being returned from a tag visitor. It's not whether the tag was visited, but rather whether or not the tag needs to be tracked in the URL Metrics.

What determines whether the tag needs to be tracked in the URL Metrics? If any modifications have been made by the visitor at all, or are there specific types of changes that need to return true?

In other words, would we want to return true in your example if decoding=async was added by the visitor?

westonruter commented 1 month ago

The tag would need to be tracked in URL metrics if the tag visitor needs to obtain any metrics for the element in the URL metrics. In the example case here with decoding=async, there's no need to return true because it doesn't need to be captured in URL metrics since the visitor doesn't apply any conditional optimizations. This is in contrast, for example, with the Embed Optimizer which does conditional optimizations based on whether the embed is in the initial viewport:

https://github.com/WordPress/performance/blob/0ffc4149524d03d7165ab6a20f698a73ef73be99/plugins/embed-optimizer/class-embed-optimizer-tag-visitor.php#L37-L120

westonruter commented 1 month ago

So basically, if a tag visitor accesses $context->url_metrics_group_collection then it should return true.

westonruter commented 1 month ago

So that's why I mentioned this:

It could also be possible to automatically detect whether or not a tag visitor depends on URL metrics by detecting whether or not it accessed the $context->url_metrics_group_collection. If so, then this would be a signal that indeed the tag should be tracked in the URL metrics, and the return value of the tag visitor could be eliminated entirely. However, this could be a bit too sophisticated.

So we could detect whether a tag visitor accesses $context->url_metrics_group_collection, or we could have an explicit method on the $context object like record_element(), track_tag(), observe(), etc. which would do this explicitly.

joemcgill commented 1 month ago

I like the idea of making this behavior explicit.

I don't fully understand the intended semantics of the current API, but I wonder if we need to modify the API for registering visitors to include an signal (e.g., property, arg, option, etc) whether it uses URL metrics? That way we could separate the return value indicating whether the visitor optimization was performed from whether the tag needs to be tracked for user metrics.