WordPress / performance

Performance plugin from the WordPress Performance Group, which is a collection of standalone performance modules.
https://wordpress.org/plugins/performance-lab/
GNU General Public License v2.0
Offloading Google Analytics (gtag) to a Web Worker #1455

Open westonruter opened 3 months ago

westonruter commented 3 months ago

Feature Description

The Web Worker Offloading plugin is now merged into trunk (https://github.com/WordPress/performance/pull/1247). However, for it to be utilized a plugin author has to manually make the web-worker-offloading script a dependency of the script intended to be offloaded. As mentioned in https://github.com/WordPress/performance/pull/1247#issuecomment-2249065318, we should look for opportunities to offload existing scripts automatically as much as possible. From Partytown's Common Services page, they list out 3rd party services known to be compatible. Included on that list is Google Tag Manager, which I've found to be responsible for INP issues on highly-trafficked sites. Partytown has documentation for how to integrate with Google Tag Manager.

The challenge for automatically implementing the Partytown integration with Google Tag Manager is that the plugins (and themes) which add gtag do so in inconsistent ways. The scripts could be registered properly via wp_enqueue_script() but the handles won't be consistent. Or a plugin may manually print the script at wp_head.

In the case where a plugin properly enqueues the script, we can look at WPdirectory for the most common script handles used. Otherwise, for manually-printed scripts the alternative would be for Web Worker Offloading to integrate with Optimization Detective to register a tag visitor that looks for a SCRIPT tag with a src pointing to https://www.googletagmanager.com/gtag/js, and when present, inject the Partytown script (if not already present) and update the SCRIPT tag attributes as necessary.

In looking at WPdirectory for plugins with at least 100K installs, the script handles used are:

| Plugin | Handle | Install Count | Source |
|---|---|---|---|
| WooCommerce (Google Analytics for WooCommerce) | google-tag-manager | 5,000,000 (200,000) | Trac (Trac) |
| Site Kit by Google | google_gtagjs | 4,000,000 | Trac |
| Rank Math SEO | google_gtagjs | 2,000,000 | Trac |

These are not using GTM:

image

These add up to 11.2 million installs.

The following plugins print the script without using WP_Scripts:

| Plugin | Install Count |
|---|---|
| Jetpack (legacy module) | 4,000,000 |
| Complianz – GDPR/CCPA Cookie Consent | 900,000 |
| Google Listings & Ads | 900,000 |
| LightStart | 700,000 |
| GA Google Analytics | 600,000 |
| SEOPress | 300,000 |
| Blocksy Companion | 200,000 |
| WooCommerce Checkout & Funnel Builder by CartFlows | 200,000 |
| Orbit Fox by ThemeIsle | 200,000 |
| CTX Feed | 100,000 |
| SEO Plugin by Squirrly SEO | 200,000 |
| VK All in One Expansion Unit | 100,000 |

Excluding the Jetpack legacy module, these total up to 4.4 million installs.

So by starting out just targeting scripts registered via WP_Scripts, we can handle ~72% of the installs.
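As a quick check of that figure, the arithmetic can be reproduced from the install counts in the two tables above. This is only a sketch of the calculation; the variable names are made up for illustration, and the counts are the rounded values listed in this issue.

```php
<?php
// Install counts in thousands, copied from the tables above.
// Enqueued via WP_Scripts: WooCommerce, Google Analytics for WooCommerce, Site Kit, Rank Math.
$enqueued = array( 5000, 200, 4000, 2000 );
// Printed manually, with the Jetpack legacy module excluded.
$printed = array( 900, 900, 700, 600, 300, 200, 200, 200, 100, 200, 100 );

$enqueued_total = array_sum( $enqueued ); // 11,200 thousand = 11.2 million.
$printed_total  = array_sum( $printed );  // 4,400 thousand = 4.4 million.
$share          = $enqueued_total / ( $enqueued_total + $printed_total );

printf( "%.1f%%\n", $share * 100 ); // Prints 71.8%, i.e. roughly 72%.
```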

So the only script handles used are google-tag-manager and google_gtagjs. However, we could discover other script handles by looping over wp_scripts()->registered to find any other dependencies with a src beginning with https://www.googletagmanager.com/gtag/js at runtime. We can then add web-worker-offloading as a dependency for this registered script.
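A minimal sketch of that runtime discovery might look like the following. The function names (`wwo_sketch_*`) are hypothetical, and the choice of the `wp_print_scripts` action is an assumption; the hook an actual implementation should use (and whether footer scripts are registered by then) would need testing. The core logic is kept free of WordPress calls so it can be exercised on its own.

```php
<?php
/**
 * Finds handles of registered scripts whose src begins with the gtag URL.
 *
 * @param array $registered Map of handle to dependency object, shaped like wp_scripts()->registered.
 * @return string[] Matching handles.
 */
function wwo_sketch_find_gtag_handles( array $registered ): array {
	$prefix  = 'https://www.googletagmanager.com/gtag/js';
	$handles = array();
	foreach ( $registered as $handle => $dependency ) {
		$src = $dependency->src ?? null;
		// A src may be false for alias handles, so only strings are compared.
		if ( is_string( $src ) && 0 === strpos( $src, $prefix ) ) {
			$handles[] = (string) $handle;
		}
	}
	return $handles;
}

/**
 * Adds web-worker-offloading as a dependency of every discovered gtag script.
 *
 * @param array $registered Map of handle to dependency object.
 */
function wwo_sketch_add_worker_dependency( array $registered ): void {
	foreach ( wwo_sketch_find_gtag_handles( $registered ) as $handle ) {
		$dependency = $registered[ $handle ];
		$deps       = $dependency->deps ?? array();
		if ( ! in_array( 'web-worker-offloading', $deps, true ) ) {
			$deps[]           = 'web-worker-offloading';
			$dependency->deps = $deps;
		}
	}
}

// Assumption: wp_print_scripts is late enough for the relevant scripts to be registered.
if ( function_exists( 'add_action' ) ) {
	add_action(
		'wp_print_scripts',
		static function (): void {
			wwo_sketch_add_worker_dependency( wp_scripts()->registered );
		}
	);
}
```

Matching on the URL prefix rather than on a fixed list of handles means a src with a query string such as `?id=...` is still caught, regardless of which handle the plugin chose.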

However, adding the script dependency is half of what we need to do. We also need to make sure that the inline script also gets the text/partytown type:

  <script type="text/partytown">
    window.dataLayer = window.dataLayer || [];
    window.gtag = function gtag(){dataLayer.push(arguments);}
    gtag('js', new Date());

    gtag('config', 'YOUR-ID-HERE');
  </script>

WooCommerce adds this as an inline after script. As does Site Kit. It seems that "Google Analytics for WooCommerce" adds another script with its own inline script, so it may not be as straightforward to offload to a worker.

WC_Google_Gtag_JS::enquque_tracker() [sic] method in woocommerce-google-analytics-integration/includes/class-wc-google-gtag-js.php

```php
public function enquque_tracker(): void {
	wp_enqueue_script(
		'google-tag-manager',
		'https://www.googletagmanager.com/gtag/js?id=' . self::get( 'ga_id' ),
		array(),
		null,
		array(
			'strategy' => 'async',
		)
	);
	// tracker.js needs to be executed ASAP, the remaining bits for main.js could be deffered,
	// but to reduce the traffic, we ship it all together.
	wp_enqueue_script(
		$this->script_handle,
		Plugin::get_instance()->get_js_asset_url( 'main.js' ),
		array(
			...Plugin::get_instance()->get_js_asset_dependencies( 'main' ),
			'google-tag-manager',
		),
		Plugin::get_instance()->get_js_asset_version( 'main' ),
		true
	);
	// Provide tracker's configuration.
	wp_add_inline_script(
		$this->script_handle,
		sprintf(
			'var wcgai = {config: %s};',
			wp_json_encode( $this->get_analytics_config() )
		),
		'before'
	);
}
```

But where WooCommerce and Site Kit do things in a more straightforward way, we can add type="text/partytown" to the inline script by filtering wp_inline_script_attributes. For example:

add_filter( 'wp_inline_script_attributes', static function ( $attributes, $data ) {
    // Only retype the inline "after" scripts of the two known gtag handles.
    if (
        array_key_exists( 'id', $attributes ) &&
        in_array( $attributes['id'], array( 'google-tag-manager-js-after', 'google_gtagjs-js-after' ), true ) &&
        str_contains( $data, 'dataLayer' )
    ) {
        $attributes['type'] = 'text/partytown';
    }
    return $attributes;
}, 10, 2 ); // Two accepted args are needed so that $data is passed to the callback.

westonruter commented 3 months ago

This integration wouldn't make sense for core merge, so it should be put into a separate file, perhaps in an integrations directory. If the functionality is integrated into core, then the plugins would be expected to make the necessary changes themselves (which should be less than is currently required to shim the integrations, especially for the inline scripts).

gutobenn commented 3 months ago

@westonruter Nice!

There's also the GTM4WP - A Google Tag Manager for WordPress plugin, which has 700,000 installs.

westonruter commented 3 months ago

I just realized that the inline scripts will be problematic for the current implementation of Web Worker Offloading, since it automatically blocks offloading to a worker when there are any inline after scripts (or, importantly, any blocking dependent scripts). It does this indirectly by setting a strategy so that core's private filter_eligible_strategies method will remove the async or defer attribute when an inline after script is present.

Given that we do actually need to opt in scripts with after scripts to be offloaded to the worker, I wonder if this was the right way to go. Should the code instead be replaced with logic that focuses on setting the type="text/partytown" attribute on the inline after script of any script that has a web-worker-offloading dependency?
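To make that alternative concrete, here is a rough sketch of what such logic could look like. It is not the plugin's implementation: the `wwo_sketch_*` function name is hypothetical, and the sketch relies on the `{handle}-js-after` ID pattern seen in the filter example earlier in this issue. It only covers the inline script's type attribute and leaves the strategy handling described above untouched.

```php
<?php
/**
 * Sets type="text/partytown" on an inline "after" script when its parent
 * script depends on web-worker-offloading.
 *
 * @param array $attributes     Inline script attributes, including 'id'.
 * @param array $deps_by_handle Map of script handle to its list of dependency handles.
 * @return array Possibly modified attributes.
 */
function wwo_sketch_filter_inline_attributes( array $attributes, array $deps_by_handle ): array {
	$id = $attributes['id'] ?? '';
	// Inline after scripts get an ID of the form "{handle}-js-after".
	if ( ! is_string( $id ) || ! preg_match( '/^(.+)-js-after$/', $id, $matches ) ) {
		return $attributes;
	}
	$deps = $deps_by_handle[ $matches[1] ] ?? array();
	if ( in_array( 'web-worker-offloading', $deps, true ) ) {
		$attributes['type'] = 'text/partytown';
	}
	return $attributes;
}

// Wiring into WordPress, guarded so the pure function above can run standalone.
if ( function_exists( 'add_filter' ) ) {
	add_filter(
		'wp_inline_script_attributes',
		static function ( $attributes ) {
			$deps_by_handle = array();
			foreach ( wp_scripts()->registered as $handle => $dependency ) {
				$deps_by_handle[ $handle ] = $dependency->deps;
			}
			return wwo_sketch_filter_inline_attributes( (array) $attributes, $deps_by_handle );
		}
	);
}
```

Keying off the dependency rather than a hard-coded list of IDs would let any plugin opt in simply by declaring web-worker-offloading as a dependency, at the cost of assuming every such after script is safe to run in the worker.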

We faced this same struggle when implementing the script strategies, without ever implementing support for delayed inline after scripts. See Core-58632. We opted to err on the side of caution to not cause breakages. We considered adding a $standalone argument to add_inline_script() to say that it was safe for non-blocking execution, but we didn't end up implementing that.

westonruter commented 3 months ago

> There's also the GTM4WP - A Google Tag Manager for WordPress plugin, which has 700,000 installs.

@gutobenn Thanks for that. I see it is not included in my search because it doesn't include the https://www.googletagmanager.com/gtag/js string, but rather is using yet another way to inject the tag asynchronously with an inline script where the script src will ultimately be //www.googletagmanager.com/gtm.js (which is a 404, so apparently instead of gtm.js it is whatever is contained by $gtm4wp_options[ GTM4WP_OPTION_GTMCUSTOMPATH ]):

https://plugins.trac.wordpress.org/browser/duracelltomi-google-tag-manager/tags/1.20.2/public/frontend.php#L1122

gutobenn commented 3 months ago

> @gutobenn Thanks for that. I see it is not included in my search because it doesn't include the https://www.googletagmanager.com/gtag/js string, but rather is using yet another way to inject the tag asynchronously with an inline script where the script src will ultimately be //www.googletagmanager.com/gtm.js (which is a 404, so apparently instead of gtm.js it is whatever is contained by $gtm4wp_options[ GTM4WP_OPTION_GTMCUSTOMPATH ]):

Ahhh, nice find!

I did some tests and it seems that www.googletagmanager.com/gtm.js returns a 404 error only if you do not provide valid values for the required parameters id and gtm_auth (e.g., https://www.googletagmanager.com/gtm.js?id=GTM-XXXXXXX&gtm_auth=XXXXXXXXXXXX).

oxyc commented 2 months ago

Has this actually been tested?

We tried running it some year back but eventually gave up due to:

  1. GTM preview needs a proxy https://github.com/BuilderIO/partytown/issues/72
  2. Without atomics there was noticeable lag
  3. Debugging GTM and Partytown proved very difficult since GTM is an uncontrolled black box and Partytown is very complex

Looking at their issue tracker, I can see there are plenty of these kinds of bugs, which for many sites simply cannot be allowed to happen no matter what: https://github.com/BuilderIO/partytown/issues/583

We decided that reliable tracking was more important than the performance gain. Super excited if this does end up working out though, I just wanted to check whether a POC has been tested.

westonruter commented 2 months ago

@oxyc Thanks a lot for this valuable feedback. @thelovekesh has been doing some testing, but this hasn't been completed yet. Ultimately, of course, the third-party script providers should be doing this web worker offloading themselves so as to not require Partytown's shims in the first place. But I have been hopeful that Partytown would provide a way to jumpstart this. Your findings may prove Partytown to be currently infeasible.

westonruter commented 2 months ago

@oxyc oh, actually, the scope of this issue is not for GTM generally, but rather for "gtag" for Google Analytics which (confusingly) is loaded from www.googletagmanager.com. As I understand, Partytown is specifically having trouble with GTM not Google Analytics via gtag.js.

felixarntz commented 1 month ago

@westonruter Is there a particular way I can help with this? Any questions you have?

For reference:

Relevant classes used include:

westonruter commented 1 month ago

@felixarntz Thanks! I guess the main thing would be how best to set up a local development environment to get the gtag script to be output. With WooCommerce it was simply a matter of activating the plugin and adding a random Google Analytics ID. With Site Kit I'd want to avoid connecting it to an actual Google account to test. I'm sure there are some established methods for local development, so I'd love to be pointed in that direction.

felixarntz commented 1 month ago

> With Site Kit I'd want to avoid connecting it to an actual Google account to test. I'm sure there are some established methods for local development, so I'd love to be pointed in that direction.

As far as I remember, the local development workflows still rely on being connected to an actual Google account. There are several security requirements in place and working around them would be so tedious that back then it was not deemed worthwhile to spend hours of development on it - also because a lot of developing with Site Kit is about the Google service integrations, so developing without being connected to the services would have us more likely miss potential issues.

What I personally do when developing for it is set up a local site that has the same site URL as my live site where I used Site Kit with various services. That's a bit of a strange development setup, but the easiest to set up and most reliable to work with.

Alternatively, you can use the official helper plugin, which allows you to run this on any local site URL, but you'll still need to add the URL of an actual site using Site Kit with your Google account into that helper plugin's UI, which will make Site Kit behave more or less as if that was the site you're using. It will also require setting up a custom OAuth app, which means you won't be using Site Kit Service - which should be an okay limitation for what we're trying to do here, since it doesn't tie into the service at all. See https://sitekit.withgoogle.com/documentation/using-site-kit/staging/ for relevant instructions if you want to go with that approach.

westonruter commented 1 week ago

I've got two PRs open now for both Rank Math and Site Kit:

The Site Kit one is blocked by a necessary upstream change related to Consent Mode.