Hi Hugh,
It seems like you've done a lot of work to try to improve this and I can really appreciate how frustrating it must be after all that effort to be no further forward.
The good news is that this is explainable. The bad news is you may not like the explanation, and the fix is tougher 😔
First, let's correct a couple of assumptions here:
Cross-origin resources: “LCP media served from other domains do not give render time in the PerformanceObserver API—unless the Timing-Allow-Origin header (TAO) is provided.” I noticed this one only just now and have now added a TAO to my .htaccess file.
You cannot fix this within your .htaccess file as the header needs to be sent on the domain sending the image. If it's the same origin it doesn't need the header, if it's a different origin you likely won't have access to add this (the one exception is if you are in control of the other domain - say it's an asset domain like assets.example.com that you serve images from).
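To make the TAO point concrete, here's a minimal sketch of what the browser exposes for LCP. For a cross-origin image without the header, renderTime comes back as 0 and tools have to fall back to the earlier loadTime:

```js
// Observe LCP candidates as the browser reports them.
new PerformanceObserver((entryList) => {
  for (const entry of entryList.getEntries()) {
    // Without Timing-Allow-Origin on a cross-origin image,
    // renderTime is 0 and only loadTime (which is earlier) is exposed.
    console.log('LCP candidate:', entry.element?.tagName,
                'renderTime:', entry.renderTime,
                'loadTime:', entry.loadTime);
  }
}).observe({type: 'largest-contentful-paint', buffered: true});
```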
However, TAO isn't the main problem here, so let's move on to the trickier issue:
CrUX does measure metrics within iframes, Core Web Vitals javascript does not: All my iframes are wrapped with divs defined in width and height and overflow: hidden to avoid any CLS
The problem isn't that the iframe itself may move - it's that content within the iframe may shift, causing CLS.
Web APIs (as used by web-vitals.js and other tools) cannot "see" into iframes. This is a fundamental security restriction of the web platform.
This is noted in the documentation you referenced, and also in the Limitations section of the web-vitals repo:
The primary limitation of these APIs is they have no visibility into <iframe> elements.
CrUX however is measured at a lower level in the browser, where it has access to the full page.
CrUX is more accurate here, as the average user has no idea whether content is in an iframe or not and just sees it as page content. So CLS within an iframe is attributed to the top page. Similarly, LCP can be either on the main frame or within an iframe. As Core Web Vitals are intended to measure the user experience, it is correct to measure including iframes.
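To make the iframe blind spot concrete, here is (simplified, without the session-windowing that web-vitals.js applies) how JavaScript measures CLS - shifts inside a cross-origin iframe such as a YouTube embed simply never produce entries here:

```js
// Simplified CLS measurement: accumulate unexpected layout shifts.
// Shifts inside cross-origin iframes never appear in this observer;
// only the top frame is visible to the page's JavaScript.
let cls = 0;
new PerformanceObserver((entryList) => {
  for (const entry of entryList.getEntries()) {
    if (!entry.hadRecentInput) cls += entry.value;
  }
  console.log('CLS (top frame only):', cls);
}).observe({type: 'layout-shift', buffered: true});
```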
So let's look at your issues.
First up, the LCP issue
Running that page through WebPageTest, you see the following (https://www.webpagetest.org/result/221206_AiDcEC_8P4/) with an LCP of 2.259 seconds (the LCP frame is highlighted in red):
However, you can see the largest element (the video) is not drawn until a good bit later - at 7, 8, or even 9 seconds, depending on which element is the LCP (videos are not eligible for LCP at present, but the static image that YouTube displays before you click play is).
So you can see clearly here that the Web APIs (used by WebPageTest, as well as web-vitals.js) are limited in what they can do on pages with iframes, and here they report an incorrect LCP time - far earlier than when the user actually sees it.
So what can you do to "solve" this? Having the LCP element within an iframe is going to be tough. You need to load the main document and then load the iframe, meaning you are starting on the back foot. This is especially tough for video sites such as yours that serve their videos from another platform like YouTube.
To improve this, you could look at your TTFB, which looks very high (2.5 seconds on mobile, and 2 seconds on desktop). Given you want LCP within 2.5 seconds, and have the added issue of your LCP elements being videos, having such a slow TTFB is making a hard task impossible.
How are your pages generated and are there any improvements you can do on your server to make this quicker?
You don't appear to be using a CDN; using one would allow your content to be served closer to your users. Looking at Treo.sh, you appear to have a globally spread audience, so you would benefit from a CDN.
Your pages are also ineligible for the bfcache, another way to speed up back and forward navigations (making them instant), which could help your overall page metrics (back/forward navigations make up 20% of mobile navigations and 10% of desktop navigations on a typical site). In part this is because of an unload handler added by your AddThis widget. If there were a way to remove that, it would enable this "free" performance enhancement. The other issue is that the YouTube video similarly adds an unload handler to its own frame - the Chrome team is working on this, but in the meantime the next suggestion might help with that.
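If you want to verify whether bfcache kicks in after any changes, the pageshow event's persisted flag tells you; and the pagehide event is the usual replacement for unload handlers. A minimal sketch using these standard events:

```js
// Detect whether a back/forward navigation was served from bfcache.
window.addEventListener('pageshow', (event) => {
  console.log(event.persisted ? 'Restored from bfcache' : 'Full page load');
});

// Prefer pagehide over unload: unload handlers make the page
// ineligible for bfcache in Chrome, pagehide does not.
window.addEventListener('pagehide', () => {
  // ...flush analytics, save state, etc.
});
```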
The other option to improve LCP is to NOT have a third-party iframe as the LCP element. Obviously that's what people expect from your site, and I wouldn't suggest self-hosting the videos as videos are complicated (and expensive!) to manage, but you can use a locally hosted image as a facade, and only load the video when the user clicks on it. There are a number of components that allow you to do this (for example, lite-youtube-embed or lite-youtube). I appreciate this takes a bit more work than just embedding a YouTube video, but it will be more performant and may also allow you to use the bfcache more.
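If you'd rather hand-roll a facade than pull in one of those components, the core idea fits in a few lines. This is just a sketch - the container selector and VIDEO_ID are placeholders, though the i.ytimg.com thumbnail URL pattern is what YouTube itself serves:

```js
// Hypothetical facade: show a lightweight thumbnail and only build
// the heavy YouTube iframe when the user actually clicks to play.
function youtubeFacade(container, videoId) {
  const img = document.createElement('img');
  img.src = `https://i.ytimg.com/vi/${videoId}/hqdefault.jpg`;
  img.alt = 'Play video';
  img.style.cursor = 'pointer';
  img.addEventListener('click', () => {
    const iframe = document.createElement('iframe');
    iframe.src = `https://www.youtube.com/embed/${videoId}?autoplay=1`;
    iframe.allow = 'autoplay; encrypted-media';
    iframe.allowFullscreen = true;
    container.replaceChild(iframe, img);
  });
  container.append(img);
}

youtubeFacade(document.querySelector('#video'), 'VIDEO_ID');
```

The click handler means the third-party iframe (and its unload handler) only ever loads for users who actually play the video.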
On to your CLS issue
This one is similar to the above. The CLS is happening in the iframe, and so is invisible to the web APIs.
DevTools does have some limited visibility into CLS (the reasons are complicated, and I don't fully understand them myself!). If you open the Performance panel, play the video (fast-forward to about 10 seconds from the end) and then record a performance trace for the last 10 seconds, you see a CLS recorded when the video finishes and YouTube displays the next videos.
This looks to be an issue with YouTube and little that you can control (a colleague has raised it with them internally). It also does not appear to be a user-visible CLS - it doesn't appear to "shift" content to me - so it may be a false positive.
There is not much you can do about this one for now, but hopefully you at least understand it. We'll see if we can get the YouTube team to fix it.
Conclusion
Wow, so that was quite a detailed deep dive into your issues. As I said at the start, you may not like the explanation, and I appreciate that some of this is not within your control, but hopefully you at least understand it better now. The Web Vitals program is intended to measure ALL user experiences, and certainly for the LCP case, I believe CrUX is measuring the "right thing" and the other tools are (through no fault of their own) unable to do this. Iframes are complicated and unfortunately come with security restrictions.
It's also important to note that Core Web Vitals does not give any special treatment to YouTube over anyone else. So, like many other providers, they also have work to do to ensure that they don't hinder websites' performance. On the plus side, the YouTube team is working on performance, as recently shown in this post for their main site. Hopefully some of those learnings will also make their way to the embed that other sites use.
Anyway, let me know if that answers your questions (even if it's not the answer you wanted to hear), and if you have any other questions. For now I will close this issue as it's not a "CrUX" issue.
Hi Barry,
Thank you so much for your explanation for the LCP and CLS discrepancies. I really appreciate your help.
Putting up a “façade” for the YouTube embed to speed up LCP would worsen the user experience, so I won’t do that.
And I cannot do anything about the CLS within the YouTube frame.
So I will focus on improving TTFB.
I have already started to experiment with Cloudflare running on a parallel domain (flixxy.us), but Cloudflare, even with cached static html, does not show much improvement in the reports (see Treo report in attached pdf).
This brings me to another discrepancy I did not mention yet:
CrUX reports a TTFB of 2.5 seconds (p75) (see attached pdf)
Treo.sh shows an FCP in the range from 0.3 (US East) to 1.9 (Australia) measured by Lighthouse
GTMetrix shows a TTFB of 263ms from Vancouver Canada.
I have not yet figured out how to get a TTFB report from web-vitals-script+GA4+BigQuery, but GA-4 Realtime (see attached) shows that 70% of my users experience a TTFB rating of “good” (i.e. less than 800 ms) which is nowhere near the 2500 ms CrUX reports.
How do you explain the TTFB discrepancy between Lighthouse, RUM and CrUX?
Which tool should I use to measure TTFB that gives me instant feedback (rather than having to wait 28 days to see if a change is working or not)?
Again, I really appreciate your help!
Hubert
Hey Hubert,
Putting up a “façade” for the YouTube embed to speed up LCP would worsen the user experience, so I won’t do that.
I wouldn't necessarily dismiss this as quickly as that. On slower networks and devices the load wait can seem quite long, and a facade may be a better experience. As an example of how seamless a facade can be, check out an example here - yes we use facades for our videos on web.dev! You can also see the facade simulates the "play" button of a normal YouTube video to make it more seamless to the user.
So I will focus on improving TTFB. I have already started to experiment with Cloudflare running on a parallel domain (flixxy.us)
That I think is a good thing to do regardless, given the global nature of your audience.
but Cloudflare, even with cached static html, does not show much improvement in the reports (see Treo report in attached pdf). This brings me to another discrepancy I did not mention yet:
CrUX reports a TTFB of 2.5 seconds (p75) (see attached pdf)
Treo.sh shows an FCP in the range from 0.3 (US East) to 1.9 (Australia) measured by Lighthouse
GTMetrix shows a TTFB of 263ms from Vancouver Canada.
I have not yet figured out how to get a TTFB report from web-vitals-script+GA4+BigQuery, but GA-4 Realtime (see attached) shows that 70% of my users experience a TTFB rating of “good” (i.e. less than 800 ms), which is nowhere near the 2500 ms CrUX reports.
How do you explain the TTFB discrepancy between Lighthouse, RUM and CrUX?
Ah, you're gonna hate me, but there are some more nuances here to consider, due to what TTFB means. TTFB is measured from when the user starts to navigate until the first byte of the page starts to return - which means it includes any redirect time.
Lighthouse deliberately does not use the term TTFB because in Lighthouse we typically measure the server response time, rather than TTFB, after normalising the URL from any redirects. It is also simulated based on a predefined network, which may be slower (or faster!) than users really experience. Not to mention that running Lighthouse from a US or UK data centre may be faster than someone browsing on a poor network, on a train, in the middle of the countryside, or in far-away places with poorer network connectivity (think Australia, or India, where I noted you seem to have a lot of users).
Say, for example, a user starts from a Twitter link: it will go through Twitter's URL shortener (t.co) and then redirect to the actual URL. That redirect time is what the user experiences, so it should be included in the total TTFB. But you may be missing it when trying the URL directly in Lighthouse or GTMetrix. Similarly if you use a URL shortener yourself, or use ads that redirect before ending up on your website.
Web APIs are also restricted in being able to see this "total TTFB" time, for privacy and security reasons. So web-vitals.js only gets a partial view here too.
CrUX however, again because it works at a lower level rather than using browser web APIs, sees the full amount.
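You can see that partial view for yourself with the Navigation Timing API (a quick sketch):

```js
// The page's own view of TTFB, via Navigation Timing.
const [nav] = performance.getEntriesByType('navigation');
console.log('TTFB visible to the page:', nav.responseStart, 'ms');

// Same-origin redirect time is exposed here; cross-origin redirect
// phases are zeroed for privacy, so this can under-report what the
// user (and CrUX) actually experienced.
console.log('Visible redirect time:',
            nav.redirectEnd - nav.redirectStart, 'ms');
```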
Which tool should I use to measure TTFB that gives me instant feedback (rather than having to wait 28 days to see if a change is working or not)? Again, I really appreciate your help!
Well, here's where you get into the limitations of CrUX. CrUX has the most accurate data, for the reasons described above, but as it's a public resource, it's limited in what it can show without revealing more detailed browsing habits of your site and your users. That is part of the reason for the 28-day lag, and it also helps smooth out some of the variability in field data, to give a true sense of the performance of your website rather than swinging up and down with brief spurts of good or bad traffic, as is the nature of any site.
So the answer is both. CrUX will give the best results, but with limitations. A RUM solution (including a home-grown one like web-vitals.js) will give the next best thing, and will also be better than CrUX in some ways: it gives more detailed information, allows you to drill down into the data, and lets you view whatever time span you want, rather than just the 28 days that CrUX uses (though be aware of variability here!).
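On that note, since you mentioned not yet having a TTFB report from web-vitals + GA4: the usual wiring looks something like the sketch below. It assumes web-vitals v3 (the onTTFB naming) and gtag.js already on the page; the parameter names follow the common web.dev guidance, so adjust them to match your GA4 setup:

```js
import {onCLS, onLCP, onTTFB} from 'web-vitals';

function sendToGoogleAnalytics({name, value, delta, id}) {
  // One GA4 event per metric report; query these back out in BigQuery.
  gtag('event', name, {
    value: delta,         // GA4 sums event values, so send the delta
    metric_id: id,        // groups multiple reports from one page load
    metric_value: value,  // the metric's current cumulative value
    metric_delta: delta,
  });
}

onTTFB(sendToGoogleAnalytics);
onLCP(sendToGoogleAnalytics);
onCLS(sendToGoogleAnalytics);
```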
Lab-based tools like Lighthouse and GTMetrix are more fixed, which can be really helpful for getting instant feedback under a set of controlled conditions, but you need to calibrate any data you get from them against your field data to ensure it's a realistic measure of your typical traffic.
So, while I understand your measurements are not yet showing the results you hoped to see, keep in mind the limitations of what you are measuring here: you may well see quite different results from real users. Hopefully positive (a CDN can only help IMHO), but the improvements may be limited if the problem is in large part due to redirects.
I don't know what your analytics are showing as to where the majority of your traffic is coming from, and what pages they are landing on, but it might be worth investigating that, to see if you can figure out how much redirects are influencing your TTFB.
The other thing you can do is look at Server-Timing headers to try to explain what's happening on your server for these requests (are the majority served from a cache, or created via an expensive database lookup?). Again, that depends on the architecture of your back end. You can also consider upping your 3-hour cache limit to increase cache hits on any CDN you add, and in your users' browser caches.
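Server-Timing values surface in the same navigation entry that RUM scripts already read, so they're cheap to collect (the metric names in this sketch are made up for illustration):

```js
// If the server sends, e.g.:
//   Server-Timing: cache;desc="HIT";dur=2, app;dur=180
// ...the page can read those timings back out:
const [nav] = performance.getEntriesByType('navigation');
for (const {name, duration, description} of nav.serverTiming) {
  console.log(`${name}: ${duration}ms ${description}`);
}
```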
One other thing worth considering, in a similar vein to bfcache, is the (very!) newly released prerender option in Chrome, which can provide "instant" page loads for your users by preparing pages in advance. As I say, that is literally hot off the press; I wrote about both bfcache and prerender in this post to explain why they can be so powerful - particularly for sites like yours struggling to meet their Core Web Vitals.
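For reference, that prerender capability is triggered through the new Speculation Rules API. A minimal sketch (the URL is a placeholder, and support is brand-new and Chrome-only):

```js
// Hint Chrome to prerender the page the user is most likely to visit
// next. Browsers without support simply ignore the script.
const rules = document.createElement('script');
rules.type = 'speculationrules';
rules.textContent = JSON.stringify({
  prerender: [{source: 'list', urls: ['/next-page.htm']}],
});
document.head.append(rules);
```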
Hope that helps explain the discrepancies you're seeing and gives you some more food for thought on possible ways to resolve this. It may seem like a lot, for what seems like a fast site when you try it yourself, but the numbers are highlighting what your users are actually experiencing in real life - and that is the entire intent of the Core Web Vitals initiative!
Hi Barry,
Thank you so much for your quick and informative reply. You have definitely given me a lot of food for thought.
My site is written in plain HTML. I use no database, so it should be very fast. And my users, as well as all the test tools I have available (including web-vitals with GA-4 and BigQuery), all say that my Core Web Vitals are good.
I am disappointed that the time I invested in implementing Web-Vitals with GA-4 and BigQuery was in vain, since it won’t even give me correct TTFB data. It is very hard to optimize for something that I cannot easily measure. I think that Google should base its Core Web Vitals on measurements that can easily and timely be replicated by the user, such as the web-vitals script.
But I thank you for your explanations and your desire to help. It is very much appreciated!
Hugh
www.flixxy.com
[Edit:] PS: Looking it over, I think I will go with your suggestion of a "lite YouTube embed". I'll have to wait 28 days to see if it worked though ...
I am disappointed that the time I invested in implementing Web-Vitals with GA-4 and BigQuery was in vain, since it won’t even give me correct TTFB data. It is very hard to optimize for something that I cannot easily measure. I think that Google should base its Core Web Vitals on measurements that can easily and timely be replicated by the user, such as the web-vitals script.
I hear you. I do think you are an unfortunate, extreme case, where your entire site is primarily third-party content in YouTube embeds and, yes, they are very hard to measure in the browser.
I shared this case with some other colleagues who work on CWV, and one of them commented that "Their issue was impressive in its comprehensiveness and still two major issues fell through the cracks", showing that you did an impressive amount of work - and in many ways all the right things - but were still unable to use it to identify the real causes 😔 For the vast majority of sites, the things you did would give valuable insights, even if they cannot exactly measure the full picture in some cases.
On the other hand, I would say the CrUX dataset, although limited in some ways, has highlighted a potential real issue to you here, so at least that tooling helped surface it, and got you thinking about this more. So that was a big part of the intent of the Core Web Vitals initiative.
And on that note, I dug a little more into the CrUX data by running a SQL query on its BigQuery dataset and came up with the following table (data and query here):
These are all the countries where you have enough data to pass the anonymity threshold and it shows a few interesting things:
So, given all that, I'd say you should continue your investigations into Cloudflare, as it's pretty likely to drastically improve your LCP. I don't know how much traffic you get from each of those countries (other than the fact it's enough to meet the threshold to appear in CrUX), but if non-US traffic is a significant proportion then this could have a major impact. Similarly, US traffic from parts of the US further away from your servers could be boosted, as that is dragging your TTFB down to 800ms, which I would expect to be a lot lower for a static site located near to the user (as a CDN will effectively make it).
Mobile data looks similar, but more extreme (which is fairly typical):
Interestingly, India is primarily mobile (we see this regularly) - so much so that desktop usage isn't sufficient to register - and Israel and New Zealand are the opposite, primarily desktop.
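For anyone wanting to reproduce this breakdown, the query shape is roughly as below, shown via the Node.js BigQuery client. The table and column names are my recollection of the public chrome-ux-report materialized tables, so verify them against the dataset before relying on this:

```js
// Sketch: pull p75 metrics per country for one origin from the
// public CrUX BigQuery dataset (table/column names are assumptions).
const {BigQuery} = require('@google-cloud/bigquery');

async function cruxByCountry(origin, yyyymm) {
  const query = `
    SELECT country_code, device, p75_ttfb, p75_lcp, p75_cls
    FROM \`chrome-ux-report.materialized.country_summary\`
    WHERE origin = @origin AND yyyymm = @yyyymm
    ORDER BY country_code, device`;
  const [rows] = await new BigQuery().query({
    query,
    params: {origin, yyyymm},
  });
  rows.forEach((row) => console.log(row));
}

cruxByCountry('https://www.flixxy.com', 202211);
```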
[Edit:] PS: Looking it over, I think I will go with your suggestion of a "lite YouTube embed". I'll have to wait 28 days to see if it worked though ...
Cool. Would be interested to hear back how that goes for you!
One tip for you: in the expanded view of PageSpeed Insights you can see the breakdown of page views in each category:
Although you won't see the full effect until after 28 days, you should hopefully see the percentage of "good" page views increase each day, and once it crosses 75% the category will go green. I talk about that more in this article.
Dear Barry,
Thank you for the explanations. They are very helpful for me to understand this situation better. The SQL query by country was very useful to me. Here is my GA country breakdown over the last 30 days:
United States: 55%
Canada: 14%
Australia: 6%
South Africa: 6%
United Kingdom: 4%
Germany: 3%
France: 1%
Czechia: 1%
Italy: 1%
New Zealand: 1%
Other: 8%
I am a one-man website owner and only found out about Core Web Vitals a few months ago. I was shocked when I saw that Google considers none of my pages to provide a good user experience, so I went to work improving all aspects of my site. While I was able to make all my pages mobile friendly, I had very little success improving LCP and CLS.
After 2 months of studies, experiments and improvements, my site is all “green” in lab tests and RUM tests (including my own web-vitals data sent to GA4 and analyzed with BigQuery), but still “RED” as far as Google is concerned. I am at my wits' end now.
Please look at my detailed screenshots below:
Edit: core-web-vitals-discrepancies-images-new.pdf
My most visited (and also worst performing) page is: https://www.flixxy.com/trumpet-solo-melissa-venema.htm
Notes: I understand the differences between lab tests, field tests and RUM. I also looked in detail into the differences between CrUX and RUM (per https://web.dev/crux-and-rum-differences):
1) CrUX is Chrome only: Even if I filter GA4 for Chrome users only, the differences persist.
2) Opted-in users: This should only account for a small difference.
3) Website must be publicly discoverable: My site is 100% public.
4) CrUX segments data by mobile, desktop, and tablet: The differences persist if I segment data by mobile, desktop, and tablet.
5) Sampling size: I am using a sampling size of over 30,000 sessions.
6) Timespan: I am analyzing over 28 days of GA data.
7) CrUX metrics are measured at the 75th percentile: Core Web Vitals with GA-4 and BigQuery measures data at p75.
8) Metrics timing:
a) LCP: I am using the Google recommended Core Web Vitals javascript.
b) "CLS is measured through the life of the page": Visual inspection as well as Chrome DevTools show no significant CLS (>0.01). Neither does GA4. How come CrUX sees 0.35 CLS?
9) CrUX does measure metrics within iframes, Core Web Vitals javascript does not: All my iframes are wrapped with divs defined in width and height and overflow: hidden to avoid any CLS.
10) Cross-origin resources: “LCP media served from other domains do not give render time in the PerformanceObserver API—unless the Timing-Allow-Origin header (TAO) is provided.” I noticed this one only just now and have now added a TAO to my .htaccess file.
11) Background tabs and prerender: I do not use background tabs or prerender.
Final questions:
Hugh
www.flixxy.com