Closed: foxdavidj closed this 4 years ago.
If we really needed to scrape the JavaScript responses, maybe this would help: https://timkadlec.com/remembers/2020-04-16-webpagetest-custom-metrics-with-request-data/
Edit: Actually, @Tiggerito isn't this almost exactly the use case you wanted for SEO? (comparing initial SSR HTML state with dynamic CSR state)
The second custom metric, num-total-scripts, should look familiar—that grabs how many script elements appear in the final version of the document, after the page has been loaded and all JavaScript has run.
The first custom metric, num-initial-scripts, counts the number of script elements in the initial HTML. First it grabs the body of the first response using the $WPT_BODIES placeholder. Since that returns a string, we then convert it to HTML so we can parse it more easily. Finally, once we have HTML, we query it as we did the original document.
[num-initial-scripts]
let html = $WPT_BODIES[0].response_body;
let wrapper = document.createElement('div');
wrapper.innerHTML = html;
return wrapper.querySelectorAll('script').length;
[num-total-scripts]
return document.querySelectorAll('script').length;
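As a rough cross-check that doesn't require building a DOM wrapper, the initial-HTML count could also be approximated with a regex over the raw response string. This is only a sketch with an assumed helper name (`countScriptTags`), and unlike the DOM approach above it will miscount script tags that appear inside HTML comments:

```javascript
// Naive count of <script> open tags in a raw HTML string.
// countScriptTags is a hypothetical helper, not part of the WPT metrics above.
function countScriptTags(html) {
  var matches = html.match(/<script\b/gi); // case-insensitive open tags
  return matches ? matches.length : 0;
}
```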
@obto this looks amazing! Really nice work!
I'll do a thorough code review soon, but for now could you test this on a bunch of websites and share the WPT results? See https://github.com/HTTPArchive/legacy.httparchive.org/pull/170#issuecomment-661658511 for example.
FYI @HTTPArchive/analysts: this mega PR may handle many of the custom metrics needed for your chapters.
$WPT_BODIES could be a great solution. I've just done something similar to get http headers. So much easier to analyse everything in one place. 😎
@obto @rviscomi I've already manually reviewed and updated our scripts for 2020 in a draft pull request. And with the $WPT_BODIES idea I think I can get rid of the last evil one 😁
Sorry @obto. Just realised this is for the almanac.js file. I also have a draft pull request for that, which has quite a bit of overlap.
How should I go about pulling things together?
Just a quick word of caution about WPT_REQUESTS and WPT_BODIES. I'm not sure how they will do at HA-scale on some of the more extreme pages (5000+ requests, 50+MB of uncompressed scripts, etc). If possible I'd recommend finding a few of the extreme sites and testing the custom metrics just to make sure it doesn't accidentally break the test.
If multiple scripts are together in one custom metric, it would be worthwhile to have a single global instance of WPT_BODIES (and no WPT_REQUESTS since bodies includes the requests) to make sure we're not multiplying the impact out.
The bodies/requests are serialized to JSON and inserted into the custom metric. The custom metric is executed by sending the script over the websocket dev tools interface where it needs to be evaluated and executed. I'm not sure how well it will handle hundreds of MB of script.
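To get a feel for that risk before running at HA scale, the serialized size of a candidate payload can be estimated up front. A sketch using only standard JSON and TextEncoder APIs (`jsonSizeBytes` is a hypothetical helper, not part of WPT):

```javascript
// Estimate how many bytes a value will occupy once serialized to JSON,
// to spot payloads that might overwhelm the devtools websocket.
function jsonSizeBytes(value) {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}
```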
@Tiggerito I've tried to modularize a lot of the data we're collecting and make it multi-purpose instead of specific per chapter (e.g., see the 'images' function in the PR).
After taking a quick scan of yours, it looks like there's a good deal of overlap because of this. Think it might be easiest if I pull in your changes to this PR and have you review it. What do you think?
Or alternatively, we could review this PR and merge it. Then I can open up another PR to add the missing data that you've added in your PR.
What do you think @rviscomi? Merge this PR then bring in the additional SEO changes? Or get it all done at once?
Sorry about all this @Tiggerito. Hope I'm not stepping on your toes too much here :sweat_drops:
I'll defer to whatever workflow is easiest for you and @Tiggerito to resolve the impending merge conflicts
That's a good point about consolidating to a single global $WPT_BODIES. So:

- Keep the metrics that use $WPT_BODIES separate from almanac.js, so that they have no chance of interfering with the rest of the data collection (barring catastrophic test failures)
- If a page has so much data that it makes $WPT_BODIES totally infeasible, hopefully we can still collect metrics in almanac.js. Even if not, those few cases would be outliers.
I presume the style used by @obto is more correct than mine and his is closer to being merged. I'd say merge that in now and I can rework mine based on the new version. I presume that means I create a brand new fork and edit that (I'm new to this)?
While I wait for that merge I can continue adding features and testing. It looks like I have permission to use $WPT_BODIES which is great.
With the changes @obto made, I presume we are free to refactor the script in areas used by other chapters. Some SEO scripts were used by others last year so I deliberately did not change their structure.
@Tiggerito Sounds good. So i can help you out as much as possible, can you make a bullet list of all the stats you were writing custom metrics for?
I can show you which metrics already exist in mine and help you write the rest then :)
My pull request comment should be up to date with the metrics and what I need to still do.
I'll continue working on my version and evolve it towards your structure so it should be easier to combine them.
In the end is it easiest for me to create a brand new fork from your merged changes and re-apply my code to that? I'm guessing there is no way to update my fork to the latest main?
I'm also considering doing a smaller pull request to make the script more robust, adding a way to log errors so that in the future we can query the almanac data to see why things broke. At the moment an error causes almanac = null!
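One way that robustness change could look (a sketch only; `runMetrics` and the metric-map shape are assumptions, not the actual almanac.js structure): run each metric in isolation so a single failure is recorded per metric instead of nulling out the whole almanac object.

```javascript
// Run each metric function independently and record errors per metric,
// rather than letting one exception wipe out the entire result.
function runMetrics(metrics) {
  var almanac = {};
  for (var name of Object.keys(metrics)) {
    try {
      almanac[name] = metrics[name]();
    } catch (e) {
      almanac[name] = { error: String(e && e.message ? e.message : e) };
    }
  }
  return almanac;
}
```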
@Tiggerito GitHub tip: use "Start a review" to batch your comments (and their subsequent email notifications) into one 😅
Should I have done the start a review on the first comment then? 🤭
Yeah, but no worries. This is great feedback, I'll get to making the changes.
Can we manually fix things on a merge or should we continue with the idea of me re-applying my changes after this is merged?
What I'd recommend is: once my PR is merged into master, you can rebase your branch on top of the new master, which will essentially replay all of your changes on top of mine :+1:
One additional change I may make is editing '10.5' to extract all the json-ld schema data and rename the metric accordingly.
Cool. I get to try another new thing rebasing.
As for editing '10.5' to extract all the json-ld schema data: I completely rewrote that one 🤯
$WPT_BODIES worked really well. I now have data from the headers, raw html and the rendered html. I think I've implemented everything I can. Now testing.
I feel we should be pretty good with these metrics!
Thank you @obto! 🙏
All of @Tiggerito's feedback has been resolved. Thanks!!
I've tested the script on 13 large sites, all completed successfully, and I documented the results here: https://docs.google.com/spreadsheets/d/1FliIat7uXauPVCpWcl1GU-InAOWXW7e7HHJuv4c7URQ/edit?usp=sharing
The last order of business we have is deciding if we're alright with increasing the size of the JSON file by exporting many more nodes (img, and a11y nodes particularly).
After analyzing the results, most generate around 35KB of data, and the resulting JSON file is still 10-30% the size of the firstHtml file. But as you can see from the results... some large sites like Amazon, which have tons of images and a11y nodes, make quite a large file (171KB).
These larger files will make querying more expensive, but the breadth and depth of what we can query without needing more custom metrics is fantastic.
I think the trade-off is worth it, but what do you think @rviscomi?
I was thinking about this the other day. The almanac property seems to be quite a small part of the payload. Is it possible to trim the payload down to just what is used?
Yeah the size increase is concerning. Beyond these few popular sites, the edge cases are where we're going to get hit the hardest. Looking at the 2019 Markup chapter results, the biggest page has over a million elements on it. I worry about how pages like that would affect the test durability and storage/query costs.
We could include array/data size limits to try to protect us.
e.g. I only output the first heading and crop it at 100 characters. There have been cases where the heading wraps around the whole page's content!
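The crop idea above could be a pair of tiny guards applied before serializing. A sketch (the 100-character crop matches the comment above; the 50-item array cap is an arbitrary assumption, not from the discussion):

```javascript
// Crop a heading (or any string) to a maximum length before output.
function cropText(text, max) {
  max = max || 100; // 100 chars, as described above
  return text.length > max ? text.slice(0, max) : text;
}

// Cap how many items of an array are kept in the payload.
function capArray(items, max) {
  max = max || 50; // assumed limit, not from the discussion
  return items.length > max ? items.slice(0, max) : items;
}
```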
@obto - Looks like, after all, I will need a custom metric for App Links. I think the following should do. Are you able to add this in your PR? @rviscomi - I am not sure if you ended up pursuing the idea of using 'fetch' in custom metrics for other things, as per our Slack discussion, but this seems to be working as far as I can see.
[AndroidAppLinks]
return fetch('/.well-known/assetlinks.json').then(function(r) {
  if (!r.redirected && r.status === 200) {
    return "true";
  } else {
    return "false";
  }
});
[iOSUniversalLinks]
return fetch('/.well-known/apple-app-site-association').then(function(r) {
  if (!r.redirected && r.status === 200) {
    return "true";
  } else {
    return "false";
  }
});
I have tested this on a few sites. A status code of 200 (without any redirects) means the site is using app deep links; any other status suggests the site is not using app deep links.
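The redirect-and-status check shared by both metrics can be read as a single predicate. A sketch (`appLinkStatus` is a hypothetical name; `r` stands in for the fetch Response fields used above):

```javascript
// A response counts as "deep links present" only when it is a direct
// 200 with no redirect, mirroring the checks in the two metrics above.
function appLinkStatus(r) {
  return (!r.redirected && r.status === 200) ? "true" : "false";
}
```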
Site | Scenario | WebPageTest Link |
---|---|---|
https://direct.asda.com/ | Has AndroidAppLinks but no iOSUniversalLinks | https://www.webpagetest.org/custom_metrics.php?test=200726_JB_7502ce8407c36d69d48a9013b81a57d4&run=1&cached=0 |
https://www.myntra.com | Has both AndroidAppLinks and iOSUniversalLinks | https://www.webpagetest.org/custom_metrics.php?test=200726_KV_049c75301d9ecc4a06c374870436e706&run=1&cached=0 |
https://www.johnlewis.com | Both AndroidAppLinks and iOSUniversalLinks not present | https://www.webpagetest.org/custom_metrics.php?test=200726_WJ_0f97e04403ce27ee312f1993ce33b087&run=1&cached=0 |
https://www.burberry.com/ | Both AndroidAppLinks and iOSUniversalLinks not present and this site does re-direct | https://www.webpagetest.org/custom_metrics.php?test=200726_NN_53f746a80422d95c2fde58f906cde5fa&run=1&cached=0 |
Oooo. The SEO chapter would love to be able to test the robots.txt file. Would I be allowed to do that?
I've just submitted a new custom metric to my pull request: robots_txt.js. It uses the fetch idea to gather some basic data from the robots.txt file for a domain. It would be nice if that could be included.
I checked a few of the "Metrics unable to create a function for".
As my pull request will add access to the raw html we can do the text searches and regexes that these queries need. It looks like just a bunch of patterns we would need to flag as true or false.
I could add them to my pull request?
Main change here is the addition of filtering options for parseNodes and parseNode. The code documents the possibilities and should be self-explanatory.
Besides that, there are 2 other changes:

- track elements are now being tracked in the video metric
- nodes_with_aria is removed. It has too high a chance of bloating, and doesn't provide much value since a11y's queries are mostly positionally focused (what element was this one wrapped inside of, etc.)

After testing on Amazon again, the JSON size is 30.3KB instead of 171KB.
Ready for review @rviscomi
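To illustrate what node filtering can buy, here is a hypothetical sketch only (the real parseNodes/parseNode API is defined in the PR and may differ, and `attributes` here is a plain object rather than a DOM NamedNodeMap): keeping just a whitelist of attributes per node is what shrinks the serialized JSON.

```javascript
// Serialize only the whitelisted attributes of a node-like object.
// Hypothetical illustration of attribute filtering, NOT the PR's actual API.
function serializeNode(node, keepAttributes) {
  var out = {};
  for (var name of keepAttributes) {
    if (name in node.attributes) {
      out[name] = node.attributes[name];
    }
  }
  return out;
}
```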
@rviscomi Which comments? I should have resolved, or left comments on all of them.
I haven't retested the script on all of the sites again. I can do that when I get up tomorrow
Scrolling through https://github.com/HTTPArchive/legacy.httparchive.org/pull/172/files, there are about half a dozen unresolved comments. Please close out anything that doesn't need to be addressed.
Huh, I never saw those while looking through before. I should be able to quickly resolve all of these.
Updated results: https://docs.google.com/spreadsheets/d/1FliIat7uXauPVCpWcl1GU-InAOWXW7e7HHJuv4c7URQ/edit?usp=sharing
The median JSON size has been reduced by half, and the average size is 3x smaller.
@rviscomi Removed the overlapping function. Think we're good to go?
LGTM! 🚀
I've looked through all of the metrics from 2019 and created new functions to minimize the use of regex this year.
Here are the summaries:
- Metric to Function mapping
- Metrics unable to create a function for
- Metrics which were moved: has_picture_img is now images.total_pictures_with_img. Instead of reporting a boolean, it reports how many occurrences there were.