Closed marcoscaceres closed 1 month ago
Thanks for tagging.
@tunetheweb might have the latest BigQuery queries for the 2021 insights.
We're in the middle of writing those right now for the 2021 edition. Follow along here: https://github.com/HTTPArchive/almanac.httparchive.org/issues/2153
However, we did add a custom metric to the monthly HTTP Archive crawl to make it easier to query this data every month, so we can see what the makeup of manifests is for the top 7-8 million websites that HTTP Archive tracks monthly. Let me know if there's something specific you're looking for and I can maybe help you query it.
We (Edge) just overhauled our own manifest crawler and I have a dashboard showing usage, errors, store readiness, and more. Happy to share the details I find. It would also be great to correlate the data with what y’all are seeing too. Currently, we’re tracking a little over 135k manifests. Some of the highlights:
name
Per #399, I have been continuing to track query string usage and have not found many UUIDs embedded in the start_url … yet. The only ones I’ve identified are:
https://www.geekmagazine.com.br/?tracking=5eb941b082681
https://www.aadvantageeshopping.com/?source=mn|AA|ALL|mn|NA|bookmark|na||mobile|00000000
https://shopping.mileageplus.com/?source=mn|UA|ALL|mn|NA|bookmark|na||mobile|00000000
/?tracelog=51606102_9527_7259_7770&from=desktop
/?ucid=HPS-1027
/?source=NV_PWA_HL_HP&agentcode=00399206&utm_medium=WebPush&utm_source=NotifyVisitor
/?mc=xQZpn7mAMXLZ
I need to investigate these to see if they are using a truly unique identifier, though. Most manifests have "utm_source=webapp" or similar. The closest I’ve seen to a tracking value so far is the occasional "utm_campaign=X", the language, or similar. One accidentally (?!) posted an API key in the start_url.
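For anyone who wants to run a similar check on their own manifest data, here is a rough Python sketch of one way to flag identifier-like query string values in a start_url. It is not the crawler's actual logic. The function name `suspicious_params`, the two regular expressions, and the 12-character hex threshold are all assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Assumed heuristics (not taken from any real crawler):
# a canonical or hyphen-less UUID, or a long run of hex digits.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-?[0-9a-f]{4}-?[0-9a-f]{4}-?[0-9a-f]{4}-?[0-9a-f]{12}$",
    re.IGNORECASE,
)
HEX_RE = re.compile(r"^[0-9a-f]{12,}$", re.IGNORECASE)


def suspicious_params(start_url):
    """Return (key, value, reason) tuples for values that look like unique IDs."""
    query = urlsplit(start_url).query
    flagged = []
    for key, value in parse_qsl(query, keep_blank_values=True):
        if UUID_RE.match(value):
            flagged.append((key, value, "uuid"))
        elif HEX_RE.match(value):
            flagged.append((key, value, "hex"))
    return flagged


if __name__ == "__main__":
    for url in (
        "https://www.geekmagazine.com.br/?tracking=5eb941b082681",
        "/?utm_source=webapp",
        "/?mc=xQZpn7mAMXLZ",
    ):
        print(url, suspicious_params(url))
```

A pattern match like this only narrows the list. Opaque tokens such as the "mc" value above pass through unflagged, and a flagged value may still be shared by every user, so the hits would still need manual review.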
I’ve also got data on newer features (share_target, protocol_handlers, etc.). I’m planning to write up a "State of the Manifest" sometime in the next quarter.
Amazing stuff @aarongustafson! Thanks for gathering those stats.
Thanks also @tunetheweb! Will follow along.
FYI, we've run the stats for this year's Web Almanac PWA chapter and the data is available here: https://docs.google.com/spreadsheets/d/16AkIdDBBkCR5Kgb7kyfYvnNLQBu23Vsh7MUSFHW9RtA/#gid=398503119
Let me know if you spot anything that looks odd and I can investigate further.
The 2022 data can be found here: https://docs.google.com/spreadsheets/d/1PbzjhN--jU9MGuWobw5L9EsmlVzI9tlbCe3_NKA7giU/edit#gid=2077755325
Closing due to inactivity
It might be good to look over: https://almanac.httparchive.org/en/2020/pwa#web-app-manifests
And see if there is anything to be gleaned from it that might influence things in the spec.
I'm cc'ing @hemanth as they may provide additional insights or just for their interest.