HTTPArchive / almanac.httparchive.org

HTTP Archive's annual "State of the Web" report made by the web community
https://almanac.httparchive.org
Apache License 2.0

PWA 2020 #909

Closed foxdavidj closed 3 years ago

foxdavidj commented 4 years ago

Part II Chapter 14: PWA

Content team

Authors   Reviewers                                                              Analysts  Draft  Queries  Results
@hemanth  @thepassle @jadjoubran @pearlbea @gokulkrishh @jaisanth @logicalphase  @bazzadp  Doc    *.sql    Sheet

Content team lead: @hemanth

Welcome chapter contributors! You'll be using this issue throughout the chapter lifecycle to coordinate on the content planning, analysis, and writing stages.

The content team is made up of the following contributors:

New contributors: If you're interested in joining the content team for this chapter, just leave a comment below and the content team lead will loop you in.

Note: To ensure that you get notifications when tagged, you must be "watching" this repository.

Milestones

0. Form the content team

1. Plan content

2. Gather data

3. Validate results

4. Draft content

5. Publication

tunetheweb commented 4 years ago

Yes, that table (we just called it manifests), which contains the manifest JSON, and a service_worker table (which contains the service worker JavaScript code) are what I'm waiting to be created from the 2020 dataset (the August crawl data) so that I can query them and give you these stats. I know @rviscomi is already working on it, so hopefully in the next few days I'll be able to give you all the stats.

ibnesayeed commented 3 years ago

Now that @bazzadp is added as an Analyst, the second task's checkbox should be checked.

tunetheweb commented 3 years ago

Now that @bazzadp is added as an Analyst, the second task's checkbox should be checked.

Done.

tunetheweb commented 3 years ago

@tomayac can you explain the low usage of Periodic Background Sync Register and Periodic Background Sync? Is this expected?

I've also found the regular BackgroundSync and BackgroundSyncRegister, which have a good bit more usage, but still not massive. I'm not sure what the difference is.

Row  yyyymmdd  client   id    feature                         num_urls  total_urls  pct_urls  sample_url
2    20200801  desktop  745   BackgroundSync                  243       5593642     4.34E-05  https://goalkicksoccer.com/
4    20200801  desktop  1025  BackgroundSyncRegister          232       5593642     4.15E-05  https://www.trivago.com.uy/
1    20200801  desktop  2930  PeriodicBackgroundSync          1         5593642     1.79E-07  https://uhcitp.in/
3    20200801  desktop  2931  PeriodicBackgroundSyncRegister  1         5593642     1.79E-07  https://uhcitp.in/
6    20200801  mobile   745   BackgroundSync                  270       6347919     4.25E-05  https://www.iamgujarat.com/
5    20200801  mobile   1025  BackgroundSyncRegister          262       6347919     4.13E-05  http://miui.in/
7    20200801  mobile   2930  PeriodicBackgroundSync          1         6347919     1.58E-07  https://uhcitp.in/
8    20200801  mobile   2931  PeriodicBackgroundSyncRegister  1         6347919     1.58E-07  https://uhcitp.in/
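
For anyone checking the numbers: pct_urls in the table above is simply num_urls divided by total_urls, expressed as a fraction rather than a percentage. The following is a minimal sketch that recomputes the column from the rows as posted. It is not the query used for the chapter, and the names `ROWS`, `pct_urls` and `format_share` are made up for illustration.

```python
# Rows copied from the results table above: (client, feature, num_urls, total_urls).
ROWS = [
    ("desktop", "BackgroundSync", 243, 5593642),
    ("desktop", "BackgroundSyncRegister", 232, 5593642),
    ("desktop", "PeriodicBackgroundSync", 1, 5593642),
    ("desktop", "PeriodicBackgroundSyncRegister", 1, 5593642),
    ("mobile", "BackgroundSync", 270, 6347919),
    ("mobile", "BackgroundSyncRegister", 262, 6347919),
    ("mobile", "PeriodicBackgroundSync", 1, 6347919),
    ("mobile", "PeriodicBackgroundSyncRegister", 1, 6347919),
]


def pct_urls(num_urls: int, total_urls: int) -> float:
    """Share of crawled URLs where the feature was seen (a fraction, not a percentage)."""
    return num_urls / total_urls


def format_share(share: float) -> str:
    """Format the share the same way the sheet displays it, e.g. 4.34E-05."""
    return f"{share:.2E}"


for client, feature, num_urls, total_urls in ROWS:
    share = pct_urls(num_urls, total_urls)
    print(f"{client:8} {feature:31} {format_share(share)}")
```

Read as percentages, the regular BackgroundSync rows come to roughly 0.004% of URLs on each client, while the periodic variants appear on a single URL out of several million.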

tomayac commented 3 years ago

Maybe @jeffposnick, as the author of the web.dev article, has more insight into how expected or not these results are, but it's definitely in line with the low usage we see on ChromeStatus.

The actual periodicsync events are expected to be low, since a lot of sites probably don't meet the required site engagement thresholds we have put in place, but registrations are independent from this.

jeffposnick commented 3 years ago

Yes, I think it just reflects legitimately low usage. Periodic background sync functionality is only available in PWAs that have been installed, for one thing, and that's a fairly high barrier.

tunetheweb commented 3 years ago

Thanks @tomayac / @jeffposnick

@hemanth @logicalphase (and also @thepassle @jadjoubran @pearlbea @gokulkrishh @jaisanth if interested) I've completed the queries and dumped a first cut of the results in this year's PWA sheet for you to have a look at.

As discussed previously, these are mostly based on (stolen from!) @tomayac & @jeffposnick's hard work last year (so I've kept the tab order in the Sheets roughly in line with last year's results sheet so you can compare), but I have added a few that I thought might be interesting, including:

One important point is that my SQL has NOT been reviewed yet by the other analysts. So consider this an early look before you get the official signed-off stats later, in case I've made lots of errors in it. But the results look roughly in line with last year's, so I think they are good.

Still, I think it would be good for you all to dive in, see what you think of the stats, ask questions, and also let me know if there are any more stats you want that are not covered here. And if you're at all familiar with SQL, then please do look at my queries to see how we got this and/or suggest other stats to get.

Let me know your thoughts.

hemanth commented 3 years ago

Thanks a ton @bazzadp, I see the PR is now merged.

I guess it is time for @hemanth @logicalphase to start working on the draft!

tunetheweb commented 3 years ago

Yup go for it. Let me know if you have any questions or anything else you want me to dig into but hopefully there’s quite a lot of stuff there for you to dig into!

Looking forward to seeing what you write!

logicalphase commented 3 years ago

Hey all. I've recently had surgery and been on the mend. I just need a few days this week to catch up on some things. But I'm ready to go after.

rviscomi commented 3 years ago

@logicalphase glad to hear you're recovering, and please take as much time as you need.

hemanth commented 3 years ago

@logicalphase Take care!

romaincurutchet commented 3 years ago

Hi, is there going to be an HTTP Archive table with all the PWA & Service Worker metrics? Thank you.

tunetheweb commented 3 years ago

Not sure what you mean? There are many metrics pulled from many of the tables. We don't in general create new tables with specific queries for a specific subject, but instead share the queries and the results from those queries.

That said, we did create a few helper tables from the August data to list all manifests and service worker JavaScript to help with the queries, but there are no plans to create those every month.

romaincurutchet commented 3 years ago

Hey, I was referring to this thread: https://discuss.httparchive.org/t/progressive-web-apps-in-the-http-archive/1401 In particular, I like the metrics under the section "Service Workers Analysis".

tunetheweb commented 3 years ago

Yes, that is basically the methodology we are following for this year's chapter (Thomas wrote last year's chapter, and the queries he created for that are being reused for this year's chapter, with a few more added).

We are currently analysing the results and will publish our thoughts later in the year. If you're curious, you can see the SQL used and the results sheet of those queries from the links at the top of this issue.

tunetheweb commented 3 years ago

Hey @hemanth / @logicalphase, did you get a chance to look over the stats yet? Do let us know if you think that's enough info to write the chapter, or if there are any other stats you think you'll need, and I can see whether they're possible.

P.S. Hope you're recovering from your surgery @logicalphase and don't feel pressured to reply if still dealing with that - your health is more important!

hemanth commented 3 years ago

Thanks @bazzadp!

I went through the PWA Sheet and it has almost all the information required for the topics on our radar.

Also, from our previous discussions, the metrics on BackgroundSync are as expected, right?

tunetheweb commented 3 years ago

Also, from our previous discussions, the metrics on BackgroundSync are as expected, right?

Yes, it appears to be. If you are aware of any examples in the wild using this, then feel free to ping me and I can see if they're in the data, but for now it doesn’t appear to be used much at all, particularly the periodic versions.

foxdavidj commented 3 years ago

@hemanth in case you missed it, we've adjusted the milestones to push the launch date back from November 9 to December 9. This gives all chapters exactly 7 weeks from now to wrap up the analysis, write a draft, get it reviewed, and submit it for publication. So the next milestone will be to complete the first draft by November 12.

However if you're still on schedule to be done by the original November 9 launch date we want you to know that this change doesn't mean your hard work was wasted, and that you'll get the privilege of being part of our "Early Access" launch.

Please see the link above for more info and reach out to @rviscomi or me if you have any questions or concerns about the timeline. We hope this change gives you a bit more breathing room to finish the chapter comfortably and we're excited to see it go live!

hemanth commented 3 years ago

Yes @obto

That's great news, sorry, I was AFK for a couple of days.

Will restart pawing at this, and I'm eager to see this go live too!

hemanth commented 3 years ago

@logicalphase @thepassle @jadjoubran @pearlbea @gokulkrishh @jaisanth

We should meet and have a quick discussion on a few of the steps we need to take to reach the finish line sooner. I understand that we are in different timezones and it is hard to find the best time... but let me propose 7.30PM PST 11/08/2020; hope that sounds like a plan.

hemanth commented 3 years ago

@logicalphase @thepassle @jadjoubran @pearlbea @gokulkrishh @jaisanth

The draft is ready for review. Please have a look and comment wherever it makes sense. Also have a look at the charts and let us know if they look good or need additions or deletions.

Shoutouts to @bazzadp for fine-tuning the graphs (lots of graphs!) 🙏

tunetheweb commented 3 years ago

@thepassle @jadjoubran @pearlbea @gokulkrishh @jaisanth @logicalphase any further comments on @hemanth's draft?

It would be good to move this forward by converting it to Markdown, but we should make sure you've all reviewed it and fed back any comments before then, as it's easier to manage the chapter in Google Docs initially.

thepassle commented 3 years ago

Sorry, I missed this. I'll take a look and dive in tomorrow 🙂

gokulkrishh commented 3 years ago

@hemanth @bazzadp Added a few suggestions via comments in the doc. Feel free to reject them if you feel they are not accurate. 💯 Awesome work on the content.

Excited for the full report of Web Almanac 2020 and CDS 🤗.

thepassle commented 3 years ago

Just went through the draft and left some comments 👍 Nice work so far

tunetheweb commented 3 years ago

@hemanth did you get a chance to look over and address the feedback? We need to start converting this to Markdown if we want to make the launch date in one week's time.

hemanth commented 3 years ago

@bazzadp I have addressed all the feedback comments. There is only one graph pending that needs more information, after which we should be good to convert it to Markdown. 👍

rviscomi commented 3 years ago

@hemanth that's great! I'd recommend that you start on the markdown conversion now and leave a placeholder for the outstanding graph, for example you can use a Jinja comment:

{# TODO(analysts, authors): Add graph for the XYZ metric. #}

That way we can review the markdown in a PR while the data viz is pending. That should help keep this chapter on schedule while we wait.
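
To keep track of placeholders like the one above during review, a small script can list any that remain in a chapter's markdown. This is only a sketch under stated assumptions: it assumes every placeholder follows the `{# TODO(owners): message #}` shape shown above, and `find_todos` and the sample text are hypothetical, not part of the Web Almanac tooling.

```python
import re

# Matches Jinja comments shaped like: {# TODO(owner, owner): message #}
TODO_PATTERN = re.compile(r"\{#\s*TODO\(([^)]*)\):\s*(.*?)\s*#\}", re.DOTALL)


def find_todos(markdown: str) -> list:
    """Return (line_number, owners, message) for each TODO placeholder found."""
    todos = []
    for match in TODO_PATTERN.finditer(markdown):
        # Count the newlines before the match to get a 1-based line number.
        line_number = markdown.count("\n", 0, match.start()) + 1
        owners = [owner.strip() for owner in match.group(1).split(",") if owner.strip()]
        todos.append((line_number, owners, match.group(2)))
    return todos


# Hypothetical chapter excerpt, used only to demonstrate the function.
sample = """## Service workers

{# TODO(analysts, authors): Add graph for the XYZ metric. #}

Some finished paragraph.
"""

for line_number, owners, message in find_todos(sample):
    print(f"line {line_number}: {', '.join(owners)}: {message}")
```

Running something like this before the final review gives a quick list of what is still outstanding and who owns it.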

hemanth commented 3 years ago

@rviscomi Are we using any specific tool, or will any text-to-Markdown CLI tool do?

rviscomi commented 3 years ago

Whatever is easiest for you. Personally, I'll copy the plain text from the doc, paste it into a text editor, and manually add the markdown syntax. @bazzadp has also had some useful advice about the process here and here.

hemanth commented 3 years ago

Had that as my second thought. Yeah, sounds like we are better off doing it manually; I will start working on it.

rviscomi commented 3 years ago

Sounds great, thanks for working on it

hemanth commented 3 years ago

#1613 is ready with the Markdown.

//cc @rviscomi @bazzadp