HTTPArchive / almanac.httparchive.org

HTTP Archive's annual "State of the Web" report made by the web community
https://almanac.httparchive.org
Apache License 2.0

Capabilities 2021 #2152

Closed rviscomi closed 2 years ago

rviscomi commented 3 years ago

Part II Chapter 14: Capabilities

If you're interested in contributing to the Capabilities chapter of the 2021 Web Almanac, please reply to this issue and indicate which role or roles best fit your interest and availability: author, reviewer, analyst, and/or editor.

Content team

| Lead | Authors | Reviewers | Analysts | Editors | Coordinator |
| --- | --- | --- | --- | --- | --- |
| @christianliebel | @christianliebel | @tomayac @hemanth | @tomayac | @tunetheweb | @obto |
More information about each role:

- The **[content team lead](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Content-Team-Leads'-Guide)** is the chapter owner and responsible for setting the scope of the chapter and managing contributors' day-to-day progress.
- **[Authors](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Authors'-Guide)** are subject matter experts and lead the content direction for each chapter. Chapters typically have one or two authors. Authors are responsible for planning the outline of the chapter, analyzing stats and trends, and writing the annual report.
- **[Reviewers](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Reviewers'-Guide)** are also subject matter experts and assist authors with technical reviews during the planning, analyzing, and writing phases.
- **[Analysts](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Analysts'-Guide)** are responsible for researching the stats and trends used throughout the Almanac. Analysts work closely with authors and reviewers during the planning phase to give direction on the types of stats that are possible from the dataset, and during the analyzing/writing phases to ensure that the stats are used correctly.
- **[Editors](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Editors'-Guide)** are technical writers who have a penchant for both technical and non-technical content correctness. Editors have a mastery of the English language and work closely with authors to help wordsmith content and ensure that everything fits together as a cohesive unit.
- The **[section coordinator](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Section-Leads'-Guide)** is the overall owner for all chapters within a section like "User Experience" or "Page Content" and helps to keep each chapter on schedule.

_Note: The time commitment for each role varies by the chapter's scope and complexity as well as the number of contributors._

For an overview of how the roles work together at each phase of the project, see the [Chapter Lifecycle](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Chapter-Lifecycle) doc.

Milestone checklist

0. Form the content team

1. Plan content

2. Gather data

3. Validate results

4. Draft content

5. Publication

Chapter resources

Refer to these 2021 Capabilities resources throughout the content creation process:

- 📄 Google Docs for outlining and drafting content
- 🔍 SQL files for committing the queries used during analysis
- 📊 Google Sheets for saving the results of queries
- 📝 Markdown file for publishing content and managing public metadata

christianliebel commented 3 years ago

@rviscomi I am happy to offer myself again as an author.

rviscomi commented 3 years ago

Thank you @christianliebel! I will tentatively add you as a peer reviewer to give a chance for anyone new to author if they're interested. But it was great working with you last year and I'd be happy to have you author again if no one else steps up. More context about the author selection rationale in #2165.

rviscomi commented 3 years ago

@christianliebel thanks for your interest in authoring this chapter! As the content team lead, you'll be responsible for the scope and direction of the chapter and keeping it on schedule. We automatically monitor the staffing and progress of each chapter based on the state of the initial comment so please keep that updated as you add new contributors and meet each milestone.

We've created a Google Doc for this chapter, which you're encouraged to use to collaborate with the content team on the initial outline, metrics, and ultimately the final draft.

Next steps for this chapter are:

@obto will be the section coordinator for this chapter, so they'll be periodically checking in with you directly to make sure the chapter is staying on schedule. Reach out to them here in this issue if you have any questions about the process.

More information about the content team lead and author roles and responsibilities is available for reference in the wiki if needed.

To anyone else interested in contributing to this chapter, please comment below to join the team!

tomayac commented 3 years ago

> @christianliebel thanks for your interest in authoring this chapter!

Thanks from my end as well :-)

Sign me up. This time my objective is to see if we can cover some more "behind a user gesture" data.

rviscomi commented 3 years ago

> Sign me up. This time my objective is to see if we can cover some more "behind a user gesture" data.

Great to have you back on board @tomayac! 🎉

Simulating user gestures to capture more realistic data sounds amazing. It's a big challenge that has been limiting us for years, so I'd be curious to hear your ideas. Any changes to the WPT infrastructure would probably need to be tested ahead of the big July crawl. cc @pmeenan FYI

tomayac commented 3 years ago

> Simulating user gestures to capture more realistic data sounds amazing. It's a big challenge that has been limiting us for years, so I'd be curious to hear your ideas. Any changes to the WPT infrastructure would probably need to be tested ahead of the big July crawl. cc @pmeenan FYI

To be quite honest, I was aiming a little lower: I propose we look more closely at response bodies and see if we can come up with heuristics that tell whether something is, say, just an article about an API, or actual code that would be executed if a user gesture happens. So string search paired with some if statements. Or, as they call it, Big Data™.
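One way to picture "string search paired with some if statements" is the minimal sketch below. It is not the heuristic the team used: the function name `classifyApiMention`, the three category labels, and the idea of only trusting matches in JavaScript resources or inline `<script>` elements are all assumptions made for illustration.

```javascript
// Hypothetical heuristic (an assumption, not the chapter's method): decide
// whether a response body merely talks about an API or contains code using it.
// `pattern` must be a RegExp WITHOUT the global flag, so test() stays stateless.
function classifyApiMention(body, contentType, pattern) {
  if (!pattern.test(body)) {
    return 'none';
  }
  // A match inside a JavaScript resource is most likely executable code.
  if (/javascript|ecmascript/i.test(contentType)) {
    return 'code';
  }
  if (/html/i.test(contentType)) {
    // Only look at the contents of inline <script> elements.
    const inlineScripts = [...body.matchAll(/<script\b[^>]*>([\s\S]*?)<\/script>/gi)]
      .map((match) => match[1])
      .join('\n');
    return pattern.test(inlineScripts) ? 'code' : 'prose';
  }
  // Any other content type: treat the match as text about the API.
  return 'prose';
}

// Example pattern for a Web Share call; real patterns would need tuning.
const webSharePattern = /navigator\.share\s*\(/;
const article = '<p>Call navigator.share() to open the share sheet.</p>';
const page = '<script>button.onclick = () => navigator.share({ url });</script>';
console.log(classifyApiMention(article, 'text/html', webSharePattern));
console.log(classifyApiMention(page, 'text/html', webSharePattern));
```

A sketch like this would still misclassify code samples embedded in inline scripts of documentation pages, which is the kind of subjectivity such heuristics have to deal with.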

I know (rather: have anecdotal knowledge of) the search bot clicks buttons etc. to see if something changes on the page (for example to see if a select box is used for navigational purposes), but I guess doing something like this in WPT would be harder. But I know very little about WPT, so maybe it's not completely out of reach?!

rviscomi commented 3 years ago

Simulating interactions is doable with WPT, but it'd be a first for HTTP Archive and doing it at scale would be challenging. Static analysis to determine "what kind of page is this" also sounds challenging and open to lots of subjectivity. If we go that route I'd be interested to see what heuristics would produce reliable signals.

tomayac commented 3 years ago

> Simulating interactions is doable with WPT but it'd be a first for HTTP Archive and doing it at scale would be challenging.

That's what I expected, so most probably not a pragmatic way to go.

> Static analysis to determine "what kind of page is this" also sounds challenging and open to lots of subjectivity. If we go that route I'd be interested to see what heuristics would produce reliable signals.

To be determined 🧪 I guess… In the worst case we simply update the 2020 queries.

rviscomi commented 3 years ago

@christianliebel can you edit the top comment to check off the 0th milestone? (helpful for us to monitor each chapter's progress at a glance in https://github.com/HTTPArchive/almanac.httparchive.org/issues/2179)

foxdavidj commented 3 years ago

@christianliebel @tomayac excited to work with you again this year 🎉

foxdavidj commented 3 years ago

Couple things:

  1. I've added links within the doc to the previous year's Google Doc so we can scan it to see if there are any ideas we'd like to revisit.
  2. What do you think about setting up a 30-minute Zoom call in the next couple of weeks to kick-start the chapter planning and brainstorming process? I'll reach out again later this week to find a time that works.

christianliebel commented 3 years ago

@obto Sure, we can follow up via email (in the doc) or Twitter group chat (@tomayac and @christianliebel, DMs are open) if you like.

foxdavidj commented 3 years ago

@christianliebel The chapter outline is due by June 15th, which is just over a week from now and it doesn't look like anything has been written as of yet. How can we help?

christianliebel commented 3 years ago

> @christianliebel The chapter outline is due by June 15th, which is just over a week from now and it doesn't look like anything has been written as of yet. How can we help?

@obto The outline depends on what data we can get. If we have usage data for APIs that require user action, I would focus on those. The fallback plan is to compare how API usage has changed from last year. Since it looks like we can manage to get data for the new APIs, I would prepare the outline for that scenario. However, we may have to change the plan again later.

christianliebel commented 3 years ago

@demianrenzulli @tunetheweb @webmaxru @Schweinepriester @thepassle @hemanth @tropicadri @andreban @jeffposnick @logicalphase Dear PWA content team, we could use some more reviewers for the Capabilities chapter. As it's thematically related to PWA, I wanted to ask if one of you could step up as an additional reviewer for our chapter. I really appreciate any help you can provide.

hemanth commented 3 years ago

@christianliebel I am closely following Fugu and would love to review the Capabilities chapter.

christianliebel commented 3 years ago

> @christianliebel I am closely following Fugu and would love to review the Capabilities chapter.

Excellent, thank you so much!

foxdavidj commented 3 years ago

> > @christianliebel The chapter outline is due by June 15th, which is just over a week from now and it doesn't look like anything has been written as of yet. How can we help?
>
> @obto The outline depends on what data we can get. If we have usage data for APIs that require user action, I would focus on those. The fallback plan is to compare how API usage has changed from last year. Since it looks like we can manage to get data for the new APIs, I would prepare the outline for that scenario. However, we may have to change the plan again later.

Sounds good. I totally understand, and expect, that the outline will change over time. I just want to make sure we have everything set up to collect your data before the crawler starts on July 1.

tomayac commented 3 years ago

Collecting custom metrics for Project Fugu APIs is happening in https://github.com/HTTPArchive/legacy.httparchive.org/pull/208.

tomayac commented 3 years ago

@christianliebel @hemanth: We can query data now 😃! Here's an example (SQL courtesy of @rviscomi):

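CREATE TEMP FUNCTION getFuguAPIs(data STRING) RETURNS ARRAY<STRING> LANGUAGE js AS '''
// Return the names of the detected Fugu APIs (the keys of the custom metric object).
try {
  const $ = JSON.parse(data);
  return Object.keys($);
} catch (e) {
  // Missing or malformed metric: report no APIs instead of failing the query.
  return [];
}
''';

-- List up to 100 desktop pages on which the given Fugu API was detected.
SELECT
  fuguAPI,
  url,
  payload
FROM
  `httparchive.pages.2021_07_01_desktop`,
  UNNEST(getFuguAPIs(JSON_QUERY(payload, '$."_fugu-apis"'))) AS fuguAPI
WHERE
  JSON_QUERY(payload, '$."_fugu-apis"') != "[]"
  AND fuguAPI = "File System Access"
GROUP BY
  fuguAPI,
  url,
  payload
LIMIT 100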

In the query, fuguAPI can be any of…

WebBluetooth
WebUSB
Web Share
Web Share (Files)
Async Clipboard
Async Clipboard (Images)
Contact Picker
getInstalledRelatedApps
Compression Streams
Periodic Background Sync
Badging
Shape Detection (Barcodes)
Shape Detection (Faces)
Shape Detection (Texts)
Screen Wake Lock
Content Index
Credential Management
WebOTP
File System Access
Pointer Lock (unadjustedMovement)
WebHID
WebSerial
WebNFC
Run On Login
WebCodecs
Digital Goods
Idle Detection
Storage Foundation
Handwriting Recognition
Compute Pressure
Accelerometer
Gyroscope
Absolute Orientation Sensor
Relative Orientation Sensor
Gravity Sensor
Linear Acceleration Sensor
Magnetometer
Ambient Light Sensor
File Handling
Notification Triggers
Local Font Access
Multi-Screen Window Placement
WebSocketStream
WebTransport
Gamepad
WebGPU
Window Controls Overlay
Web Share Target
Web Share Target (Files)
Shortcuts
Declarative Link Capturing
Tabbed Application Mode
URL Handlers
Protocol Handlers
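
To make the UDF's role clearer, here is a minimal standalone JavaScript sketch that mirrors `getFuguAPIs`, guards against missing or malformed metrics, and counts on how many pages each API name appears. The helper `countPagesPerApi` and the sample strings are hypothetical; in particular, the shape of the values stored under each API name is an assumption, since the thread only shows that the keys are API names.

```javascript
// Mirrors the getFuguAPIs UDF: the API names are the keys of the parsed metric.
function getFuguAPIs(data) {
  try {
    const parsed = JSON.parse(data);
    // JSON.parse(null) yields null, and a bare string or number has no keys.
    return parsed && typeof parsed === 'object' ? Object.keys(parsed) : [];
  } catch (e) {
    // Malformed JSON: treat the page as using no Fugu APIs.
    return [];
  }
}

// Hypothetical helper: count the pages on which each API was detected.
function countPagesPerApi(metricStrings) {
  const counts = new Map();
  for (const data of metricStrings) {
    for (const api of getFuguAPIs(data)) {
      counts.set(api, (counts.get(api) || 0) + 1);
    }
  }
  return counts;
}

// Made-up sample metrics; the `true` values are placeholders (assumption).
const sampleMetrics = [
  '{"Web Share": true, "Badging": true}',
  '{"Web Share": true}',
  '[]',
  null,
];
const pagesPerApi = countPagesPerApi(sampleMetrics);
console.log(pagesPerApi.get('Web Share')); // 2
```

In BigQuery the same aggregation would typically be a `COUNT` grouped by `fuguAPI`; the JavaScript version is only meant to show the logic in isolation.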

tomayac commented 3 years ago

Example spreadsheet (shared with @hemanth and @christianliebel and the Google Chrome organization.)

foxdavidj commented 3 years ago

@christianliebel @tomayac Looks like you're making steady progress on analyzing the data.

Just wanted to let you know we've got 3 chapters that have their analysis complete (PWA, Mobile Web, Accessibility) that you can refer to if you ever have a question about how to grab the data you need.

Looking forward to seeing the chapter start to take form, and if you ever have any questions, just ping me.

- PWA: Queries, Results (has all their visualizations done as well)
- Mobile Web: Queries, Results
- A11Y: Queries, Results

tomayac commented 3 years ago

I have just added the queries as part of https://github.com/HTTPArchive/almanac.httparchive.org/pull/2322.

christianliebel commented 2 years ago

@tomayac @hemanth The chapter is now ready for review: https://github.com/christianliebel/web-almanac-2021-capabilities

Compared to the draft outline, I have omitted all APIs for which we have no data. I'm looking forward to your review comments!

rviscomi commented 2 years ago

@christianliebel great to see the completed draft! What's the best way to give review feedback? A couple of ideas:

Also, what's your plan for figures and data visualizations?

hemanth commented 2 years ago

@christianliebel Nice! I have left a few comments on the PR.

christianliebel commented 2 years ago

@rviscomi I created a PR to merge the markdown into the almanac repository here: https://github.com/HTTPArchive/almanac.httparchive.org/pull/2398. I do not yet have a plan for figures or data visualizations. We don't have data to compare from last year, and I don't know if we already have historical data for the new custom metrics. Maybe @tomayac can shed some light on this?

@hemanth Thanks for your comments, I'll have a look at them.

christianliebel commented 2 years ago

@hemanth I hopefully resolved all your comments in #2398.

@rviscomi @tomayac The only visualization I could think of is showing the usage (absolute/relative) of the different APIs on a logarithmic scale.
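
As a rough sketch of what such a chart would need, the snippet below turns absolute page counts into relative usage and a log10 value per API. The function name `toLogScaleRows`, the API labels, and all numbers are placeholders for illustration, not measured results.

```javascript
// Hypothetical helper: derive relative usage and a log10 position per API.
function toLogScaleRows(countsByApi, totalPages) {
  return Object.entries(countsByApi)
    // log10(0) is -Infinity, so APIs with no detected usage are left out.
    .filter(([, pages]) => pages > 0)
    .map(([api, pages]) => ({
      api,
      pages,                            // absolute usage
      percent: (pages / totalPages) * 100, // relative usage
      log10Pages: Math.log10(pages),    // position on a logarithmic axis
    }))
    // Most-used APIs first.
    .sort((a, b) => b.pages - a.pages);
}

// Placeholder numbers only (assumption), chosen to span several magnitudes.
const chartRows = toLogScaleRows(
  { 'API A': 100000, 'API B': 1000, 'API C': 0 },
  1000000
);
console.log(chartRows.map((row) => row.api));
```

A logarithmic axis keeps rarely used APIs visible next to widely used ones, which a linear axis would flatten to near zero.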

The frontmatter is now also finalized.

hemanth commented 2 years ago

Yes @christianliebel, thank you, it looks great! The only addition we are still looking into would be a few visualizations.

rviscomi commented 2 years ago

@christianliebel have you also considered big numbers?

rviscomi commented 2 years ago

@christianliebel @tomayac @hemanth

Thank you all for your hard work getting this chapter over the finish line in time for the pre-release! Congratulations on finishing the chapter, and I'm excited to see us launch the rest of the chapters alongside it on Wednesday 🎉

When you get 5 minutes, I'd really appreciate it if you could fill out our contributor survey to tell us (the project leads) about your experience. It's super helpful to hear what went well or what could be improved for next time. 🙏