HTTPArchive / almanac.httparchive.org

HTTP Archive's annual "State of the Web" report made by the web community
https://almanac.httparchive.org
Apache License 2.0

CSS 2021 #2140

Closed: rviscomi closed this issue 2 years ago

rviscomi commented 3 years ago

Part I Chapter 1: CSS

CSS illustration

If you're interested in contributing to the CSS chapter of the 2021 Web Almanac, please reply to this issue and indicate which role or roles best fit your interest and availability: author, reviewer, analyst, and/or editor.

Content team

| Lead | Authors | Reviewers | Analysts | Editors | Coordinator |
| --- | --- | --- | --- | --- | --- |
| @meyerweb | @geekboysupreme @meyerweb | @j9t @svgeesus @argyleink @una @estelle @LeaVerou @rachelandrew @jabranr @tomhodgins | @rviscomi @thecraftysoul | - | @rviscomi |
More information about each role:

- The **[content team lead](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Content-Team-Leads'-Guide)** is the chapter owner and responsible for setting the scope of the chapter and managing contributors' day-to-day progress.
- **[Authors](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Authors'-Guide)** are subject matter experts and lead the content direction for each chapter. Chapters typically have one or two authors. Authors are responsible for planning the outline of the chapter, analyzing stats and trends, and writing the annual report.
- **[Reviewers](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Reviewers'-Guide)** are also subject matter experts and assist authors with technical reviews during the planning, analyzing, and writing phases.
- **[Analysts](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Analysts'-Guide)** are responsible for researching the stats and trends used throughout the Almanac. Analysts work closely with authors and reviewers during the planning phase to give direction on the types of stats that are possible from the dataset, and during the analyzing/writing phases to ensure that the stats are used correctly.
- **[Editors](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Editors'-Guide)** are technical writers who have a penchant for both technical and non-technical content correctness. Editors have a mastery of the English language and work closely with authors to help wordsmith content and ensure that everything fits together as a cohesive unit.
- The **[section coordinator](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Section-Leads'-Guide)** is the overall owner for all chapters within a section like "User Experience" or "Page Content" and helps to keep each chapter on schedule.
_Note: The time commitment for each role varies by the chapter's scope and complexity as well as the number of contributors._ For an overview of how the roles work together at each phase of the project, see the [Chapter Lifecycle](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Chapter-Lifecycle) doc.

Milestone checklist

0. Form the content team

1. Plan content

2. Gather data

3. Validate results

4. Draft content

5. Publication

Chapter resources

Refer to these 2021 CSS resources throughout the content creation process:

- 📄 Google Docs for outlining and drafting content
- 🔍 SQL files for committing the queries used during analysis
- 📊 Google Sheets for saving the results of queries
- 📝 Markdown file for publishing content and managing public metadata

meyerweb commented 2 years ago

@rviscomi: On the subject of the custom property names, I’m dubious of some of the affiliation assignments. As an example, all color-like properties (--white, --black, --red, etc.) are credited to Bootstrap. Bootstrap may have those names baked in, but if there’s one class of custom property names I can imagine people inventing on their own, it’s names like those. Do the queries already check in such cases to make sure there are other, more obviously Bootstrap-ish custom properties (or other things, like a Bootstrap license statement) to be more certain of provenance? If not, could they?

(Also, shout out to WordPress for hand-namespacing all their custom properties. If only every project were so circumspect!)

rviscomi commented 2 years ago

There’s a huge discrepancy between this year’s custom property names analysis and last year’s. Last year, Avada was on top with nearly 35% of all custom property names. This year, it got 3.7%. Both analyses are (or at least appear to be) based on mobile. It feels like either Avada didn’t get the credit it should this year, or got way too much last year.

The custom property name sheet only shows the top 1000 most popular values as a percentage of pages. It seems like this year there are many mobile pages with Avada that fell just below that threshold.

For example, about 50k desktop pages (0.8%) include Avada custom properties like --color_*. Since there are more than 1M fewer desktop pages than mobile, this makes up a larger percentage of all desktop pages. For comparison, 50k is only about 0.67% of all mobile pages, which is right at the 1000 cutoff.
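A quick back-of-the-envelope check of those figures (a sketch only; the 50k, 0.8%, and 0.67% values are the ones quoted above, and the corpus totals are derived from them, not measured):

```python
# Back-of-the-envelope check of the quoted figures (derived, not measured).
avada_pages = 50_000                   # pages with Avada-style custom properties

desktop_total = avada_pages / 0.008    # 0.8% of desktop -> ~6.25M pages
mobile_total = avada_pages / 0.0067    # 0.67% of mobile -> ~7.46M pages

# The mobile corpus comes out more than 1M pages larger than desktop,
# so the same 50k pages is a smaller share of mobile and can fall
# just below the top-1000 cutoff there.
print(f"{desktop_total:,.0f} desktop vs {mobile_total:,.0f} mobile")
```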

This methodology doesn't really lend itself well to how the metric is being used in practice. We're manually annotating each custom property name with its corresponding tool (Avada, WordPress, Bootstrap, etc) so it would be impractical to scale this out beyond 1000 names. I think a better methodology we should explore for next year would be to automate that annotation logic in the SQL and remove the LIMIT 1000 so that we can aggregate tool-level stats over all custom properties.
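To sketch what that automated annotation could look like (illustrative only: the patterns below are hypothetical stand-ins, not the actual mappings from the 2020 sheet, and the real logic would live in the SQL itself):

```python
import re
from collections import Counter

# Hypothetical name -> tool patterns; the real list would be ported from
# the manual annotations in the 2020 sheet and be far more complete.
TOOL_PATTERNS = [
    (re.compile(r"^--wp-"), "WordPress"),    # e.g. --wp-admin-theme-color
    (re.compile(r"^--color_"), "Avada"),     # e.g. --color_primary
    (re.compile(r"^--(blue|red|white|gray-dark)$"), "Bootstrap"),
]

def classify(name: str) -> str:
    """Attribute a custom property name to the tool that likely emitted it."""
    for pattern, tool in TOOL_PATTERNS:
        if pattern.search(name):
            return tool
    return "Miscellaneous"

# With no LIMIT 1000, tool-level stats aggregate over every property seen:
sample = ["--wp-admin-theme-color", "--color_primary", "--red", "--my-own-var"]
totals = Counter(classify(n) for n in sample)
print(totals)
```

The same grouping could be expressed in BigQuery with a `CASE` over `REGEXP_CONTAINS`, which would also open the door to the page-level provenance checks Eric suggested.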

For this year, we could use the more comparable desktop data and/or include a note in the chapter explaining how our approach affected the results.

Do the queries already check in such cases to make sure there are other, more obviously Bootstrap-ish custom properties (or other things, like a Bootstrap license statement) to be more certain of provenance? If not, could they?

This is partially answered above. Because we're manually annotating each custom property name in Sheets, we don't have the full context of what other properties are used on a given page. If we implement the annotations in SQL we'd have more control to implement a solution like the one you suggested.

meyerweb commented 2 years ago

The custom property name sheet only shows the top 1000 most popular values as a percentage of pages. It seems like this year there are many mobile pages with Avada that fell just below that threshold.

For example, about 50k desktop pages (0.8%) include Avada custom properties like --color_*. Since there are more than 1M fewer desktop pages than mobile, this makes up a larger percentage of all desktop pages. For comparison, 50k is only about 0.67% of all mobile pages, which is right at the 1000 cutoff.

This methodology doesn't really lend itself well to how the metric is being used in practice. We're manually annotating each custom property name with its corresponding tool (Avada, WordPress, Bootstrap, etc) so it would be impractical to scale this out beyond 1000 names. I think a better methodology we should explore for next year would be to automate that annotation logic in the SQL and remove the LIMIT 1000 so that we can aggregate tool-level stats over all custom properties.

For this year, we could use the more comparable desktop data and/or include a note in the chapter explaining how our approach affected the results.

Oof. I am fairly strongly motivated to drop this analysis this year (possibly with a note as to why) and come back to it in 2022, assuming more robust analysis is created. The only alternative I see is what you propose, switching to desktop and adding a note explaining the uncertainties in the methodology, and that is itself an argument for just taking the analysis out entirely.

I would be happy to take a part in crafting that more robust analysis, to be clear. I’m just not sure it can be put together in time for this year’s Almanac, given my delay in getting started and the likely complexity of the task.

meyerweb commented 2 years ago

General progress update: I got up to “CSS Mistakes” today, which means I’ll need to write that and the “Sass” section tomorrow, then start back at beginning on resolving suggestions, making my own edits, tackling anything missing, etc. I’d hoped to finish primary drafting today, but pushing any further at this point would probably result in garbage I’d just have to replace tomorrow anyway.

So, reviewers: it’s time to really get started. Please make minor edits as suggestions (switch your edit mode in Google Docs to “Suggesting” in the top right of the Doc), and anything major can be either a comment on the Google Doc or here in the issue. I’d prefer comments in the Doc, but if the comment feels too huge for a sidebar comment, this is a reasonable alternative.

@GeekBoySupreme, given that I’ve gotten this far, how about you write the intro to each section as you go through the chapter? I marked missing intros with orange and the text “TK intro”. It would be a big help if you could do that in addition to editorial review.

bkardell commented 2 years ago

manually annotating each custom property name in Sheets, we don't have the full context of what other properties are used on a given page. If we implement the annotations in SQL we'd have more control to implement a solution like the one you suggested.

I kind of agree with Eric's thought to drop or at least revise this part significantly. It might be nice to say something bland but interesting about custom properties, like "X number of custom properties appear on over N pages", and then explain something like "here are the ones we feel really confident about" (--wp-* is pretty good/specific), but also note that there's a bunch that might or might not be other systems, and we haven't figured out a good way to count them yet.

LeaVerou commented 2 years ago

@rviscomi @LeaVerou There’s a huge discrepancy between this year’s custom property names analysis and last year’s. Last year, Avada was on top with nearly 35% of all custom names. This year, it got 3.7%. Both analyses are (or at least appear to be) based on mobile. It feels like either Avada didn’t get the credit it should this year, or got way too much last year.

A glance at the pivot table this year (see docs.google.com/spreadsheets/d/12vQIA0xsC5Jr3J9Sh03AcAvgFjMAmP1xSS6Tjai9LF0/edit#gid=725813203) makes me think the affiliation may have gone haywire one way or the other: Avada gets 31.48% of desktop, and 3.67% of mobile. There is a smaller but still substantial discrepancy with Bootstrap between desktop and mobile.

FYI: I filled in all the unaffiliated custom properties with “Miscellaneous” in this year’s sheet, so the pivot table and thus the pie chart wouldn’t have a (blank) entry. If those end up being overwritten by a new analysis run, no worries. Let me know what you find, or what I can do to help dig into this.

What was the methodology this year for assigning custom properties to software? I think last year I did it manually by searching on GitHub. But I see Avada still has 31.48% on desktop; are you sure last year's analysis was based on mobile?

LeaVerou commented 2 years ago

@rviscomi: On the subject of the custom property names, I’m dubious of some of the affiliation assignments. As an example, all color-like properties (--white, --black, --red, etc.) are credited to Bootstrap. Bootstrap may have those names baked in, but if there’s one class of custom property names I can imagine people inventing on their own, it’s names like those. Do the queries already check in such cases to make sure there are other, more obviously Bootstrap-ish custom properties (or other things, like a Bootstrap license statement) to be more certain of provenance? If not, could they?

(Also, shout out to WordPress for hand-namespacing all their custom properties. If only every project were so circumspect!)

Note that Bootstrap these days does namespace its custom properties, these ones are from old Bootstrap. I think last year when I assigned them manually, I did take into account which groups of properties had similar usage percentages.

rviscomi commented 2 years ago

I kind of agree with Eric's thought to drop or at least revise this part significantly. It might be nice to say something bland but interesting about custom properties, like "X number of custom properties appear on over N pages"

+1 to dropping the software grouping if it's problematic this year.

We could still show ungrouped stats for individual custom properties if interesting, for example the top 10 custom property names:

| name | desktop | mobile |
| --- | --- | --- |
| --wp--style--color--link | 18.58% | 18.13% |
| --wp-admin-theme-color | 7.36% | 7.52% |
| --red | 7.40% | 7.23% |
| --blue | 7.39% | 7.21% |
| --green | 7.37% | 7.19% |
| --dark | 7.31% | 7.15% |
| --white | 7.37% | 7.07% |
| --primary | 7.20% | 6.93% |
| --secondary | 7.14% | 6.86% |
| --gray-dark | 7.06% | 6.82% |

What was the methodology this year for assigning custom properties to software?

@LeaVerou I tried to manually reproduce your pattern matching from 2020 on the 2021 data.

meyerweb commented 2 years ago

We could still show ungrouped stats for individual custom properties if interesting, for example the top 10 custom property names:

That’s it, that’s the play. Thanks, Rick!

meyerweb commented 2 years ago

@rviscomi: New question: the Layout Methods tab returns a figure for Grid of 37% of pages. But the next tab, Flexbox and Grid, says Grid is used on 8% of pages. I feel like I must be misunderstanding one tab or the other, but I’m not sure how to tell or how to clarify them so that I understand what each is saying. Can you clarify, please?

rviscomi commented 2 years ago

@meyerweb see this comment thread

meyerweb commented 2 years ago

@j9t @svgeesus @argyleink @una @estelle @LeaVerou @rachelandrew @jabranr @tomhodgins @GeekBoySupreme @bkardell It’s time! Please review the now-complete draft at your earliest convenience. Thank you!

As mentioned before, it’s preferred to leave comments on the Google Doc with questions/comments/concerns, unless what you need to say is too long for a Google Doc sidenote. In that case, please comment on this issue and leave a comment in the Google Doc with the URL of your GitHub comment.

GeekBoySupreme commented 2 years ago

@rviscomi I pulled the methodology text from last year's almanac. Could you help modify the numbers to fit what we did this year?

"The data in this chapter took 121 SQL queries to produce, totaling over 10K lines of SQL including 3K lines of JavaScript functions within the SQL. This makes it the largest chapter in the Web Almanac’s history."

and

"A lot of engineering work went into making this scale of analysis feasible. Like last year, we put all CSS code through a CSS parser, and stored the Abstract Syntax Trees (AST) for all stylesheets in the corpus, resulting in 10 TB

j9t commented 2 years ago

Please review the now-complete draft at your earliest convenience. Thank you!

@meyerweb, on it; what timeframe can you grant us? (I’m pretty busy but would like to contribute, in time for it to be useful.)

meyerweb commented 2 years ago

I believe we’re supposed to be done with editing by this Sunday (@rviscomi?), so by Thursday of this week would be ideal. Or, if I’m wrong about the deadline or we can get a slight extension, by the end of the weekend.

rviscomi commented 2 years ago

Getting all reviewer feedback in by Thursday SGTM. If taking an extra day or two means having a more polished chapter, then that's a tradeoff I'd be willing to make.

Just to clarify, after the authors write the chapter and reviewers' technical feedback is incorporated, there's still a separate non-technical editing pass. We'll add all of the written-but-unedited chapters into a queue and someone from the Editors team will open a new PR with suggested edits to the markdown. If we had a bigger Editors team we would have assigned one of them to the chapter and streamlined their feedback directly in the doc.

j9t commented 2 years ago

Thanks for clarifying. I might only be able to provide feedback this weekend, but I’ll see what I can do.

svgeesus commented 2 years ago

I'm working my way through the doc.

One area that we didn't examine last year (mainly because of overlap with the Fonts chapter, so we ended up dropping all the font analysis - although it then ended up unexamined in both CSS and Fonts chapters) was the usage of font-* properties.

This year, it is at least briefly examined, but the lack of usage isn't really commented on at all.

(I also think there is some mis-characterisation regarding font-variant and font-stretch in the doc at present, which I commented on there).

I think it would be worth looking at that in a bit more detail because there is a good takeaway here for the readers this year. They are missing out on properties which are widely implemented, but simply not being used in the wild.

As an example, the most common font-variant longhand, font-variant-ligatures, is used on only 2.06% of mobile sites. But why? It is widely implemented (everything but IE11) and, like all the font-variant-* longhands, it has a great fallback story if unsupported.

Similarly for font-variant-numeric at 1.75% and again, implemented everywhere.

font-variant-caps is next at 0.69% on mobile, likely because people are using the shorthand to get the most common value.

And then font-variant-east-asian at 0.18% on mobile (maybe next year a per-country analysis would be more informative for that property, given the predominance of European languages on the Web at present).

Is it lack of typographic interest or awareness? Is it excessive caution about using these OpenType features in case the webfont doesn't load? Or is it the need to support older browsers (older than about 2016, when this support mainly rolled out)?

Usage at a couple of percent is common enough that this isn't some brand-new feature at 0.001%, but there are also literally no downsides to using these properties, so I think we could make a bit of a story about that aspect.

In terms of visualizations, a separate chart with font-variant and font-variant-* would likely be helpful. And perhaps, a graphic showing what these OpenType features can do.

svgeesus commented 2 years ago

Okay I have gone through the whole doc now and added comments as needed.

j9t commented 2 years ago

Ditto, I’m done too. Hard to find anything to flag, Eric, team 🙏 I look forward to this being published.

meyerweb commented 2 years ago

And it’s off to editing! Thank you, everyone. And if I overlooked or misidentified anyone in the contributors listing (see https://20211115t173622-dot-webalmanac.uk.r.appspot.com/en/2021/css), please accept an abject apology and an offer to reinstate you forthwith.

rviscomi commented 2 years ago

🎉 This chapter is fully written, reviewed, edited, and ready to be launched on Wednesday! Thank you to all of the contributors who put in the time and effort to make this a great chapter.

rviscomi commented 2 years ago

@meyerweb @j9t @svgeesus @argyleink @una @estelle @LeaVerou @rachelandrew @jabranr @tomhodgins @thecraftysoul

When you get 5 minutes, I'd really appreciate if you could fill out our contributor survey to tell us (the project leads) about your experience. It's super helpful to hear what went well or what could be improved for next time. 🙏

Thank you all again and I'm excited for this to launch soon!