HTTPArchive / almanac.httparchive.org

HTTP Archive's annual "State of the Web" report made by the web community
https://almanac.httparchive.org
Apache License 2.0

Finalize assignments: Chapter 6. Fonts #7

Closed rviscomi closed 5 years ago

rviscomi commented 5 years ago
Section Chapter Coauthors Reviewers
I. Page Content 6. Fonts @davelab6 @zachleat @hyperpress @AymenLoukil

Due date: To help us stay on schedule, please complete the action items in this issue by June 3.

To do:

Current list of metrics:

Variable fonts:

Others:

Top Fonts:

Formats:

Optimizations:

👉Optional AI (coauthors): Peer reviewers are trusted experts who can support you when brainstorming metrics, interpreting results, and writing the report. Ideally this chapter will have multiple reviewers who can promote a diversity of perspectives. You currently have 1 peer reviewer.

👉 AI (coauthors): Finalize which metrics you might like to include in an annual "state of web fonts" report powered by HTTP Archive. Community contributors have initially sketched out a few ideas to get the ball rolling, but it's up to you, the subject matter experts, to know exactly which metrics we should be looking at. You can use the brainstorming doc to explore ideas.

The metrics should paint a holistic, data-driven picture of the web fonts landscape. The HTTP Archive does have its limitations and blind spots, so if there are metrics out of scope it's still good to identify them now during the brainstorming phase. We can make a note of them in the final report so readers understand why they're not discussed and the HTTP Archive team can make an effort to improve our telemetry for next year's Almanac.

Next steps: Over the next couple of months analysts will write the queries and generate the results, then hand everything off to you to write up your interpretation of the data.

Additional resources:

foxdavidj commented 5 years ago

This is a bit of a crossover with #9 but, for sites using Google Fonts... what percentage of them use 1 font, 2?, 3?, 4? Font here meaning a query for family=Roboto:300,400,500,700 would be 4 different fonts.

Would like to see how this relates to overall page speed as well.
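The counting rule described above (one font per listed weight or style) can be sketched roughly as follows. This is only an illustration: `count_google_fonts` is a hypothetical helper name, and it assumes the older `css?family=` URL style shown in the comment, where multiple families are separated by `|`.

```python
from urllib.parse import urlparse, parse_qs


def count_google_fonts(css_url: str) -> int:
    """Count fonts requested by a Google Fonts CSS URL (family= syntax).

    Each listed weight/style of a family counts as one font; a family
    with no explicit list counts as a single font.
    """
    query = parse_qs(urlparse(css_url).query)
    total = 0
    for family_param in query.get("family", []):
        # Families are separated by "|"; styles follow ":" and are comma-separated.
        for family in family_param.split("|"):
            if not family.strip():
                continue
            _, _, styles = family.partition(":")
            variants = [s for s in styles.split(",") if s.strip()]
            total += len(variants) if variants else 1
    return total
```

Running this over every Google Fonts stylesheet URL requested by a page, and bucketing the totals, would give the 1 / 2 / 3 / 4+ distribution asked about here.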

rviscomi commented 5 years ago

@davelab6 @zachleat We're hoping to finalize all chapters today. I still need a couple of things from you. Please go to https://github.com/HTTPArchive and accept the invitation to join the Authors team. This ensures that you get author-specific communications, I can assign issues to you, and you can edit issues like this one. Once you've done that, could you edit https://github.com/HTTPArchive/almanac.httparchive.org/issues/7#issue-446805782 and check off the remaining items on the TODO list when you're happy with them? I've updated it with the latest metrics in the brainstorming doc.

logicalphase commented 5 years ago

@rviscomi - I think we are ready to close this unless we need to wait until the invitations are accepted. I'm going to close it to stay on schedule, but we can reopen if need be.

rviscomi commented 5 years ago

Tentatively reopening this issue for @davelab6 or @zachleat to confirm the metrics. Haven't seen any comments from them so far, so I just want to make sure they're fully on board.

🛎 Dave/Zach, please chime in if there are any other metrics you think we should be looking at. Also please see https://github.com/HTTPArchive/almanac.httparchive.org/issues/7#issuecomment-498470293 for more info about accepting your pending Author team invitations.

paulcalvano commented 5 years ago

Would be interesting to show

rviscomi commented 5 years ago

@davelab6 @zachleat could you PTAL at your earliest convenience?

rviscomi commented 5 years ago

  • New modes of typographic expression
  • New ways to make quality text typography

For these metrics, what exactly is the quantifiable measurement? For example we're hoping to get a list of things that represent the state of fonts: web vs system, top fonts, adoption of optimization techniques, etc.

Hoping to have a list of 10 or so of these kinds of "state of fonts" metrics by the end of today. ❤️

davelab6 commented 5 years ago

@rviscomi thanks for the pings on this, I've just accepted the invite and will focus on this today.

rviscomi commented 5 years ago

Thanks Dave, I appreciate your time on this!

zachleat commented 5 years ago

I’ll just leave my pipe-dream ideas of metrics I wish we could have and y’all can just shoot them down:

I think the ones y’all have above are good.

I seriously doubt this is available but I’d love to see info on font file internals too: maybe we can look at OpenType Features from CSS use (though some of these are on by default)? Hinting, kerning, ligature use

rviscomi commented 5 years ago

@davelab6 anything else to add to Zach's list? if not we can close this out.

rviscomi commented 5 years ago

I've added Zach's suggestion to the official list in the top comment. Closing this out.

davelab6 commented 5 years ago

Thanks again for your patience with me on this @rviscomi - hope the below is the kind of thoughts you were seeking.

So, the list at the top was as of this comment:

  • Local vs hosted
  • Popular hosts
  • Font formats
  • Font-Display usage
  • Variable fonts
    • Latency gains on existing families
    • New modes of typographic expression
    • New ways to make quality text typography

For measuring latency gains on existing families, we'd need historical data. We'd then need to start with the list of sites that are using a variable font with a wght axis, and note their latency profile, then look back at historical data and see when the VF was introduced, and what the latency profile was before that.

For measuring an increase in "typographic expression" and "new ways to make quality text typography", I have some proposals for quantifiable measurement. We'd need to know which font-variation-settings properties are in use on the pages that HTTPArchive indexes, as well as the higher-level font selectors font-weight, font-stretch and font-size which, when associated with a @font-face variable font, call the VF wght, wdth and opsz axes.

First, we need a list of how many pages in the HTTPArchive link to a variable font via @font-face, and then we get a metric of percent of pages using VFs. This is probably very low as a total percentage, but with historical data we could make a count today and X months ago and then show a % growth over time, which might seem larger ;)

Second, we need to know, of those pages linking to a VF, how many are using the 4 font selectors that select on a variable font family, per page and per domain. There's likely to be a certain share of sites that link to a VF but don't use it for styles beyond what was available before. It might be interesting to find out, of pages linking to a VF, how many use @supports to screen for variation-capable browsers and presumably provide fallback typography. To quantify the use of newly available styles, we'd want to know who is using new CSS4 values like font-weight: 555. I also expect font-stretch to have had very little historical usage, and for its use to be climbing rapidly. For font-size, we would need to resolve them all to points, then load the font binaries to read their fvar table data to find out the range of the opsz axis, and then see, for each font-size associated with a VF font-family, whether that size is within the opsz range. If it is, then the type designer has designed the font for that usage ("quality text typography"); if it is outside the range, then the type is being scaled without optical adjustment.
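As a rough sketch of the font-size step, the snippet below resolves a CSS font-size to points and checks it against an opsz range. The function names are hypothetical, the opsz minimum and maximum are assumed to have already been read from the font's fvar table by some other tool, and relative units (em, rem, %) are resolved against an assumed 16px base, which will not hold for every page.

```python
import re
from typing import Optional

# Points per CSS absolute unit (CSS defines 1in = 96px = 72pt).
PT_PER_UNIT = {"pt": 1.0, "px": 0.75, "pc": 12.0, "in": 72.0,
               "cm": 72.0 / 2.54, "mm": 72.0 / 25.4}

_SIZE = re.compile(r"([0-9]*\.?[0-9]+)\s*([a-z%]+)")


def font_size_to_pt(value: str, base_px: float = 16.0) -> Optional[float]:
    """Resolve a CSS font-size to points, or None if it cannot be resolved."""
    match = _SIZE.fullmatch(value.strip().lower())
    if match is None:
        return None  # keywords such as "large", calc() expressions, etc.
    number, unit = float(match.group(1)), match.group(2)
    if unit in PT_PER_UNIT:
        return number * PT_PER_UNIT[unit]
    if unit in ("em", "rem"):
        # Assumption: relative units resolve against base_px.
        return number * base_px * PT_PER_UNIT["px"]
    if unit == "%":
        return number / 100.0 * base_px * PT_PER_UNIT["px"]
    return None


def size_within_opsz(value: str, opsz_min: float, opsz_max: float,
                     base_px: float = 16.0) -> Optional[bool]:
    """True if the resolved size falls inside the font's opsz axis range."""
    size_pt = font_size_to_pt(value, base_px)
    if size_pt is None:
        return None
    return opsz_min <= size_pt <= opsz_max
```

Sizes that come back as `None` would need a separate bucket in the results, since keyword and computed sizes cannot be resolved from the stylesheet text alone.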

Third, if we segment that list of all the font-variation-settings properties into axis:value pairs, and order them by count, we'd know which axes are most commonly used today, and could offer a "top 10 axes" list (or 20, 50, etc.). My guess is that this is going to be a short list, though: fewer than 100. In that case, we can then take a pass by hand to classify each axis as relevant to 3 sets: expressive typography, text typography, or both. Then, for each set, and for each axis, we can get the parent page URLs, rank by overall site traffic, and take a look at how the axis is being used in practice. We could also segment by font-size if that data is available, so that we know which axes are used 6-20pt, and which are used 20pt+.
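A minimal sketch of the segmentation step, assuming the declaration values of the CSS `font-variation-settings` property have already been extracted from each page's stylesheets as strings. The helper names `axis_pairs` and `top_axes` are made up for illustration.

```python
import re
from collections import Counter

# A quoted 4-character axis tag followed by a number, e.g. "wght" 555
_PAIR = re.compile(r"""["']([^"']{4})["']\s+(-?[0-9]*\.?[0-9]+)""")


def axis_pairs(declaration_value: str):
    """Split one declaration value into (axis, value) pairs."""
    return [(tag, float(num)) for tag, num in _PAIR.findall(declaration_value)]


def top_axes(declaration_values, n: int = 10):
    """Rank axis tags by how many declarations mention them."""
    counts = Counter()
    for value in declaration_values:
        counts.update(tag for tag, _ in axis_pairs(value))
    return counts.most_common(n)
```

The length of `axis_pairs(value)` also gives the "number of axes used in concert" for a single declaration, so the same output could feed a ranking by axis count.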

Fourth, the list of all the font-variation-settings properties can be ranked by number of axes used in concert, and we could chart the ratios of expressive:informative:both axes used together; again, perhaps segmented by size.

For example we're hoping to get a list of things that represent the state of fonts:

  • web vs system

I think "web vs system" and "local vs hosted" are the same concept?

  • top fonts

Just having a rank of the web's most popularly used fonts is interesting; I'd also be interested to see the list segmented by providers - top fonts chosen from the libraries of Google Fonts, AdobeFonts/TypeKit, Cloud.Typography, FontStand, etc - and also the top self-hosted fonts (so subtracting all the mentioned services.)

  • adoption of optimization techniques

I guess this is the next 2 points:

  • Font formats

There haven't been any new "web font" formats in a long time. What I think may be interesting to see is that SVG and even EOT and raw TTF are going away, replaced by sites supplying only WOFF and WOFF2 formats.

OpenType 1.7 introduced 4 different color font formats, and the colorfonts.wtf site is a great explainer on the details. Seeing a graph of how many sites are using fonts with each of these color tables inside could be interesting - it would mean similarly loading the font binaries to read their table data, although in this case just seeing if certain tables exist.
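Checking whether certain tables exist only requires reading the table directory at the start of the font file. The sketch below does that for uncompressed TTF/OTF data. It does not handle WOFF or WOFF2 wrappers, which would need to be unpacked first, and the choice of COLR, SVG, sbix and CBDT as the four marker tables is my reading of the formats mentioned above.

```python
import struct
from typing import Set

# One marker table per color format; note that "SVG " is padded with a space.
COLOR_TABLES = {"COLR", "SVG ", "sbix", "CBDT"}


def sfnt_table_tags(data: bytes) -> Set[str]:
    """Return the table tags listed in an sfnt (TTF/OTF) table directory."""
    if len(data) < 12:
        raise ValueError("too short to be an sfnt font")
    # Header: uint32 version, uint16 numTables, then three uint16 search fields.
    num_tables = struct.unpack(">H", data[4:6])[0]
    tags = set()
    for i in range(num_tables):
        start = 12 + 16 * i  # each record: tag, checksum, offset, length
        record = data[start:start + 16]
        if len(record) < 16:
            break  # truncated file; keep what was read so far
        tags.add(record[:4].decode("latin-1"))
    return tags


def color_tables(data: bytes) -> Set[str]:
    """Return which color-font marker tables are present in the font."""
    return sfnt_table_tags(data) & COLOR_TABLES
```

Because only the first few hundred bytes are inspected, this kind of check stays cheap even across a large corpus of font binaries.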

  • Font-Display usage

Font-Display is great, and it would be very interesting to see which values are being set for that property over time :)
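One way to tally the values, sketched under the assumption that each page's CSS is available as text. `font_display_counts` is a hypothetical helper, and the regular expressions are a simplification rather than a real CSS parser.

```python
import re
from collections import Counter

_FONT_FACE = re.compile(r"@font-face\s*\{([^}]*)\}", re.IGNORECASE)
_FONT_DISPLAY = re.compile(r"font-display\s*:\s*([a-z]+)", re.IGNORECASE)


def font_display_counts(css: str) -> Counter:
    """Count font-display values across the @font-face blocks in a stylesheet."""
    counts = Counter()
    for block in _FONT_FACE.findall(css):
        match = _FONT_DISPLAY.search(block)
        # Blocks without the descriptor are tallied separately.
        counts[match.group(1).lower() if match else "(unset)"] += 1
    return counts
```

Summing these counters per crawl date would give the over-time view of swap, fallback, optional, block and auto.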

Are there other optimization techniques? Perhaps, placing a single Google Fonts <link> element as the very first element within <head>, as demonstrated but not explicitly recommended on the Google Fonts Getting Started page?

@paulcalvano,

  • the number of custom fonts per page

That also could be quite interesting - especially the tip of the head of that graph... The web's most fontastic page :D

  • Font preloading

This would also be great to include in the survey of the state of optimization techniques adoption.

rviscomi commented 5 years ago

Epic. Thanks @davelab6!

davelab6 commented 5 years ago

I see @rviscomi added a few more ideas from @zachleat to the top list of metrics. Some comments :)

  • how many fonts are loaded but also how many type-faces (families) are used

Yes, I think that will be interesting; I suspect a lot of people link to fonts in their HTML, but don't actually use them in their CSS (as it were.) However, I believe that only old bad MSIE versions would actually download unused font data, lol

  • Related, group by weight/style: how many people use italics? those are often left off

I suspect that sadly the number of people who do not use any Italics, when they are available, is rather high - perhaps on latency concerns - or perhaps because they see <i> (and even <b>) working "as expected" but actually getting auto-slanted and auto-bolded versions of their regular roman style.

  • Font formats (how many people are still using Bulletproof font face syntax?; WOFF2 use specifically)

There were so many versions of Bulletproof syntax; I wonder whether it might be interesting to catalog a few styles of Bulletproof that are out there; the one with the smiley unicode and such.

As I mentioned, I know nowadays that some people like to just do WOFF and WOFF2 only.

  • Icon fonts (not sure how to measure this, might show up if we measure popular families?)

I'm pretty sure FontAwesome has AWESOME popularity and will show up that way. I expect Material Design Icons may also rank high enough to show up that way. A whitelist of known icon font family names would not be too hard to come up with; I'm happy to run down a sheet of even a few thousand @font-face font-family property values and categorize them by hand.

  • CSS Font Loading API use?

Yes, would be great to see how font preloading is being adopted.

  • unicode-range use (and range size, perhaps to glean some info on subsetting)

YES! Range size would be very interesting.

I would be very keen to see some kind of heatmap of how often unicode characters are included in unicode-ranges; the size of unicode can be quite challenging to get one's head around - e.g., https://ian-albert.com/unicode_chart/ has a "complete" Unicode chart as a 100 MB TIFF (22,017 × 42,807 pixels :)
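To make "range size" concrete, here is a rough sketch that expands a unicode-range descriptor value into the set of code points it covers. It handles single code points, intervals and `?` wildcards, and it does not validate the value as strictly as a browser would. The function name is made up for illustration.

```python
import re

_RANGE = re.compile(r"^U\+([0-9A-F?]{1,6})(?:-([0-9A-F]{1,6}))?$", re.IGNORECASE)


def unicode_range_points(value: str) -> set:
    """Expand a unicode-range value into the set of code points it covers."""
    points = set()
    for part in value.split(","):
        match = _RANGE.match(part.strip())
        if match is None:
            continue  # skip anything that is not a recognizable range token
        first, last = match.group(1), match.group(2)
        if "?" in first:
            # Wildcard: U+4?? covers U+400 through U+4FF.
            low = int(first.replace("?", "0"), 16)
            high = int(first.replace("?", "F"), 16)
        else:
            low = int(first, 16)
            high = int(last, 16) if last else low
        points.update(range(low, min(high, 0x10FFFF) + 1))
    return points
```

`len(unicode_range_points(value))` is the range size for one @font-face rule, and feeding the sets from every rule into a `collections.Counter` would give the per-code-point frequencies a heatmap needs.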


  • uses preconnect for web font cdn? popular preconnect domains?

Very interesting!

  • Use of local() in src

Also very interesting!
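For the `local()` and format questions, a small sketch that summarizes the `src` descriptor of each @font-face rule, again assuming the CSS text is available. `src_summary` is a hypothetical name, and the regex approach will miss edge cases that a full CSS parser would handle.

```python
import re

_FONT_FACE = re.compile(r"@font-face\s*\{([^}]*)\}", re.IGNORECASE)
_LOCAL = re.compile(r"""local\(\s*["']?([^"')]+?)["']?\s*\)""", re.IGNORECASE)
_FORMAT = re.compile(r"""format\(\s*["']?([^"')]+?)["']?\s*\)""", re.IGNORECASE)


def src_summary(css: str):
    """List the local() names and format() hints in each @font-face block."""
    summary = []
    for block in _FONT_FACE.findall(css):
        summary.append({
            "local": _LOCAL.findall(block),
            "formats": [fmt.lower() for fmt in _FORMAT.findall(block)],
        })
    return summary
```

Rules whose `formats` list contains only woff2 and woff would count toward the "WOFF and WOFF2 only" group, and a non-empty `local` list marks use of `local()`.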

Again, I think for all these, segmenting the web fonts services from self-hosted folks doing their own thing would be interesting. Google Fonts has been using unicode-range extensively for years and it's the underlying/enabling technology for Google Fonts' CJK web fonts, and I am very curious what other people do; whereas the nature of Google Fonts' architecture, with the API on googleapis.com and the TTFs on gstatic.com, means preconnect isn't possible.

davelab6 commented 5 years ago

@rviscomi I've combed through what I posted and added a more succinct list of the metrics I proposed to the top of this issue. Please edit further as needed :)

davelab6 commented 5 years ago

For font-size, we would need to resolve them all to points

I actually don't think I've ever seen that stat anywhere - a bar chart showing the most popular pt sizes on the web. That maybe belongs in another chapter?

davelab6 commented 5 years ago

@zachleat thanks for those ideas! Sorry I initially missed you were the author of the points Rick added to the top comment =)

rviscomi commented 5 years ago

Adding @AymenLoukil as a reviewer!