HTTPArchive / almanac.httparchive.org

HTTP Archive's annual "State of the Web" report made by the web community
https://almanac.httparchive.org
Apache License 2.0

Fonts 2020 #902

Closed foxdavidj closed 3 years ago

foxdavidj commented 4 years ago

Part I Chapter 4: Fonts

Content team

Authors Reviewers Analysts Draft Queries Results
@raphlinus @malchata @RoelN @notwaldorf @mandymichael @svgeesus @rsheeter @jpamental @davelab6 @AbbyTsai Doc *.sql Sheet

Content team lead: @jpamental

Welcome chapter contributors! You'll be using this issue throughout the chapter lifecycle to coordinate on the content planning, analysis, and writing stages.

The content team is made up of the following contributors:

New contributors: If you're interested in joining the content team for this chapter, just leave a comment below and the content team lead will loop you in.

Note: To ensure that you get notifications when tagged, you must be "watching" this repository.

Milestones

0. Form the content team

1. Plan content

2. Gather data

3. Validate results

4. Draft content

5. Publication

RoelN commented 4 years ago

Yes, I think it's possible to recognise 99% of FontAwesome fonts being used (the other 1% would be people who optimised/minified/broke their font so much it doesn't contain any metadata or other fingerprints). You could make similar fingerprints for Icomoon, Fontello, Iconic, etc. But you might miss custom implementations of any font that you don't have a fingerprint of. But that might be acceptable?

I wouldn't know how to detect subsetting, apart from counting glyphs/characters in a font and then matching them to how many characters the original font has. But how do you build a database of all the "originals" of all fonts used on the web? You could detect subset usage of subset= and text= of Google Fonts, but that won't work for other services or self hosted fonts.
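The Google Fonts case mentioned above can be sketched as a check for the `subset=` and `text=` query parameters on stylesheet requests. This is only an illustration: the function name and return shape are hypothetical, and it assumes the request goes to the `fonts.googleapis.com` CSS host. As noted, it says nothing about other services or self-hosted fonts.

```python
from urllib.parse import urlparse, parse_qs

# Assumption: only the Google Fonts CSS API host is checked here.
GOOGLE_FONTS_CSS_HOST = "fonts.googleapis.com"


def google_fonts_subsetting(url):
    """Return the subsetting parameters used by a Google Fonts CSS URL.

    Returns a dict with the raw `subset` and `text` values (None when a
    parameter is absent), or None if the URL is not a Google Fonts CSS
    request at all.
    """
    parsed = urlparse(url)
    if parsed.hostname != GOOGLE_FONTS_CSS_HOST:
        return None
    query = parse_qs(parsed.query)
    return {
        "subset": query.get("subset", [None])[0],
        "text": query.get("text", [None])[0],
    }
```

A page would count as using Google Fonts subsetting if at least one of its stylesheet URLs returns a dict with a non-None value.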

AbbyTsai commented 4 years ago

Good to have your point of view, thank you. Here is my understanding so far of where to look for these fonts; clarifications are welcome.
/icon fonts: GoogleFont(icon) + FontAwesome + optimised/minified/broken (hostname)
/subsetting: GoogleFont(subset)
/vf + animation: GoogleFont(keyframes/font-variation-settings), OpenType('gvar', 'fvar.axisTag')

Then, regarding OpenType, I still don't have a clear idea of how to json_extract the axis file size (familyName, Axis, AvgSize, SizeFactor, as in this table) or features such as numerals, ligatures, etc. Could anyone point out a roadmap for collecting them? Thanks. ^^
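For the OpenType side, one way to see whether a font carries `fvar` or `gvar` is to read the table directory at the start of the file. The sketch below is not the chapter's query (which works on HTTP Archive data in SQL); it is a standalone illustration that assumes an uncompressed sfnt file (TTF/OTF), and the function names are made up for this example. WOFF and WOFF2 use a different container header and are not handled.

```python
import struct


def sfnt_table_tags(data):
    """List the table tags in a raw TrueType/OpenType (sfnt) font file.

    Assumption: `data` holds an uncompressed sfnt font. WOFF and WOFF2
    files use a different container header and are not handled here.
    """
    if len(data) < 12:
        raise ValueError("too short to be an sfnt font")
    # Offset table: sfntVersion (uint32), numTables (uint16), followed by
    # three uint16 binary-search fields that are not needed here.
    _version, num_tables = struct.unpack(">IH", data[:6])
    tags = []
    for i in range(num_tables):
        start = 12 + 16 * i  # each table record is 16 bytes
        record = data[start:start + 16]
        if len(record) < 16:
            raise ValueError("truncated table directory")
        tag, _checksum, _offset, _length = struct.unpack(">4sIII", record)
        tags.append(tag.decode("latin-1"))
    return tags


def is_variable_font(data):
    """A font that contains an 'fvar' table declares variation axes."""
    return "fvar" in sfnt_table_tags(data)
```

Reading the axis tags themselves would mean parsing the `fvar` table body, which this sketch leaves out.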

AbbyTsai commented 4 years ago

Hey everyone, here's my first metric.sql for you all to review and comment on. Feel free to rewrite the queries, as you may find plenty in them that is unprofessional. ^^
Thank you.

AbbyTsai commented 4 years ago

@jpamental @davelab6 Could you say more precisely which key axes of file size with OpenType would be interesting to measure, as you mentioned in the outline? I haven't yet found a way to extract them. Also, feel free to give feedback if any metric doesn't meet the expectations of the outline. Thanks.

web-almanac-analysts

Could you give me a hand with the query above? I'm new to JS and would appreciate your guidance. Thanks.

davelab6 commented 4 years ago

I propose we schedule a call on this in the next few days to sync up (Jason & Abby)

AbbyTsai commented 4 years ago

Awesome, I'm happy to join the meeting. Let me know your planned time.

foxdavidj commented 4 years ago

I've updated the chapter metadata at the top of this issue to link to the public spreadsheet that will be used for this chapter's query results. The sheet serves 3 purposes:

  1. Enable authors/reviewers to analyze the results for each metric without running the queries themselves
  2. Generate data visualizations to be embedded in the chapter
  3. Serve as a public audit trail of this chapter's data collection/analysis, linked from the chapter footer

LeaVerou commented 4 years ago

Hi there again,

I was wondering if there are any plans to calculate stats for font stacks and popular font names via the font/font-family properties? Looking at the content outline, I couldn't figure out whether this is planned or not (there is something about popular fonts, but not sure if it refers to web fonts only).

I'm asking to see if https://github.com/LeaVerou/css-almanac/issues/15 overlaps (in which case we have one less query to do, yay)

AbbyTsai commented 4 years ago

Hoping 04.11 typeface by country would match our taste.

svgeesus commented 4 years ago

I notice

Formats in use (Woff2, etc) Formats used by percentage?

I'm particularly interested in this: I recently asserted in a discussion that most Webfonts in 2020 are served as WOFF2 (due to widespread browser adoption, ease of conversion, and smallest filesize) and saw a counter-assertion that most are TTF/OTF because "people just throw a font on a server". So I'm very interested to see what the results are here.

Yes, I have seen the 2019 results; but that breakdown is affected by 75% of the results coming from Google fonts. So a differentiated breakdown would be super helpful.
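A differentiated breakdown of this kind could be computed by splitting font requests into those served from `fonts.gstatic.com` and everything else before counting formats. The following is a rough sketch only: the function name and the `(url, format)` input shape are hypothetical and do not reflect the actual HTTP Archive schema or the chapter's SQL.

```python
from collections import Counter
from urllib.parse import urlparse


def format_share_by_host_group(requests, google_host="fonts.gstatic.com"):
    """Percentage of font requests per format, split by host group.

    `requests` is an iterable of (url, format) pairs. This input shape is
    an assumption made for illustration only.
    """
    counts = {"google": Counter(), "other": Counter()}
    for url, fmt in requests:
        group = "google" if urlparse(url).hostname == google_host else "other"
        counts[group][fmt.lower()] += 1

    shares = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        # An empty group yields an empty dict rather than dividing by zero.
        shares[group] = (
            {fmt: 100 * n / total for fmt, n in counter.items()} if total else {}
        )
    return shares
```

Comparing the "other" group against the combined figures would show how much of the overall WOFF2 share comes from Google Fonts alone.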

svgeesus commented 4 years ago

Oh, another thing. The 2019 almanac has a section, "Don't request web fonts if a system font exists", which used to be good advice and still is for majority and well-supported languages.

People who speak minority, hard-to-support or unsupported languages though, are used to installing at least one font that supports their language. And that has worked fine for them, except recently Safari stopped allowing local fonts to be used (the reason being to guard against privacy sniffing). That means those users can no longer use Safari (or iOS devices, where WebKit is the only browser engine). Yes, that means Web pages that worked okay for a decade or two suddenly broke for those users.

So the 2020 guide might need to be more nuanced on that topic.

svgeesus commented 4 years ago

Yes, I have seen the 2019 results; but that breakdown is affected by 75% of the results coming from Google fonts. So a differentiated breakdown would be super helpful.

I now see the spreadsheet and wow, totally different results from 2019 and all sorts of junk MIME types, typos, and so on! Lots to dig into there.

The chart in that spreadsheet worries me, shouldn't the columns add up to 100% or do I misunderstand?

(Yes I know these are preliminary results)

tunetheweb commented 4 years ago

The chart in that spreadsheet worries me, shouldn't the columns add up to 100% or do I misunderstand?

Am sure @AbbyTsai can comment further here if needs be, but the query counts the percentage of pages loading that font type. So 76% of desktop sites load a WOFF2 font and 48% of sites load a WOFF font. This means that somewhere between 24% and 48% of sites load WOFF2 and WOFF fonts on the same page, as you can't have more than 100% as you say.

You'd presume those are for two different fonts, though I have seen instances where the same font was requested in both formats due to misconfiguration!
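The 24% to 48% range quoted above follows from simple bounds on the overlap of two groups: the overlap is at least the amount by which the two shares exceed 100%, and at most the smaller of the two shares. A small sketch (the function name is illustrative):

```python
def overlap_bounds(pct_a, pct_b):
    """Bounds on the share of pages in both groups.

    `pct_a` and `pct_b` are the percentages of pages in each group.
    Returns (lower, upper) for the percentage of pages in both.
    """
    lower = max(0.0, pct_a + pct_b - 100.0)  # the excess over 100% must overlap
    upper = min(pct_a, pct_b)  # the overlap cannot exceed the smaller group
    return lower, upper


# WOFF2 on 76% of desktop pages, WOFF on 48%:
print(overlap_bounds(76.0, 48.0))  # (24.0, 48.0)
```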

foxdavidj commented 4 years ago

@jpamental in case you missed it, we've adjusted the milestones to push the launch date back from November 9 to December 9. This gives all chapters exactly 7 weeks from now to wrap up the analysis, write a draft, get it reviewed, and submit it for publication. So the next milestone will be to complete the first draft by November 12.

However if you're still on schedule to be done by the original November 9 launch date we want you to know that this change doesn't mean your hard work was wasted, and that you'll get the privilege of being part of our "Early Access" launch.

Please see the link above for more info and reach out to @rviscomi or me if you have any questions or concerns about the timeline. We hope this change gives you a bit more breathing room to finish the chapter comfortably and we're excited to see it go live!

davelab6 commented 4 years ago

Thank you David, that's great news for me ❤️

AbbyTsai commented 4 years ago

Hello Jason and Dave (@jpamental @davelab6), roughly speaking, the result sheet is almost there, thanks to collaboration with Rick, Paul, and Barry. Thanks to the content contributors and query reviewers; any comments are welcome.

svgeesus commented 4 years ago

The result that fonts.gstatic.com is the most popular font host in China should be checked, because many Google sites are blocked in China, although gstatic does not appear on the List of websites blocked in mainland China.

The reason is important, because "WebFonts are rarely used in China because the main free font website is unavailable" is a very different story from "WebFonts are rarely used in China because they are huge" or "WebFonts are rarely used in China because the license is too restrictive".

svgeesus commented 4 years ago

Comparing the MIME types and the @font-face format strings, I see that woff2 and woff broadly correspond (75% and 9%) while SVG in format is much higher (6%) than as MIME type (<0.01%). Odd.

svgeesus commented 4 years ago

The font table frequencies are hard to interpret, because the highest possible frequency is 6.97% (for all the mandatory tables). Maybe add another column which is the percentage of fonts containing that table (so head, cmap etc would go to 100%) which allows comparing the frequency of use of the lesser used tables? So for example CFF desktop would be 2,775,483 * 100 / 27,983,775 = 9.92% of all fonts, rather than 0.81% of all tables.
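The proposed extra column is just the count of fonts containing a table divided by the total number of fonts. A minimal sketch using the CFF desktop figures quoted above (the function name is illustrative):

```python
def pct_of_fonts(fonts_with_table, total_fonts):
    """Percentage of all fonts that contain a given table."""
    return 100 * fonts_with_table / total_fonts


# CFF on desktop, using the counts quoted in the comment above.
print(round(pct_of_fonts(2_775_483, 27_983_775), 2))  # 9.92
```

With this denominator, mandatory tables such as head and cmap come out at 100%, which makes the less common tables directly comparable.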

davelab6 commented 4 years ago

Gstatic isn't blocked officially/nationally, but since the GFW is administered regionally, and no servers are within the PRC, it is often unreliable.

svgeesus commented 4 years ago

Thanks for the clarification, @davelab6

AbbyTsai commented 4 years ago

Note that the result sheet now includes a table showing the percentage of mandatory tables in OpenType, following Chris's good suggestion. Thanks.

svgeesus commented 3 years ago

@jpamental @davelab6 is the content being developed in another repo? I had a look at the document to start reviewing, but it seems empty except for the early outline.

davelab6 commented 3 years ago

Jason and I worked on the doc outline, folding in all comments inline, and then it was Thanksgiving and while we both intended to contribute over the holidays, we did not. I'm on a final week of paternity leave but I'll aim to squeeze in an hour or two every night this week to get a fleshed out but unedited article ready for launch.

svgeesus commented 3 years ago

Not trying to rush you, just trying to contribute. I could also help with authoring, as well as reviewing, if you would like.

davelab6 commented 3 years ago

Thank you Chris! Please dive in :)

davelab6 commented 3 years ago

Quick update for everyone following here - I sadly failed to deliver in the last few days, but very graciously @raphlinus has stepped up and will be filling up that draft doc in the next day or two, and I aim to review on Sunday/Monday, such that the first draft can go to Rick and team on Tuesday.

svgeesus commented 3 years ago

I reviewed the PR by @raphlinus which is overall very good. I suggested some additions and corrections.