HTTPArchive / almanac.httparchive.org

HTTP Archive's annual "State of the Web" report made by the web community
https://almanac.httparchive.org
Apache License 2.0

Improve our first contentful paint time #638

Closed tunetheweb closed 4 years ago

tunetheweb commented 4 years ago

Pulling this out of #634 so we can leave that for discussion of the specific issue of static site generation and to have a more holistic look at improving FCP (first contentful paint).

@rviscomi ran some great analysis in https://github.com/HTTPArchive/almanac.httparchive.org/issues/634#issue-550387090 showing that our FCP has room for improvement

[chart: 30-day CrUX fast FCP trend for the Home, CSS, JS, and ToC pages]

This chart tracks the 30-day trend of CrUX data for four of our pages: Home, CSS, JS, and Table of Contents. The ToC page has the highest % of fast FCP on mobile, at ~80%, although it's been dropping over time. The CSS page, which is quite large in terms of content and assets, has the fewest fast FCP experiences, at about 40%.

Now, personally I think the site is super fast, but I'm aware that might be biased by my experience (modern high-end devices and a decent network). Plus it's always fun to do some performance analysis and learn some new things (I really need to get a hobby!), so let's dig in.

As Rick's stats are based on Chrome, we'll concentrate on that (iOS is a whole different ball game for various reasons that I won't go into now!). So, running a WebPageTest on a OnePlus 5 on Chrome over 3G, we get the following:

[WebPageTest filmstrip and summary for the CSS chapter page]

In the filmstrip we can see that the page starts rendering at 2.2 seconds with the styles and fallback fonts (thanks to loading the Google Fonts locally with font-display: swap). It also starts loading the hero image (hero_lg.webp), which pretty much finishes at 2.4 seconds; just after that the first of the fonts kicks in at 2.5 seconds, and everything is done by 3.1 seconds.

  1. So the first potential gain that might impact FCP is that we could move to progressive JPEGs to get more of that hero image rendering quicker. That would involve moving away from WebP (added in #603) as WebP can't be rendered progressively. The WebP files are slightly smaller, but not by a massive amount.

  2. The second potential gain is to support a medium size (hero_md) for mobiles with high-resolution screens (in either JPEG or WebP), as hero_lg may be too much. There is a good size difference between hero_lg (~62kb) and hero_sm (~15kb), but also a quality difference, so we wouldn't want to go all the way down to hero_sm, hence the suggestion to add a hero_md (a rough generation sketch follows this list).
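For reference, something like the sketch below could generate a progressive JPEG hero_md as part of a build step. This is just a minimal sketch using Pillow; the filenames, the 800px target width, and the quality setting are assumptions for illustration, not what we currently ship.

```python
# Sketch only: generate a medium-width, progressive JPEG hero image with Pillow.
# The source/output filenames, 800px target width, and quality are assumptions.
from PIL import Image

def make_progressive_hero(src="hero_lg.jpg", dst="hero_md.jpg", width=800, quality=80):
    img = Image.open(src).convert("RGB")
    # Scale down proportionally to the target width.
    height = round(img.height * width / img.width)
    img = img.resize((width, height), Image.LANCZOS)
    # progressive=True is what lets the browser paint a rough version of the image early.
    img.save(dst, "JPEG", quality=quality, progressive=True, optimize=True)

if __name__ == "__main__":
    make_progressive_hero()
```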

Moving on to the waterfall we get:

[WebPageTest waterfall for the CSS chapter page]

We see a large DNS + Connect + SSL time. Not much we can do here I'm afraid. It uses CBC ciphers rather than ECDSA, so there are potentially some gains there in security and performance, but I'm not sure what control (if any) we have of that while on Google App Engine infrastructure, and I think the gains are minimal to be honest.

Then we download the HTML. There is a 0.4 second TTFB until it starts returning, so there may be benefits to not dynamically generating this, as suggested in #634, but I'm really not convinced you're going to get that down by a significant amount. As I said in https://github.com/HTTPArchive/almanac.httparchive.org/issues/634#issuecomment-574905678 I don't think being dynamically generated is holding us up performance-wise.

Additionally we're using gzip and not brotli for text-based compression, but that's because of Google App Engine support.

  3. Third potential gain: investigate an alternative hosting provider to support Brotli (a quick gzip vs Brotli comparison sketch follows).
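To put some rough numbers on the gzip vs Brotli question, a sketch like the one below could be run against any of our pages. It assumes the third-party brotli package is installed and uses the CSS chapter URL purely as an example:

```python
# Sketch only: compare raw vs gzip vs Brotli sizes for one of our pages.
# Requires the third-party "brotli" package; the URL is just an example page.
import gzip
import urllib.request

import brotli

URL = "https://almanac.httparchive.org/en/2019/css"
html = urllib.request.urlopen(URL).read()

gz = gzip.compress(html, compresslevel=9)
br = brotli.compress(html, quality=11)

print(f"raw:     {len(html):>8,} bytes")
print(f"gzip -9: {len(gz):>8,} bytes")
print(f"br -11:  {len(br):>8,} bytes "
      f"({100 * (len(gz) - len(br)) / len(gz):.1f}% smaller than gzip)")
```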

We can also see the chunked sending of the response in the waterfall (the dark blue bits); that's the default in HTTP/2, which Google App Engine / Google CDN uses (so that answers one of your suggestions from #634, @rviscomi).

We also see the CSS is requested while the HTML is still downloading (hurray for chunked/h2!) and the assets are being requested in an appropriate order: critical CSS, then above-the-fold images, then deferred JavaScript... etc.

  4. Fourth potential gain is to merge our 4 CSS files into fewer files (ideally one), either in the source code or as part of a build step (a possible build-step sketch follows this list). Not sure there are massive gains to be had here as, while multiple requests are not quite as free in HTTP/2 as initially hoped, they are certainly not as expensive as over HTTP/1.1 and seem to be working well here. We could move print.css into 2019.css as there's no real reason to keep that separate other than convention, but it's low priority (note Chrome still downloads this even when not printing, albeit at low priority). Again, not sure there are any real, noticeable gains here.

  5. Fifth potential gain is to inline CSS (though personally I am not a fan of this).
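If we did try merging, the build step could be as simple as the sketch below. The directory, file list, and output name are assumptions (and print.css would need wrapping in @media print before it could be folded in), so treat this as illustrative only:

```python
# Sketch only: concatenate CSS files into a single bundle at build time.
# The directory, file list, and output name are assumptions for illustration.
from pathlib import Path

CSS_DIR = Path("src/static/css")          # assumed location of the stylesheets
SOURCES = ["normalize.css", "2019.css"]   # order matters for the cascade
BUNDLE = CSS_DIR / "bundle.css"

def build_bundle():
    parts = []
    for name in SOURCES:
        css = (CSS_DIR / name).read_text(encoding="utf-8")
        # Keep a marker comment so a rule can still be traced back to its source file.
        parts.append(f"/* --- {name} --- */\n{css.strip()}\n")
    BUNDLE.write_text("\n".join(parts), encoding="utf-8")
    print(f"Wrote {BUNDLE} ({BUNDLE.stat().st_size:,} bytes)")

if __name__ == "__main__":
    build_bundle()
```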

We also see the hero image really is taking a long time to download as discussed above.

Then we move on to fonts and lower-priority images. There's potential to use fewer fonts, but as we're concentrating on FCP we'll ignore that for now. However, one thing I will point out is that we are needlessly downloading normal-weight Poppins, which doesn't happen on the master branch, due to a mismatch between the master and production branches (#632), so the sooner we fix that the better.

  6. Fix #632 to move to 6 fonts instead of 7 in chapters.

After that we could look at things like preload or HTTP/2 push, but I don't think they'd help too much. They'd save one request time, but the download time would still be the same, and it looks like we're requesting the additional resources fairly quickly and in the right order. So I don't think we really should look at this, but I'm throwing it out as an option.

  7. Look at preload or HTTP/2 push - not really recommended (a sketch follows, just for reference).
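For reference only (given the above), if we ever wanted to experiment with preload it could be done from the Flask app with a Link response header, roughly as in the sketch below. The font path is a made-up example:

```python
# Sketch only: emit a preload hint via a Link header on HTML responses.
# The font path is a made-up example; per the discussion above, this is not recommended.
from flask import Flask

app = Flask(__name__)

@app.after_request
def add_preload_header(response):
    if response.mimetype == "text/html":
        response.headers.add(
            "Link",
            "</static/fonts/Poppins-700.woff2>; rel=preload; as=font; crossorigin",
        )
    return response
```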

After the waterfall we see the browser main thread, which is maxed out around the two-second mark (processing that big hero image?) and then again after 2.4 seconds (processing fonts and other images).

Pleasingly enough, our chapter.js seems to execute very quickly, which was a concern of mine (mainly because I wrote it!). It basically replaces figure images with interactive charts, but it has been written to bail out as quickly as possible when it sees it's on mobile, to avoid causing a performance issue, and that seems to be working.

Looking at the main-thread processing breakdown, it seems to be spending all its time in Layout, so again I would guess that's the images and fonts (I'm less familiar with this, so if someone knows how to read this better then let me know).

The same test for the ToC page shows this:

[WebPageTest filmstrip and summary for the ToC page]

And this waterfall:

[WebPageTest waterfall for the ToC page]

Observations:

So the first paint (2.3 seconds) is pretty much the full page, though the Visual Progress chart shows a slight tweak when the body font (Lato-Regular) kicks in.

However, it's not a massive difference from the 2.4/2.5 seconds for the CSS page, so I'm surprised that it's so much better in CrUX. I honestly think the CrUX data is out of date, despite the timelines on your chart, and performance HAS been improved over the change freeze with the changes we did to the fonts and images. @rviscomi, given that the CrUX data for December is only just available, how are you able to see January stats in the above chart? Do you have a way of seeing this before it's loaded into BigQuery?

So after that big write-up, I say we fix item 6 (#632) and leave the others for now, and see if we can get updated CrUX data in the next few weeks.

Still I thought it worthwhile opening this issue to generate discussion!

Thoughts?

catalinred commented 4 years ago

That's a comprehensive and insightful write-up above, @bazzadp 👍

Here are some of my thoughts on this matter, hope you'll find them useful:

tunetheweb commented 4 years ago

Thanks @catalinred !

> We should minify the HTML and the CSS output

Never been a fan of this, to be honest, but that's because I look at and support production code too much and am not a proper developer 😊 I also think the gains are questionable once gzip/Brotli is applied (unless you have truly massive HTML and CSS, in which case you have bigger issues!).
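To put a rough number on that, something like the sketch below would do: it gzips the stylesheet before and after a deliberately crude regex "minifier" (a stand-in for a real tool), with the CSS path being an assumed example:

```python
# Sketch only: estimate how much CSS minification buys us once gzip is applied.
# The regex "minifier" is deliberately crude and the CSS path is an assumed example.
import gzip
import re
from pathlib import Path

def crude_minify(css: str) -> str:
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)  # strip comments
    css = re.sub(r"\s+", " ", css)                   # collapse whitespace
    return re.sub(r"\s*([{}:;,])\s*", r"\1", css)    # tighten around punctuation

css = Path("src/static/css/2019.css").read_text(encoding="utf-8")
mini = crude_minify(css)

for label, text in (("original", css), ("minified", mini)):
    raw = len(text.encode("utf-8"))
    gz = len(gzip.compress(text.encode("utf-8"), compresslevel=9))
    print(f"{label}: {raw:,} bytes raw, {gz:,} bytes gzipped")
```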

> Also, when it comes to CSS, I'd love to be able to use SASS in the future to organize and nest all the rules, be able to comment out things and to know they will be stripped out at build, group stylings per components, etc.

If that makes it easier then we should look at it. Though, again, I like to be able to easily map the prod site back to source, though I understand there are more benefits here than just minifying. On the flip side, I'd also like to learn SASS more, as I've not used it much.

> Merge and/or reorganize existing CSS files where possible. Decide on if we really need e.g. the normalize.css - it's a great resource but I'm not sure we really need these extra KB and one extra HTTP request as we're not using all those CSS resets.

Yeah. I don't think it's too bad and, as mentioned above, HTTP/2 does make multiple resources less of an issue (though not totally free), but it's a good question - especially for normalize.css.

> +1 for the progressive JPEGs, I wanted to propose that at one point but then I saw #603 and decided against raising an issue on this.

As I say, the downside is losing WebP, but the WebP files are not that much smaller. For the HTTP/2 chapter, for example, we have the following:

[size comparison of the HTTP/2 chapter's hero images not reproduced here]

So WebP is not really saving us that much, so if we reverted to JPEG to allow progressive images then I think that would be better.

The above also shows the big difference between _sm and _lg, and we use _lg on mobile for decent screen sizes when an _md one would probably be smaller and sufficient. Maybe even _sm would do fine, even on retina-style mobile screens? Will do some testing...

> I appreciate all the work on #607 but, in my opinion, a good idea in the future would be to continue to lessen the number of custom fonts used. The native/system font stack works and looks just fine for most OSes. We could use a max of two fonts: one for headings and the other for the body.

Yeah, I particularly question the value of the bold and italic versions of Lato used in the text. Would the system-generated versions of those ("faux fonts", apparently), synthesised from the base Lato font, do, given they are really only used on the odd word or two here? Again, we should test that they still stand out enough on common browsers. Bold Poppins is used for headings so maybe keep that, but if we fix #632 then we don't need normal Poppins, so that's still only one download of Poppins, bringing us down to two fonts.

tunetheweb commented 4 years ago

> Yeah, I particularly question the value of the bold and italic versions of Lato used in the text. Would the system-generated versions of those ("faux fonts", apparently), synthesised from the base Lato font, do, given they are really only used on the odd word or two here? Again, we should test that they still stand out enough on common browsers.

I removed all but the base Lato font from 2019.css, so the browser would have to synthesise bold, italic, and bold italic (note: I left Poppins alone), and then I compared it to prod. Below is what we see:

[screenshot: real vs synthesised Lato in a chapter heading and intro text]

The heading (Blend Modes) is in bold, and the synthesised bold is difficult to distinguish from the real bold IMHO. Maybe a slightly lighter weight on Chrome when comparing side by side?

The italic at the beginning of the third line is a bit more noticeable (the synthesised italic runs into the next word more).

However in other places it's a bit more noticeable:

[screenshot: real vs synthesised fonts comparison]

[screenshot: real vs synthesised fonts comparison]

The real fonts (top) definitely look nicer in the table, and the synthesised italics are too slanted. The synthesised text is also bigger.

Switching to mobile (iPhone XS) we have the following:

[screenshot: mobile comparison of real vs synthesised fonts]

Again, it's not really noticeable, at least for bold headings, but the slightly larger italic font has caused the figure caption to expand to three lines. Looking at the other image, it's similar - of the two I prefer the real italics (though I'm not sure I'd notice this wasn't real if they weren't side by side).

[screenshot: another mobile comparison of real vs synthesised fonts]

All in all, the real fonts do make a difference (particularly for italics). Is it big enough to warrant downloading six of them? That's debatable, but I'm sure people with more of an eye for design would disagree.

So I think we could optimise this if we really wanted to, but now I've seen the real italics, I'm not sure I can unsee them. We do want the website to look beautiful, and not just optimise the heck out of it to the detriment of design.

My vote is to stick with the fonts after all, despite my earlier statements (though perhaps remove bold?).

tunetheweb commented 4 years ago
> 2. The second potential gain is to support a medium size (hero_md) for mobiles with high-resolution screens (in either JPEG or WebP), as hero_lg may be too much. There is a good size difference between hero_lg (~62kb) and hero_sm (~15kb), but also a quality difference, so we wouldn't want to go all the way down to hero_sm, hence the suggestion to add a hero_md.

Next I did a similar analysis to the fonts, but just on mobile this time. What would be the impact of using the hero_sm hero image instead of hero_lg on my iPhone XS? That's got a high-quality screen (although it's not an XS Max phablet), so it should be the type of screen where you'd notice a difference.

[screenshot: hero_lg vs hero_sm comparison on an iPhone XS]

For a start, this image shows the font-style issue from #632 where master is out of sync with production, so subheadings are not bolded (and the synthesised bold-italic font is still being used on the right). (Edit: since fixed.)

Anyway, the image on the right is definitely a little fuzzier. Again, we could possibly get away with it, but I think we're better off sticking with the hero_lg image, or creating an intermediate hero_md one.

However, given that one of these images is 68KB and the other is 16KB, there is definitely scope for most of it to be delivered much earlier with progressive JPEGs, and for the user to get most of the impact (and FCP) much earlier.

Personally, I think having 3 different sizes (sm, md, lg) of the hero image, times 2 different formats (JPEG, WebP), plus the original large image (= 7 hero images per chapter) is taking things a bit too far. I suggest we stick with 2 different sizes, go back to JPEG and ditch WebP, but make them progressive (= 3 images per chapter). It wouldn't save any bytes downloaded, but it would "appear" to download faster.

IOIIOOIO commented 4 years ago

@bazzadp

Hope you don't mind me chiming in here... I was reading the thread out of interest, but to answer you here with regards to SASS:

> Though again I like to be able to easily map the prod site back to source,

You can look into having source maps generated during the build process.

tunetheweb commented 4 years ago

> Hope you don't mind me chiming in here...

Chime away - the more the merrier!

> I was reading the thread out of interest, but to answer you here with regards to SASS:

>> Though again I like to be able to easily map the prod site back to source,

> You can look into having source maps generated during the build process.

Yeah, they mostly work, and we would use them if we do go down this route, but there's nothing quite like mapping code directly to source! 😀
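For what it's worth, if we do go the SASS route, the libsass Python bindings can emit the compiled CSS and its source map in a single call. A minimal sketch, with made-up input/output paths:

```python
# Sketch only: compile SCSS to compressed CSS plus a source map with libsass (pip install libsass).
# The input and output paths are made-up examples.
from pathlib import Path

import sass

css, source_map = sass.compile(
    filename="src/static/scss/2019.scss",
    output_style="compressed",
    source_map_filename="2019.css.map",  # name written into the sourceMappingURL comment
)

Path("src/static/css/2019.css").write_text(css, encoding="utf-8")
Path("src/static/css/2019.css.map").write_text(source_map, encoding="utf-8")
```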

tunetheweb commented 4 years ago

I'm going to close this. I think the site has been proven to be reasonably performant, especially with the caching (though the first visitor every 3 hours does take a little hit). We could improve it further, perhaps with some of the suggestions from https://github.com/HTTPArchive/almanac.httparchive.org/issues/638#issue-550829573 above, but I don't think any of them are really that urgent and we have better things to work on, so I'm happy to close.