HTTPArchive / almanac.httparchive.org

HTTP Archive's annual "State of the Web" report made by the web community
https://almanac.httparchive.org
Apache License 2.0
622 stars 182 forks source link

Audit the website for performance issues #413

Closed rviscomi closed 4 years ago

rviscomi commented 5 years ago

Related: #286 and #303. Make sure we're not doing anything silly to hurt performance.

I'm aware that embedding 25 Sheets iframes is probably not a great idea for performance. We can look into preconnecting to the origin and lazy loading the iframes.

The audit should ideally happen before launch just so we know if we're doing anything egregious. Any P0 performance issues should be fixed prior to launch (within 24 hours). Everything else can be resolved after launch (apres ski).

cc performance experts @andydavies @paulcalvano


Performance issues

tunetheweb commented 5 years ago

I'm aware that embedding 25 Sheets iframes is probably not a great idea for performance. We can look into preconnecting to the origin and lazy loading the iframes.

Just noting that lazy loading images (and possibly iframes) is being discussed in https://github.com/HTTPArchive/almanac.httparchive.org/issues/351

rviscomi commented 5 years ago

Yes I definitely think #351 would be helpful.

This issue can tell us how significant the perf impact of not lazy loading is.

tunetheweb commented 5 years ago

Any ideas why Lato font loads so many times?:

image

rviscomi commented 5 years ago

In base.html we load the fonts here: <link href="https://fonts.googleapis.com/css?family=Lato:400,400i,700,700i, 900|Poppins:300,400,700,900&display=swap" rel="stylesheet">.

I think it's loading one font file for each weight/style combo.

rviscomi commented 5 years ago

The home page background image is very large and provides contrast to the white text at the top of the page. Because it's so large, it takes a few seconds to load for the first time, so the text is invisible, except for the yellow buttons:

image

https://www.webpagetest.org/video/compare.php?tests=191110_J7_f4bc44580c3abfae6451fa9b72659012,191110_MK_600ab0b42469c8fb2a03fdbf34132dd6,191110_HG_49bdb23fcbc8952c67d75e0c08a6b198,191110_8T_02272f25cf4fcf6bc73f58e1ea8f2e1e,191110_M6_ec2abd0fd85a56c09523e42337d706e0

We should use a CSS-based background color as a placeholder until the image loads, and/or look into optimizing that background image so it doesn't load as slowly.

rviscomi commented 5 years ago

Images used in the footer and mobile design are loading immediately. Below-fold images should be lazy loaded and mobile-specific images should never load for desktop.

image https://www.webpagetest.org/result/191110_MK_600ab0b42469c8fb2a03fdbf34132dd6/1/details/#waterfall_view_step1

rviscomi commented 5 years ago

Extend the cache lifetime for static images (related #286) image

https://www.webpagetest.org/performance_optimization.php?test=191110_MK_600ab0b42469c8fb2a03fdbf34132dd6&run=1#cache_static_content

rviscomi commented 5 years ago

Mobile-specific data viz is still loaded on desktop chapter pages. I would have thought the display: none style prevented this.

image https://www.webpagetest.org/result/191110_J7_f4bc44580c3abfae6451fa9b72659012/1/details/#waterfall_view_step1

rviscomi commented 5 years ago

Embedded data viz iframes should be lazy loaded until in view (#351).

Author avatars should also be lazy loaded.

image

https://www.webpagetest.org/result/191110_J7_f4bc44580c3abfae6451fa9b72659012/1/details/#waterfall_view_step1

AymenLoukil commented 5 years ago

Extend the cache lifetime for static images image

https://www.webpagetest.org/performance_optimization.php?test=191110_MK_600ab0b42469c8fb2a03fdbf34132dd6&run=1#cache_static_content Already detected in #286

AymenLoukil commented 5 years ago

Also i would add that many resources paths are relative and not absolute. This could make additional time to check protocol / redirect etc.

tunetheweb commented 5 years ago

Also i would add that many resources paths are relative and not absolute. This could make additional time to check protocol / redirect etc.

Nah, don't think that's the case! Relative paths are gooood.

AymenLoukil commented 5 years ago

Also i would add that many resources paths are relative and not absolute. This could make additional time to check protocol / redirect etc.

Nah, don't think that's the case! Relative paths are gooood.

You serious ? ^^

rviscomi commented 5 years ago

I'm not aware of any reason not to use relative paths. It doesn't incur any overhead AFAIK. Browsers can quickly canonicalize the URL on the fly.

Maybe you're thinking of navigating to a page, for example example.com which redirects to https://example.com which redirects to https://example.com/?

tunetheweb commented 5 years ago

You serious ? ^^

Yes. What's the concern here?

You mean it says things like:

<img src="/static/...

As opposed to:

<img src="https://almanac.httparchive.org/static/...

??

AymenLoukil commented 5 years ago

I'm not aware of any reason not to use relative paths. It doesn't incur any overhead AFAIK. Browsers can quickly canonicalize the URL on the fly.

Maybe you're thinking of navigating to a page, for example example.com which redirects to https://example.com which redirects to https://example.com/?

This is ok but i'm talking about resources paths. I believe that since we should be on HTTPS, it would be better to precise the absolute path. Webhint does check for it : https://webhint.io/docs/user-guide/hints/hint-no-protocol-relative-urls/ Google Lighthouse team are thinking to implement this audit too. There is also a security concern but mainly for HTTP websites / apps. @bazzadp

In our case, i don't think it has a visible impact, just a best practice.

tunetheweb commented 5 years ago

https://webhint.io/docs/user-guide/hints/hint-no-protocol-relative-urls/

That means don't do this:

<script src="//www.google-analytics.com/analytics.js" async></script>

As that loads http://www.google-analytics.com/analytics.js when on HTTP version of your site and https://www.google-analytics.com/analytics.js when on HTTPS and better to load same over both as will be referenced in cache for both so saves a double download. Plus better for security (if an assets is available over HTTPS then why not load it over HTTPS!).

It has nothing to do with relative references to links on your own site like the examples I gave in https://github.com/HTTPArchive/almanac.httparchive.org/issues/413#issuecomment-552178172

tunetheweb commented 5 years ago

It has nothing to do with relative references to links on your own site like the examples I gave in #413 (comment)

Well technically I suppose it it could be same thing if we had mixed site (i.e. some pages on HTTP and some on HTTPS) as, for example, specifying our logo always being loaded over HTTPS even on HTTP would prevent a double download. But since we don't have a mixed site, this is a non-issue.

Relative paths allow for cleaner, less-verbose code and launching dev envs while loading assets locally.

AymenLoukil commented 5 years ago

https://webhint.io/docs/user-guide/hints/hint-no-protocol-relative-urls/

That means don't do this:

<script src="//www.google-analytics.com/analytics.js" async></script>

As that loads http://www.google-analytics.com/analytics.js when on HTTP version of your site and https://www.google-analytics.com/analytics.js when on HTTPS and better to load same over both as will be referenced in cache for both so saves a double download. Plus better for security (if an assets is available over HTTPS then why not load it over HTTPS!).

It has nothing to do with relative references to links on your own site like the examples I gave in #413 (comment)

Ok thank you @bazzadp :)

tunetheweb commented 5 years ago

Oh and btw the title of that page is "No Protocol-Relative URLs" not "No Relative URLs"

paulcalvano commented 5 years ago

Just opened an issue about the static content TTL and then realized it's already being tracked here - https://github.com/HTTPArchive/almanac.httparchive.org/issues/423.

catalinred commented 5 years ago

The home page background image is very large and provides contrast to the white text at the top of the page. Because it's so large, it takes a few seconds to load for the first time, so the text is invisible, except for the yellow buttons:

image

https://www.webpagetest.org/video/compare.php?tests=191110_J7_f4bc44580c3abfae6451fa9b72659012,191110_MK_600ab0b42469c8fb2a03fdbf34132dd6,191110_HG_49bdb23fcbc8952c67d75e0c08a6b198,191110_8T_02272f25cf4fcf6bc73f58e1ea8f2e1e,191110_M6_ec2abd0fd85a56c09523e42337d706e0

We should use a CSS-based background color as a placeholder until the image loads, and/or look into optimizing that background image so it doesn't load as slowly.

This something I'd be interested to work on. Is there a separate issue for this?

Also, I'd need the vector source for this file: https://almanac.httparchive.org/static/images/intro-background-fit.png because we also need to make it .svg for HiDPI displays.

rviscomi commented 5 years ago

SVG sounds like a good idea. @mikegeyser was also looking into improving the performance of this image. Did you make any progress, Mike?

catalinred commented 5 years ago

SVG sounds like a good idea. @mikegeyser was also looking into improving the performance of this image. Did you make any progress, Mike?

In case @mikegeyser needs some help here, I'd recommend turning the intro-background-fit.png image into a .svg and then encode it to base64 to avoid the above webpagetest scenario when having white text on white background until the intro-background-fit.png request is fulfilled.

intro-background-fit.png is suitable to convert to a .svg file by not having lots of detail. We'd have a background image from the start so no white text on a white (blank) background and we'd have minus one HTTP request in the background.

What do you think?

mikegeyser commented 5 years ago

I'll definitely take a look at doing that @catalinred. The alternative that I was playing around with on Sunday was using a solid background and lazy loading the image and swapping it. It does mean including some javascript to do the switch, however, which I'm not sure about.

Seems like it's all tradeoffs...

mikegeyser commented 5 years ago

@catalinred @rviscomi I pushed what I was working on as a part of #485. I'm really not tied to that way of doing it, and would like feedback.

We can embed the image as a data uri directly, and that's simpler, but it does seem as there is a non-trivial tradeoff in making the CSS file larger with image data (although that link is super old). This was is at least still cacheable?

Anyway, let me know what you think. :)

rviscomi commented 5 years ago

@bazzadp tackled the lazy loading issues in #484 (ticking those checkboxes as resolved)

tunetheweb commented 5 years ago

Edit: Ignore me. Getting my issues mixed up. Yes Lazy Loading is sorted.

rviscomi commented 5 years ago

No worries. The fix was just deployed live so it's worth rerunning another performance test to validate it.

AymenLoukil commented 5 years ago

I run Lighthouse, Lazy loading seems to work fine now.

This image is big (400 KB ! ) to resize + compress. https://almanac.httparchive.org/static/images/methodology-banner.png

Why not self host contributors avatars ? Erik Nygren's one > 160 KB.

Figures images could be lighter on size.

tunetheweb commented 5 years ago

I run Lighthouse, Lazy loading seems to work fine now.

👌

This image is big (400 KB ! ) to resize + compress. https://almanac.httparchive.org/static/images/methodology-banner.png

I thought it did that automatically?

Why not self host contributors avatars ?

I'm not sure I agree. I think it's nice for people to be able to alter their avatars on GitHub rather than have to do a PR.

Erik Nygren's one > 160 KB. Figures images could be lighter on size.

We load the 200px version. Now on author pages we only load a 60 x 60 pixel one so that's arguably a bit big. But only one image, that's now lazy-loaded, so not a big deal.

However on contributors page we load 100 x 100 and with retina screens, 200 x 200 is probably better. Again they are lazy-loaded so not too concerned.

AymenLoukil commented 5 years ago

We load the 200px version. Now on author pages we only load a 60 x 60 pixel one so that's arguably a bit big. But only one image, that's now lazy-loaded, so not a big deal.

Not always :) since it is shuffling on reload

tunetheweb commented 5 years ago

We load the 200px version. Now on author pages we only load a 60 x 60 pixel one so that's arguably a bit big. But only one image, that's now lazy-loaded, so not a big deal.

Not always :) since it is shuffling on reload

Contributor is shuffling on reload - not chapters. See my second comment for contributors page 😀

AymenLoukil commented 5 years ago

Contributor is shuffling on reload - not chapters. See my second comment for contributors page

Yeah i mean, when Erik Nygren's is displayed on the "Above the fold zone", it won't be lazy loaded.