acl-org / acl-anthology

Data and software for building the ACL Anthology.
https://aclanthology.org
Apache License 2.0

optimize page loading times #824

Open akoehn opened 4 years ago

akoehn commented 4 years ago

@mjpost mentioned SEO in #434. I had a look at the Google Search Console some months ago, and even though our pages are static, they are classified as moderately slow to load. I did some experiments and found that this is due to our external dependencies -- loading is fast once they are in the user's cache but takes some time on the first load.

I propose that we 1) host them ourselves to enable HTTP/2 streaming or whatever and 2) minimize some of them, e.g. only package the parts of the fonts we need.

mjpost commented 4 years ago

Hosting ourselves and targeting only what we need sounds good. I wonder if our real problem here, though, is the volume and event pages, which after all can be quite large (e.g., the acl-2019 event) and perhaps not very useful. We could have event pages just link to all the contained volumes, without also concatenating all their papers. (This is what we do for venues).

mbollmann commented 4 years ago

In theory, serving remotely from CDNs is supposed to make things faster overall by allowing more intelligent caching. I know too little about the topic to have a definitive opinion here, but I'm not entirely convinced that hosting jQuery, Bootstrap etc. locally will actually make things faster.

Then again, I didn't run any experiments or benchmarks on this yet. Is it some particular dependency that causes a noticeable slowdown, or is it really just the bulk of them?

akoehn commented 4 years ago

I ran the speed test in Google Chrome, which is recommended by Google IIRC. The problem there is definitely loading the resources, especially academicons (which is -- contrary to what I wrote from memory -- hosted by us).

Recommendations by the chrome audit:

Web fonts account for more than 200 kB of traffic when loading a page, which is quite a lot!

Also, we don't seem to have meta descriptions, and we use old versions of jQuery & Bootstrap, which have known security problems (not applicable to our static site, but the audit complains nonetheless).
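To see how much each asset type weighs before deciding what to trim, the files in a built site can simply be summed up. This is a minimal sketch: the `ASSET_TYPES` grouping and the function name `asset_weight` are made up for illustration, and it measures on-disk sizes, not the compressed bytes the audit reports (it also counts every font format, although a browser fetches only one).

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical grouping of file extensions into asset types (an assumption,
# not taken from the project's build configuration).
ASSET_TYPES = {
    ".woff": "font", ".woff2": "font", ".ttf": "font", ".eot": "font",
    ".css": "css",
    ".js": "js",
}


def asset_weight(root):
    """Return the total size in bytes per asset type for all files under root."""
    totals = defaultdict(int)
    for path in Path(root).rglob("*"):
        kind = ASSET_TYPES.get(path.suffix.lower())
        if kind and path.is_file():
            totals[kind] += path.stat().st_size
    return dict(totals)
```

Calling `asset_weight("path/to/build")` on the generated site directory (the path is a placeholder) returns a dict such as `{"font": ..., "css": ..., "js": ...}`, which gives a rough upper bound to compare against the audit numbers.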

mbollmann commented 4 years ago

The CSS is already built by cherry-picking which Bootstrap components to include, so I'm not sure we can optimize it much further without affecting maintainability.

Web fonts, possibly, I'll have to look into it.

Speaking from experience, jQuery & Bootstrap are unfortunately often a pain to upgrade without affecting existing layout & functionality, so while this would be desirable, it would need thorough testing to ensure that nothing breaks.

akoehn commented 4 years ago

I used the Chrome-internal tool, but this web-based one is essentially the same: https://developers.google.com/speed/pagespeed/insights/?hl=de&url=https%3A%2F%2Fwww.aclweb.org%2Fanthology%2FW19-8643%2F

@mjpost could the server gain a sensible caching strategy, at least for the static assets such as fonts, js, and css? I don't know who controls what there, but maybe it can be achieved with a little htaccess magic.
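One way the "htaccess magic" could look, assuming the server runs Apache with `mod_expires` enabled and allows `.htaccess` overrides (none of which is confirmed in this thread): the sketch below renders an expiry block from a table of MIME types. The lifetimes in `LIFETIMES` and the function name `expires_rules` are illustrative assumptions, not a recommendation for this site.

```python
# Hypothetical cache lifetimes per MIME type; the values are assumptions.
LIFETIMES = {
    "text/css": "1 month",
    "application/javascript": "1 month",
    "font/woff2": "1 year",
    "font/woff": "1 year",
}


def expires_rules(lifetimes):
    """Render an Apache mod_expires block for the given MIME types."""
    lines = ["<IfModule mod_expires.c>", "  ExpiresActive On"]
    for mime, lifetime in lifetimes.items():
        # "access plus <interval>" counts from the time of the client's request.
        lines.append(f'  ExpiresByType {mime} "access plus {lifetime}"')
    lines.append("</IfModule>")
    return "\n".join(lines)


print(expires_rules(LIFETIMES))
```

Long lifetimes are only safe if an asset's URL changes whenever its content changes (for example via a hash in the filename); otherwise returning visitors may keep a stale CSS or JS file until it expires.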

akoehn commented 4 years ago

I used purgecss today and it shrinks the CSS to about 1/8th of its original size, making page rendering faster due to both less network traffic and less computation. We are talking about 176 kB of CSS that needed to be parsed before.

I don't see why we load the Bootstrap JS -- it does not seem to be used. We could get rid of popper.js and just use plain tooltips, or do we need the fancy ones?
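For anyone unfamiliar with the idea, here is a toy illustration of what a CSS purge does: collect the class names that occur in the HTML and drop rules whose class selectors never appear. It is not purgecss's actual implementation; the names `used_classes` and `purge` are made up, and the sketch only handles flat rules (no `@media` blocks, no comments).

```python
import re
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every class name used in an HTML document."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def used_classes(html):
    collector = ClassCollector()
    collector.feed(html)
    return collector.classes


RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")   # flat "selector { body }" rules
CLASS = re.compile(r"\.(-?[_a-zA-Z][\w-]*)")  # ".class-name" inside a selector


def purge(css, html):
    """Keep only the rules whose class selectors all occur in the HTML.

    Rules without any class selector (e.g. "body") are always kept.
    """
    used = used_classes(html)
    kept = []
    for selector, body in RULE.findall(css):
        if set(CLASS.findall(selector)) <= used:
            kept.append(f"{selector.strip()} {{{body.strip()}}}")
    return "\n".join(kept)
```

A caveat that matters for this site: classes that only get added by JavaScript at runtime (Bootstrap's collapse component toggles `show` and `collapsing`, for instance) never appear in the static HTML, so a real purge run needs to safelist them or the collapsible abstracts may lose their styling.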

I will push the changes to a branch sometime this week so you can have a look if you want.

mbollmann commented 4 years ago

Bootstrap JS is used for collapsible elements (abstracts in overview pages), and possibly for responsive behaviour of the navbar. (See "Components requiring JavaScript")

akoehn commented 4 years ago

Both work without it, I tried that.

mbollmann commented 4 years ago

Huh, interesting. I'll try that too when I have the time. In any case the JS shouldn't increase page loading times, as it's loaded after the content, if I understand that right.

Also happy to try the purged CSS version to see if anything looks different.