ProjectSidewalk / SidewalkWebpage

Project Sidewalk web page
http://projectsidewalk.org
MIT License
84 stars 24 forks source link

Improve website loading times #1338

Open misaugstad opened 5 years ago

misaugstad commented 5 years ago

Related to #578 which is worth reading for anyone trying to address this issue.

I just tried loading the landing page and the audit page on my laptop at https://sidewalk.cs.washington.edu and both pages took about 9 seconds to load. At https://sidewalk.umiacs.umd.edu it takes about 7 seconds. In both cases, the load times are too slow, but there is also something about our set up at UW that is slowing the load time significantly.

Luckily, actually auditing does not seem to be a problem for me (although others should see how auditing goes on different internet speeds).

Some tactics we might pursue to decrease load times:

jonfroehlich commented 5 years ago

One thing about cs.washington.edu is it sits on 'Waiting for sidewalk.cs.washington.edu' for a very long time before starting to load the page (see bottom left of screenshot)

image

jonfroehlich commented 5 years ago

@paarig brought this up again and said "I could even see people loading this site and then leaving (because they think it's down or just don't want to deal with a laggy site)." I agree and I think someone needs to investigate this and try to resolve it before our Newberg deployment.

misaugstad commented 5 years ago

I agree. @jonfroehlich who do you think might have the expertise to really succeed at tackling this? @athersharif maybe?

jonfroehlich commented 5 years ago

Yes, definitely @athersharif. But I worry about weighing him down as we still have to finalize the Docker sidewalk setup (his task) and he primarily needs to focus on on temporal tracking urban accessibility.

There is a new student that I'm interviewing next week who has a ton of dev experience... he could br good and this might be a reasonable first task.

Update: Also, didn't we talk about @athersharif looking at trying to help you update the libraries in the codebase? This is also one of those complex tasks, which take an unpredictable amount of time.

tongning commented 5 years ago

Supplementing @jonfroehlich's screenshot, I re-ran Google Pagespeed Insights and it looks like by far the biggest bottleneck is waiting for the server to respond. I think this points to a problem of the server taking a long time to generate the page, as opposed to the content of the page or the way the content is transferred (though there are certainly optimizations to be made there as well per #578). I'm wondering if there are any costly database queries in generating the front page that are slowing things down?

image

misaugstad commented 5 years ago

Okay yeah I found something. The auditedStreetDistance function is being called 6 times (3 of those times by itself, and 3 of those times are from within streetDistanceCompletionRate). And the function totalStreetDistance is being called within the streetDistanceCompletionRate function as well. The auditedStreetDistance and totalStreetDistance functions take ~0.2 seconds each on my local machine. If we passed both in as parameters and did the division that streetDistanceCompletionRate does right there in the HTML, we would be shaving off a total of 7 function calls (5 from auditedStreetDistance and 2 from totalStreetDistance), saving a total of 1.4 seconds.

The server may be faster than my local machine, but I imagine we would still be speeding up page load time by at about a second on the server.

andrew4699 commented 5 years ago

Here are things we can improve (in descending order of importance):

  1. Cache totalStreetDistance and auditedStreetDistance. I implemented a super bad test cache in < 10 LoC for those 2 and my page loading time went down by 5x. Do we have any caching solution (or global data store) currently in place, either for Postgres or Scala? If not then I can look into what the best approach would be. We should cache in the model rather than the view because that would violate MVC principles.
  2. Enable GZIP. Low time investment for a moderate reward.
  3. Defer some of the JS & CSS.
  4. Remove unused CSS rules in the build process.
  5. Optimize images (size/next-gen format/encoding).

1 is absolutely necessary and will already do a great deal of savings. Considering our time constraints, do we want to save 2-5 for later, assuming that the initial load time after doing 1 is ~2 seconds?

jonfroehlich commented 5 years ago

Nice breakdown Andrew. Thanks. Let’s pursue the first option and see how far that gets us.

Sent from my iPhone

On Jan 23, 2019, at 10:58 PM, Andrew Guterman notifications@github.com wrote:

Here are things we can improve (in descending order of importance):

Cache totalStreetDistance and auditedStreetDistance. I implemented a super bad test cache in < 10 LoC for those 2 and my page loading time went down by 5x. Do we have any caching solution currently in place, either for Postgres or Scala? If not then I can look into what the best approach would be. We should cache in the model rather than the view because that would violate MVC principles. Enable GZIP. Low time investment for a moderate reward. Defer some of the JS & CSS. Remove unused CSS rules in the build process. Optimize images (size/next-gen format/encoding). 1 is absolutely necessary and will already do a great deal of savings. Considering our time constraints, do we want to save 2-5 for later, assuming that the initial load time after doing 1 is ~2 seconds?

β€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

misaugstad commented 5 years ago

Oops I only responded to the question about caching on Slack, so I'll paste what I said here:

check app/controllers/ProjectSidewalkAPIController.scala I think there is some caching used there

andrew4699 commented 5 years ago

See this branch; https://github.com/ProjectSidewalk/SidewalkWebpage/tree/1338-nextcity-improve-load-time.

A simple 10 minute cache is implemented for auditedStreetDistance and totalStreetDistance. Note that this means some of the stats on the home page will update every 10 minutes. The 10 minutes does not get reset (I tested this) when the cached value is accessed.

Can we deploy this specific branch onto the test server to test performance?

paarig commented 5 years ago

@misaugstad @andrew4699 Saw this and thought I might add that we could use localStorage to cache on the client side. I know that's what was used to cache relevant data on the project I worked on this summer.

jonfroehlich commented 5 years ago

But that doesn't help with the initial load time, which is most important. Plus, caching server side helps everyone. Caching in local storage only helps the client.

On Thu, Jan 24, 2019 at 3:55 PM Paari Gopal notifications@github.com wrote:

@misaugstad https://github.com/misaugstad @andrew4699 https://github.com/andrew4699 Saw this and thought I might add that we could use localStorage to cache on the client side. I know that's what was used to cache relevant data on the project I worked on this summer.

β€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ProjectSidewalk/SidewalkWebpage/issues/1338#issuecomment-457403288, or mute the thread https://github.com/notifications/unsubscribe-auth/ABi-9fx9oK8Q287ZnQibNlj35GkMBX0eks5vGkfngaJpZM4YaXlY .

-- Jon Froehlich Associate Professor Paul G. Allen School of Computer Science & Engineering University of Washington http://www.cs.umd.edu/~jonf/ http://makeabilitylab.io @jonfroehlich https://twitter.com/jonfroehlich - Twitter

athersharif commented 5 years ago

sorry, late to the party. fortunately, Chrome provides plenty of debugging tools to debug page load times πŸ™‚. see below:

screen shot 2019-01-24 at 4 00 46 pm

the main video file is 3.7MB (ouch!), takes a good 2.27 seconds to load (bigger ouch!). i think our first step towards this issue should be to address that πŸ˜† the second would be to see if we can somehow optimize the leaflet tiles. there are plenty of them averaging like a 100kb (oof) each. can we cache those?

@andrew4699 @paarig love the ideas on caching - totally should explore that (tho i'm a bit biased towards sessionStorage vs localStorage) πŸ™‚! and not just the client, but the server side too (something like redis). edit: just read @jonfroehlich's comment - its πŸ’―

lastly, should use some sort of a loading icon or something and lazy load the assets. UX wise it's just better for people to see something is loading than a black screen πŸ˜†

misaugstad commented 5 years ago

@andrew4699 we won't put specific branches on the test server very often (I don't want to both @mechanicjay more than we have to). But you can always submit a PR, we merge and test on the test server, and then undo the changes if we need to.

andrew4699 commented 5 years ago

The goal is to reach the critical rendering path ASAP (which is where the user sees an "ugly" version of what they need). The order in which CSS/JS files are loaded, their placement within the HTML, and whether they're asynchronously loaded affects that. However, the size is less of an issue because the page is shown before most things are done downloading. That's why I put (4) and (5) at the bottom of my priority list.

@misaugstad How should we proceed then? The branch only has (1) implemented because we want to see how far that gets us. One idea is to create separate issues for each of the 5 things in the list.

athersharif commented 5 years ago
screen shot 2019-01-24 at 4 16 46 pm

think this is a better screenshot (the previous one was sorted by load time, this one is by size). there's a network call to https://sidewalk.cs.washington.edu/neighborhoods that returns 1.6mb of a json file - can definitely use redis for that.

misaugstad commented 5 years ago

@andrew4699 I would say to submit a PR, we test and merge, decide whether it is fast enough. If it is fast enough, we close the issue, if it isn't, you can try something else to further improve load time and submit a new PR. Rinse and repeat :grin:

andrew4699 commented 5 years ago

@jonfroehlich @misaugstad How do you feel about the PR that went onto the test server? Is it fast enough or should I continue down my list of improvements?

jonfroehlich commented 5 years ago

What are the pros and cons of enabling gzip?

Sent from my iPhone

On Jan 26, 2019, at 9:35 PM, Andrew Guterman notifications@github.com wrote:

@jonfroehlich @misaugstad How do you feel about the PR that went onto the test server? Is it fast enough or should I continue down my list of improvements?

β€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

andrew4699 commented 5 years ago

Pros:

Cons:

jonfroehlich commented 5 years ago

Can you investigate a bit more and report back?

Thanks!

Sent from my iPhone

On Jan 27, 2019, at 10:23 AM, Andrew Guterman notifications@github.com wrote:

Pros:

Time to download assets (HTML/CSS/JS/images/videos) is reduced Enable once, applies to everything we add to the website in the future Cons:

I need to figure out exactly how our web requests are served and learn how to configure Play β€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

andrew4699 commented 5 years ago

GZip ended up being very easy to implement because it's a feature of Play that we just needed to enable. I have a local commit with it enabled but I'm not sure what happens if I push to this branch since it has already been merged. Should we make this a separate issue/branch?

misaugstad commented 5 years ago

@andrew4699 it will be fine if you push to this branch. you can then create another PR to merge this branch into develop again, no problem.

jonfroehlich commented 5 years ago

@andrew4699, did you observe any speedup changes with gzip enabled? Can you post some analytics?

andrew4699 commented 5 years ago

I didn't notice any change in loading time, but the download size decreased from 16MB to 13MB (about a 20% savings). Generally, gzip will largely benefit those with slower internet connections and those with fast connection times don't notice a difference. Plus, it's already implemented so no additional work is necessary.

jonfroehlich commented 5 years ago

Sure, sounds good. And I imagine we're not getting a huge decrease because the images and video are already compressed...

On Wed, Jan 30, 2019 at 1:34 PM Andrew Guterman notifications@github.com wrote:

I didn't notice any change in loading time, but the download size decreased from 16MB to 13MB (about a 20% savings). Generally, gzip will largely benefit those with slower internet connections and those with fast connection times don't notice a difference. Plus, it's already implemented so no additional work is necessary.

β€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ProjectSidewalk/SidewalkWebpage/issues/1338#issuecomment-459118211, or mute the thread https://github.com/notifications/unsubscribe-auth/ABi-9Ue1HIrWDFulLymuN6lA_RxLcgKYks5vIg_jgaJpZM4YaXlY .

-- Jon Froehlich Associate Professor Paul G. Allen School of Computer Science & Engineering University of Washington http://www.cs.umd.edu/~jonf/ http://makeabilitylab.io @jonfroehlich https://twitter.com/jonfroehlich - Twitter

andrew4699 commented 5 years ago

I think more optimization should be done. On UW wifi it takes ~4 seconds to see the text at the top of the landing page for first time users. On "Fast 3G" wifi it takes ~8 seconds. If we continue with (3) then I should be able to get more important things like that text to appear first.

jonfroehlich commented 5 years ago

Sure. Let’s try (3) then!

Sent from my iPhone

On Jan 30, 2019, at 6:50 PM, Andrew Guterman notifications@github.com wrote:

I think more optimization should be done. On UW wifi it takes ~4 seconds to see the text at the top of the landing page for first time users. On "Fast 3G" wifi it takes ~8 seconds. If we continue with (3) then I should be able to get more important things like that text to appear first.

β€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

andrew4699 commented 5 years ago

How much quality are you willing to lose from the hero video? The following version saves 2MB: https://puu.sh/CFhS2/e797f408fd.webm (can't upload videos to GitHub).

jonfroehlich commented 5 years ago

I'd prefer not to lose any quality.

On Thu, Jan 31, 2019 at 1:09 PM Andrew Guterman notifications@github.com wrote:

How much quality are you willing to lose from the hero video? The following version saves 2MB:

https://camo.githubusercontent.com/16777da80b1e8bcdfc72da9800164eb3dfab1051/68747470733a2f2f7075752e73682f4346684f452f386134306435353361652e676966

(Note that this is a gif of a video so it can be unclear. Here's the actual video file: https://puu.sh/CFhS2/e797f408fd.webm)

β€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ProjectSidewalk/SidewalkWebpage/issues/1338#issuecomment-459507120, or mute the thread https://github.com/notifications/unsubscribe-auth/ABi-9eGVWdslM-0Zm92Uxe1Bvd2Z9i34ks5vI1tmgaJpZM4YaXlY .

-- Jon Froehlich Associate Professor Paul G. Allen School of Computer Science & Engineering University of Washington http://www.cs.umd.edu/~jonf/ http://makeabilitylab.io @jonfroehlich https://twitter.com/jonfroehlich - Twitter

mechanicjay commented 5 years ago

A note or two on caches. 1) I note that there are no cache-control headers getting sent, so clients need to go back to the server a lot, needlessly. Adding a cache-control header at the front-end proxy is trivial. It doesn't help initial page load, but should help immensely when navigating around the site. That would also alleviate some load from the back-end, allowing it concentrate on the important work.

2) We should also be able to setup an in-memory cache at the front-end proxy for static resources (js/css/images), that would get those resources out to the clients fast (on first load) and again lets the back-end do the important work. I've not implemented one of these in apache yet, but the stuff built into the new version looks quite promising and I'd be happy to explore this option if it's deemed necessary.

jonfroehlich commented 5 years ago

Both those things sound good. How much work is this to do?

On Thu, Jan 31, 2019 at 2:57 PM mechanicjay notifications@github.com wrote:

A note or two on caches.

1.

I note that there are no cache-control headers getting sent, so clients need to go back to the server a lot, needlessly. Adding a cache-control header at the front-end proxy is trivial. It doesn't help initial page load, but should help immensely when navigating around the site. That would also alleviate some load from the back-end, allowing it concentrate on the important work. 2.

We should also be able to setup an in-memory cache at the front-end proxy for static resources (js/css/images), that would get those resources out to the clients fast (on first load) and again lets the back-end do the important work. I've not implemented one of these in apache yet, but the stuff built into the new version looks quite promising and I'd be happy to explore this option if it's deemed necessary.

β€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ProjectSidewalk/SidewalkWebpage/issues/1338#issuecomment-459539018, or mute the thread https://github.com/notifications/unsubscribe-auth/ABi-9VcXm9BW8BH-GWQlpZwi8Oa_9onbks5vI3TCgaJpZM4YaXlY .

-- Jon Froehlich Associate Professor Paul G. Allen School of Computer Science & Engineering University of Washington http://www.cs.umd.edu/~jonf/ http://makeabilitylab.io @jonfroehlich https://twitter.com/jonfroehlich - Twitter

andrew4699 commented 5 years ago

I got about a 2 second improvement on local, "Fast 3G" first load by rearranging the load-order of some assets and getting rid of some unused assets on the landing page.

We can add cache-control headers on our end if we decide to go with it. I'm experimenting with them at the moment and it seems like Play has a way to do it but it's poorly documented.

Would the in-memory cache be done through the web server?

misaugstad commented 5 years ago

Oh I just realized that we had some potential speedups that @andrew4699 was working on that we sidelined. @andrew4699 do you think there is a promising speedup here that you wanted to finish up working on and submit a PR for? Or do you think we've done the most important stuff and we should move on?