catchpoint / WebPageTest

Official repository for WebPageTest

Increase in start render and SpeedIndex for Chrome since 2017-01-24? #797

Closed soulgalore closed 7 years ago

soulgalore commented 7 years ago

Hi Pat, we have an increase in start render and Speed Index for the URLs we test using Chrome for WebPageTest that started to happen late yesterday (24th). For some URLs it adds 30-40%. It only happens for Chrome on desktop. We cannot see any change in the RUM data we collect, and our other testing doesn't pick up any change in start render and SpeedIndex for those URLs.

Here's an example: http://wpt.wmftest.org/video/compare.php?tests=170125_Q3_7Y,170124_SR_K5

We collect the devtools timeline for the tests that get the increased metrics. Could the changes that were pushed yesterday be the reason for that?

Is anyone else seeing the same thing?

Best, Peter

pmeenan commented 7 years ago

I rolled the category collection back but the timelines look similar (longer layout and script times than the "good" case you showed) and the render times still look longer: https://www.webpagetest.org/result/170125_RH_a71150463921053d9cc0be8ad4340090/

pmeenan commented 7 years ago

Looks like with timeline disabled entirely the render times are still longer: https://www.webpagetest.org/result/170125_CE_e8ab4e183380a8984f31fea6dc86ca59/

Do you test from multiple locations? Do you see the increase at the same time from all of them? Trying to see if maybe it is a problem with the EC2 instances or something in AWS US east.

soulgalore commented 7 years ago

Hmm yes. I've gone through our server log and nothing went out (at least nothing that I could see that could cause that, but I could of course have missed something). We only run from one location so that could be a thing; the strange thing is that I only see the diff for Chrome.

We have started to test out our new alerts, where we check three URLs for Chrome/Firefox/IE, and to get an alert all three URLs for one browser must reach the configured limit. For Chrome the change is 16, 38 and 25% higher SpeedIndex compared to yesterday. For the rest of the browsers the change is in the span -11% -> 1%. Of course it could be something with how we measure, but going back in time at least a month I cannot see anything like this before. We run the tests every hour and they are consistently higher.
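The alert rule described above (alert only when all three URLs for one browser reach the limit) can be sketched as follows. This is a minimal illustration, not the actual alerting code: the function names, the 10% limit, and the per-URL values for Firefox and IE are assumptions made for the example. Only the Chrome values (16, 38 and 25%) come from the comment above.

```python
from typing import Dict, List


def percent_change(previous: float, current: float) -> float:
    """Relative change from previous to current, in percent."""
    return (current - previous) / previous * 100.0


def browsers_to_alert(changes: Dict[str, List[float]], limit_pct: float) -> List[str]:
    """Return the browsers for which every tested URL changed by at least limit_pct."""
    return [
        browser
        for browser, url_changes in changes.items()
        if url_changes and all(change >= limit_pct for change in url_changes)
    ]


# Chrome values are the ones quoted in the comment. The Firefox and IE values
# are made-up placeholders inside the reported -11% to 1% span.
speedindex_change_pct = {
    "chrome": [16.0, 38.0, 25.0],
    "firefox": [-11.0, -4.0, 1.0],
    "ie": [-6.0, 0.0, 1.0],
}

LIMIT_PCT = 10.0  # hypothetical limit, not taken from the thread

print(browsers_to_alert(speedindex_change_pct, LIMIT_PCT))
```

Requiring all URLs to cross the limit makes the alert insensitive to a regression on a single page, which fits the observation that only Chrome moved on every URL at once.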

Here's another page before/after: http://wpt.wmftest.org/video/compare.php?tests=170125_JK_DH,170124_YT_GT

And then the one with the smallest change (but consistently higher values than yesterday): http://wpt.wmftest.org/video/compare.php?tests=170125_XZ_6G,170124_DG_CT

I run an instance of sitespeed.io checking the same URLs and am not seeing the same change (but that is on Digital Ocean).

I'll check the agent instance tonight and get back when the kids are asleep :)

pmeenan commented 7 years ago

FWIW, Chrome 56 is going to be rolling out "really soon" too, which is going to introduce more changes (Firefox 51 is rolling out right now). Chrome 55 also had a problem with variable performance (fixed in 56 from some early testing but not positive yet) and I'm wondering if for some reason it just started happening more :-/

Anyway, still looking. I'm not seeing an overall jump in the regular testing I do across a few locations, but those have more CPU headroom.

soulgalore commented 7 years ago

We never saw the 55 problem reported by others. I have a feeling that it only happened on smaller instances on AWS?

Your fix actually solved the problem for us, thanks!

[Screenshot taken 2017-01-25 at 8:41:58 PM]

All three URLs are back to normal. Is that a sign that we run on too small an AWS instance?

pmeenan commented 7 years ago

Actually, it might be a sign that my before/after, where I didn't see the improvement, is running on too small an instance so it is always in the "slower" mode. Your pages in particular tend to be a race to see if Chrome will render before or after the main script goes to get hooked up, and apparently the change was just enough to push it past the threshold on your instances.

Glad to hear the change was responsible. I'm going to leave the new categories off and only include them when traces are collected so we should be good going forward.

soulgalore commented 7 years ago

@pmeenan do you mean "main script" = our JavaScript or some main script for WPT?

pmeenan commented 7 years ago

Your JavaScript. It's more of an aside but something I noticed while working on Chrome. I added logic to Chrome a while back (1-2 years) to yield the parser before starting in-body script if a lot of content had been processed since the last yield, to hopefully get it to paint before running late-body script.

It's a bit racy because it depends on layout being ready and actual painting is tied to vsync, but the slower times I'm pretty sure are cases where it didn't catch the early paint cycle and it has to wait until after parse/eval of the script before painting.
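The race described above can be illustrated with a toy timing model. This is not Chrome's implementation: the function names, the 60 Hz vsync interval, and all timings are assumptions chosen for the example. The model only encodes the two statements from the comment: painting happens on a vsync tick once layout is ready, and if that tick is missed before the late-body script starts, painting waits until the script has finished.

```python
import math


def next_vsync(time_ms: float, vsync_interval_ms: float) -> float:
    """First vsync tick at or after time_ms, with ticks at multiples of the interval."""
    return math.ceil(time_ms / vsync_interval_ms) * vsync_interval_ms


def first_paint_time(
    layout_ready_ms: float,
    script_start_ms: float,
    script_duration_ms: float,
    vsync_interval_ms: float = 1000 / 60,  # assumed 60 Hz display
) -> float:
    """Toy model: paint on the early tick if it lands before the script starts,
    otherwise on the first tick after the script's parse/eval has finished."""
    early_tick = next_vsync(layout_ready_ms, vsync_interval_ms)
    if early_tick <= script_start_ms:
        return early_tick  # caught the early paint cycle
    script_end_ms = script_start_ms + script_duration_ms
    return next_vsync(max(layout_ready_ms, script_end_ms), vsync_interval_ms)


# Hypothetical timings: the script starts at 105 ms and runs for 300 ms.
fast = first_paint_time(layout_ready_ms=95, script_start_ms=105, script_duration_ms=300)
slow = first_paint_time(layout_ready_ms=102, script_start_ms=105, script_duration_ms=300)
print(f"layout ready at 95 ms  -> first paint at {fast:.0f} ms")
print(f"layout ready at 102 ms -> first paint at {slow:.0f} ms")
```

In this model a few extra milliseconds of layout work are enough to miss the early tick, so first paint jumps by roughly the full script duration. That is the kind of threshold effect a small amount of added overhead on a CPU-constrained instance could trigger.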

pmeenan commented 7 years ago

FYI, Chrome 56 rolled out earlier this afternoon (in case anything moves).