apache / incubator-pagespeed-mod

Apache module for rewriting web pages to reduce latency and bandwidth.
http://modpagespeed.com
Apache License 2.0
697 stars 159 forks source link

defer_js loads javascript twice #1054

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 9 years ago
Catchpoint is reporting that defer_js is double-loading javascript files: 
http://blog.catchpoint.com/2015/03/18/pagespeed-third-party-issue/

Original issue reported on code.google.com by jefftk@google.com on 19 Mar 2015 at 4:41

GoogleCodeExporter commented 9 years ago
Trying to reproduce with webpagetest:

Working fine:
  Chrome Stable: http://www.webpagetest.org/result/150319_XK_1000/1/details/
  Firefox: http://www.webpagetest.org/result/150319_9K_1008/1/details/

Double Loading:
  Safari: http://www.webpagetest.org/result/150319_QG_100C/1/details/

Single Loading, early:
  Chrome Canary: http://www.webpagetest.org/result/150319_3B_ZZT/1/details/
  IE11: http://www.webpagetest.org/result/150319_PY_100H/1/details/
  IE10: http://www.webpagetest.org/result/150319_78_100Q/1/details/
  IE9: http://www.webpagetest.org/result/150319_SF_100V/1/details/
  IE8: http://www.webpagetest.org/result/150319_4F_1011/1/details/

On all of them the js still only executes once and is still deferred in 
execution even if it loads early or twice.

Original comment by jefftk@google.com on 19 Mar 2015 at 4:57

GoogleCodeExporter commented 9 years ago
Here's a simpler test case:

    http://www.jefftk.com/type-psa-js?PageSpeed=off

Which has:

    <script src="type-ps-js.js" type="psa/js"></script>
    <script type="psa/js">
    document.write("<h1>Inline: This shouldn't appear</h1>");
    </script>

Original comment by jefftk@google.com on 19 Mar 2015 at 5:03

GoogleCodeExporter commented 9 years ago
Here are nine successive runs on 
www.modpagespeed.com/defer_javascript.html?ModPagespeed=on&ModPagespeedFilters=d
efer_javascript with Chrome: http://www.webpagetest.org/result/150319_9W_10E0/

None of them show double loading, and they all show rewrite_javascript.js being 
loaded late.

Original comment by jefftk@google.com on 19 Mar 2015 at 5:17

GoogleCodeExporter commented 9 years ago
Here are nine runs of Chrome on http://www.jefftk.com/type-psa-js?PageSpeed=off 
http://www.webpagetest.org/result/150319_HW_10WF/

And nine runs of Safari: http://www.webpagetest.org/result/150319_3Q_10WJ/

None of them show loading of the type=psa/js external script file.

Original comment by jefftk@google.com on 19 Mar 2015 at 5:19

GoogleCodeExporter commented 9 years ago
I wonder if it's browser-version dependent?

Original comment by jmaes...@google.com on 19 Mar 2015 at 5:26

GoogleCodeExporter commented 9 years ago
Here are nine runs of Chrome Canary on 
http://www.jefftk.com/type-psa-js?PageSpeed=off 
http://www.webpagetest.org/result/150319_7J_11WT/

All of them show loading of the type=psa/js external script file.  So this 
sounds like a recent Chrome change?

Original comment by jefftk@google.com on 19 Mar 2015 at 5:35

GoogleCodeExporter commented 9 years ago
Actually, not all of the Canary runs.  Runs #2 and #4 didn't load it.

Original comment by jefftk@google.com on 19 Mar 2015 at 5:37

GoogleCodeExporter commented 9 years ago
There are going to be some races depending on whether the preload scanner runs 
at all.  If we're never blocked on js, for example, I think it might not run.  
So here's a new test that does some js busywaiting before listing any psa/js 
scripts: http://www.jefftk.com/type-psa-js2?PageSpeed=off

Original comment by jefftk@google.com on 19 Mar 2015 at 5:56

GoogleCodeExporter commented 9 years ago
Possibly relevant Chrome bug: 
https://code.google.com/p/chromium/issues/detail?id=329531

Original comment by jefftk@google.com on 19 Mar 2015 at 5:56

GoogleCodeExporter commented 9 years ago
Here are WebPageTest runs on first Chrome and then Chrome Canary for the new 
test page: http://www.webpagetest.org/result/150319_DD_12NC/ 
http://www.webpagetest.org/result/150319_MJ_12MZ/ 

The new test page doesn't seem to be eliciting different behavior from the old 
one.

Original comment by jefftk@google.com on 19 Mar 2015 at 5:59

GoogleCodeExporter commented 9 years ago
From the discussion on 329531 it sounds like the preload scanner may or may not 
initiate fetches for scripts with unknown types, but this is actually ok.  
Really we just don't want duplicate loads, and the only one of those I've been 
able to reproduce was that Safari run.

Original comment by jefftk@google.com on 19 Mar 2015 at 6:01

GoogleCodeExporter commented 9 years ago
Running more Safari tests.  On http://www.jefftk.com/type-psa-js2?PageSpeed=off 
it doesn't fetch the external js file at all, which is reasonable. 
http://www.webpagetest.org/result/150319_HB_1313/

On 
www.modpagespeed.com/defer_javascript.html?ModPagespeed=on&ModPagespeedFilters=d
efer_javascript where we got the double load before it gets a double load every 
time.

http://www.webpagetest.org/result/150319_YF_131A/

Original comment by jefftk@google.com on 19 Mar 2015 at 6:05

GoogleCodeExporter commented 9 years ago
Testing in Chrome Canary on my mac I'm seeing double loading in devtools 
(attached) on 
http://www.modpagespeed.com/defer_javascript.html?ModPagespeed=on&ModPagespeedFi
lters=defer_javascript

My guess is that the preload scanner initiates a fetch, and then defer_js 
initiates a fetch, and Chrome's not properly figuring out that there's already 
an active fetch for that resource.

One difference between the two requests is that the first one has its Accept 
header set to "*/*" while the second has "image/webp,*/*;q=0.8".

Original comment by jefftk@google.com on 19 Mar 2015 at 9:44

Attachments:

GoogleCodeExporter commented 9 years ago
Actually, I just realized that the devtools response shown in #13 indicates 
three loads of rewrite_javascript.js.  It's possible that I'm getting bad 
results from reloading with Shift+Ctrl+R.

Original comment by jefftk@google.com on 19 Mar 2015 at 9:46

GoogleCodeExporter commented 9 years ago
Yes, after clearing cache we still see three entries in devtools but the second 
two are from cache.  This kind of makes sense: I see why we would get one for 
the preload scanner and then one when we change the type, but I don't see why 
we would get a third one.

Original comment by jefftk@google.com on 19 Mar 2015 at 9:52

Attachments:

JamoCA commented 9 years ago

I encountered this while using IISpeed and was able to prevent some third party resources (jscache.com & tripadvisor.com) from being loaded twice by adding the "Disallow" directive. https://developers.google.com/speed/pagespeed/module/restricting_urls

pagespeed Disallow "*jscache.com*";
pagespeed Disallow "*tripadvisor.com*";

We use a CMS, so it would be difficult to identify and add all third-party rules as directives to the config file. Previously adding pagespeed_no_defer="" to scripts and resources was enough, but this seems to be ignored. (As a result, we've disabled "defer" features altogether.)

I use the options to combine JS & CSS. I had to disable "InPlaceResourceOptimization". It would correctly combine original files into a single file and then additionally call the original file separately.

jeffkaufman commented 8 years ago

Moved blank js issue to #1325, since it turns out not to be related.

jeffkaufman commented 8 years ago

Took a packet capture of chrome talking to a "frozen" copy of this (pagespeed not running, just pagespeed's output) with tcpdump. I see:

c: request for html
s: html text
c: request for example1
s: example1 text
c: request for example2
c: request for js_defer
s: example2 text
s: js_defer text
c: request for example1
c: request for example2
s: example1 text
s: example2 text
c: request for example2
c: example2 text

(This matches what chrome reports in the network tab in developer tools.)

Note that the first requests for example1 and example2 go out before js_defer has even been requested, which suggests that labeling with type=psajs doesn't keep the preload scanner from fetching it.

Here's the packet capture (gzip manually disabled on the server): current3.txt

jeffkaufman commented 8 years ago

To test this, I made a simple page with no defer javascript at all:

<body>
<script src="example1.js" type="text/psajs" orig_index="0"></script>
<script src="example2.js" type="text/psajs" orig_index="1"></script>

Loading this with chrome, example1 and example2 still gets fetched, presumably by the preload scanner since they don't get run. WPT agrees.

packet capture: current4.txt

jeffkaufman commented 8 years ago

My current hypothesis is that we're getting triple loads because:

1) one load from the preload scanner 2) another load from defer-js's image-based prefetch 3) another load from defer-js, for real

jeffkaufman commented 8 years ago

To test this, I modified defer_js:

deferJsNs.DeferJs.prototype.attemptPrefetchOrQueue = function(url) {
//  if (this.isWebKit()) {                                                       
//    new Image().src = url;                                                     
//  }                                                                            
};

Now I see:

c: request html
s: html text
c: request example1
c: request example2
c: request js_defer
s: example1 text
s: example2 text
s: js_defer text
c: request example1
c: request example2
s: example1 text
s: example2 text

This is consistent with the hypothesis, since with image-based preloading available only did one load of each js file after running defer-js instead of two.

packet capture: no-image-prefetch.txt

jeffkaufman commented 8 years ago

What's not clear to me is why Chrome would make three requests for the same url, instead of re-using what's in its cache. Why would you want a preload scanner to throw away its results?

jeffkaufman commented 8 years ago

I'm also not sure whether we want to thwart the preload scanner here. The reason to thwart it is that because we're deferring the execution of the js the browser is wrong to heavily prioritize loading it since it won't actually be blocking things. The reason not to thwart it is that simply loading the js isn't that bad, we want to preload it somehow anyway, and for page loading the important thing is keeping the incoming pipe full.

jeffkaufman commented 8 years ago

I had thought that maybe the preload scanner results getting re-fetched was because we were using different headers, but diffing them they are both exactly:

Accept:*/*
Accept-Encoding:gzip, deflate, sdch
Accept-Language:en-US,en;q=0.8,es;q=0.6
Cache-Control:no-cache
Connection:keep-alive
Host:www.trycontra.com
Pragma:no-cache
Referer:http://www.trycontra.com/djs-frozen.html
User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36
jeffkaufman commented 8 years ago

I want to get a simple repro case for the preload scanner so I can talk to chromium and figure out if it's a bug, but my first try failed to trigger a duplicate load:

<script id=a type=invalid src=example.js></script>
<script id=b>
initial_script = document.getElementById("a");
s = document.createElement("script");
s.setAttribute("src", "example.js");
s.setAttribute("type", "text/javascript");

this_script = document.getElementById("b");
this_script.parentNode.insertBefore(s, this_script);
</script>

capture: repro1.txt

jeffkaufman commented 8 years ago

(@crowell just checked, and firefox doesn't see either kind of duplicate load)

jeffkaufman commented 8 years ago

The preload scanner duplicate load doesn't happen in safari.

packet capture: safari4.txt

jeffkaufman commented 8 years ago

It looks like two things are going on:

1) Our image-based preloading doesn't work any more, and we should remove it 2) Even with (1) fixed, we have a duplicate load on Chrome (but not Safari or Firefox) where it looks like the preload scanner is ignoring what it already has when the type of the script tag is different.

We can easily fix (1) ourselves. I've posted on net-dev@chromium to figure out if (2) is something we can get fixed in Chrome: https://groups.google.com/a/chromium.org/forum/#!topic/net-dev/PpITTXI7VVs

jeffkaufman commented 8 years ago

As requested on net-dev, filed a crbug: https://bugs.chromium.org/p/chromium/issues/detail?id=623109

jeffkaufman commented 8 years ago

In debugging this with the Chromium folks they pointed out that these scripts should be loaded from cache, and the extra network loads only happen when you disable the cache. The cache is disabled if you use Ctrl+Shift+R or if you have devtools open with Disable Cache checked, and I had both of these.

As long as you're running normally, without disabling the cache, this shouldn't cause any additional network loads in practice. My guess at this point is that all the results that did show network loads were results run with Ctrl+Shift+R.

jeffkaufman commented 8 years ago

We could "fix" this by using <pagespeed-script ...> instead of <script type=text/psajs ...> but I'm pretty sure this would actually be slower in practice because it thwarts the preload scanner.

jeffkaufman commented 8 years ago

Planning to start adding <link rel=preload> to replace the image-based preload hack. We don't need this yet, because currently the webkit preload scanner does find and load <script type=text/psajs>, but following the chromium bug it looks like they're going to make the preload scanner stop handling these.

vanushwashere commented 4 years ago

Hi, I have the same issue, when we can get the update?