macbre / phantomas

Headless Chromium-based web performance metrics collector and monitoring tool
https://www.npmjs.com/package/phantomas
BSD 2-Clause "Simplified" License

GZIP compression and response size reporting issues #137

Open macbre opened 10 years ago

macbre commented 10 years ago

See https://github.com/ariya/phantomjs/issues/10156 and https://github.com/ariya/phantomjs/issues/10169

macbre commented 10 years ago

HTML:

{
"url":"http://elecena.pl/",
"method":"GET",
"bodySize":10734,
"Content-Encoding":"gzip"
}
$ curl "http://elecena.pl/" --compress | wc -c
10734

JS:

{
"url":"http://cdn.macbre.net/elecena/r517/package/bootstrap,elecena,ui,suggest,ads,home.js",
"method":"GET",
"bodySize":988
}
$ curl "http://cdn.macbre.net/elecena/r517/package/bootstrap,elecena,ui,suggest,ads,home.js" --compress  | wc -c
29300

Image (the same domain):

{
"url":"http://beta.elecena/r517/skins/elecena/img/header.jpg",
"method":"GET",
"bodySize":9745
}
$ curl "http://beta.elecena/r517/skins/elecena/img/header.jpg" --compress  | wc -c
9745

Image (different domain):

{
"url":"http://cdn.macbre.net/elecena/r517/skins/elecena/img/header.jpg",
"method":"GET",
"bodySize":1044
}

macbre commented 10 years ago

Use HTTP proxy for running phantomas? https://github.com/nodejitsu/node-http-proxy/tree/caronte

patramsey commented 10 years ago

I like the idea of using the HTTP proxy to get the correct bodySize when the response is gzipped. Does the response object in the proxy have that information?

macbre commented 10 years ago

I don't think so. This would need to be a custom proxy that will add that kind of information.

gurdenbatra commented 10 years ago

Hey @macbre, just wanted to check in with the progress on this issue? Can I offer you my help?

macbre commented 10 years ago

No progress on this one, unfortunately. Using a proxy in this case would be a great example of overengineering :) given that PhantomJS v2.0 is just around the corner.

@gurdenbatra, do you have a different solution on how to fix this issue?

gurdenbatra commented 10 years ago

@macbre Do you know if I can at least accurately measure the differential of bodySize? As in, not look at the accuracy of the value but just monitor the rate of change?

I do not have a solution in mind, as the problem at hand is not completely clear to me. I'd love to help if you could point me in the right direction. Thanks.

macbre commented 10 years ago

@gurdenbatra, unfortunately, as my tests above show, bodySize is not deterministic at all :)

macbre commented 10 years ago

https://github.com/ariya/phantomjs/issues?milestone=12&state=open - PhantomJS v2.0 is 15 days behind the deadline. I'm considering a workaround in phantomas... Something like a "light" HTTP proxy that will add X-Phantomas-... headers with required data (content encoding, content length, uncompressed size, etc.)
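
A rough sketch of what such a proxy could compute for each response, assuming Node.js and its built-in `zlib` module. The `X-Phantomas-*` header names and the `buildSizeHeaders` helper are hypothetical (the thread never settles on names), and only gzip is handled here:

```javascript
// Sketch only: the X-Phantomas-* header names below are hypothetical.
const zlib = require('zlib');

// upstreamHeaders: response headers with lower-cased keys (as Node's http module provides them)
// rawBody: Buffer with the body bytes exactly as received on the wire
function buildSizeHeaders(upstreamHeaders, rawBody) {
  const encoding = (upstreamHeaders['content-encoding'] || 'identity').toLowerCase();

  // For gzip, decompress once to learn the uncompressed size;
  // any other encoding is reported as-is in this sketch.
  let uncompressedSize = rawBody.length;
  if (encoding === 'gzip') {
    uncompressedSize = zlib.gunzipSync(rawBody).length;
  }

  return {
    'X-Phantomas-Content-Encoding': encoding,
    'X-Phantomas-Content-Length': String(rawBody.length),
    'X-Phantomas-Uncompressed-Size': String(uncompressedSize),
  };
}

// Example: a gzipped payload, as an upstream server might send it
const body = zlib.gzipSync(Buffer.from('a'.repeat(10000)));
const extra = buildSizeHeaders({ 'content-encoding': 'gzip' }, body);
console.log(extra);
```

A real proxy would still need to buffer the upstream body before forwarding it, which is part of why this approach felt heavy.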

macbre commented 10 years ago

Will re-check when #313 is implemented

macbre commented 10 years ago

Still valid (v1.3.0):

phantomas "http://elecena.pl/" -v --block-domain googleapis.com,google-analytics.com,googletagservices.com --runs 5
.-----------------------------------------------------------------------------------------------------------.
| Report from 5 run(s) for <http://elecena.pl/> using phantomas v1.3.0                                      |
|-----------------------------------------------------------------------------------------------------------|
|             Metric             |     min      |     max      |   average    |    median    |    stddev    |
|--------------------------------|--------------|--------------|--------------|--------------|--------------|
| requests                       |           10 |           10 |           10 |           10 |            0 |
| gzipRequests                   |            1 |            1 |            1 |            1 |            0 |
| bodySize                       |        30313 |        44493 |      38537.2 |        41656 |      5259.88 |
| contentLength                  |      1430746 |      1433582 |    1432163.8 |      1432164 |       896.82 |
| ajaxRequests                   |            0 |            0 |            0 |            0 |            0 |
| htmlCount                      |            1 |            1 |            1 |            1 |            0 |
| htmlSize                       |        10907 |        10908 |      10907.8 |        10908 |          0.4 |
| cssCount                       |            1 |            1 |            1 |            1 |            0 |
| cssSize                        |         1023 |         3859 |       2724.6 |         2441 |      1061.13 |
| jsCount                        |            1 |            1 |            1 |            1 |            0 |
| jsSize                         |         2427 |         5263 |       3561.4 |         3845 |      1061.13 |
| jsonCount                      |            0 |            0 |            0 |            0 |            0 |
| jsonSize                       |            0 |            0 |            0 |            0 |            0 |
| imageCount                     |            7 |            7 |            7 |            7 |            0 |
| imageSize                      |      1414970 |      1414970 |      1414970 |      1414970 |            0 |

stddev for sizes should be zero
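
To make that check concrete, here is a minimal sketch that flags metrics whose value changes between runs. The `stddev` and `unstableMetrics` helpers are hypothetical, not phantomas code. The per-run `htmlSize` values are reconstructed from the min, max and average in the report (one run of 10907 and four of 10908, order unknown); the population form of the standard deviation gives roughly the 0.4 shown in that row:

```javascript
// Population standard deviation of a list of numbers
function stddev(values) {
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  return Math.sqrt(variance);
}

// Names of the metrics whose value differs between runs
function unstableMetrics(runs, metricNames) {
  return metricNames.filter((name) => stddev(runs.map((run) => run[name])) > 0);
}

// Per-run values reconstructed from the report's min/max/average (run order is a guess)
const runs = [
  { htmlSize: 10908, imageSize: 1414970 },
  { htmlSize: 10908, imageSize: 1414970 },
  { htmlSize: 10907, imageSize: 1414970 },
  { htmlSize: 10908, imageSize: 1414970 },
  { htmlSize: 10908, imageSize: 1414970 },
];

console.log(stddev(runs.map((run) => run.htmlSize)));       // roughly 0.4
console.log(unstableMetrics(runs, ['htmlSize', 'imageSize']));
```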

cvan commented 10 years ago

I have solved this on my own project, and oh boy it was a doozie. I will most certainly upstream a patch to phantomas.

macbre commented 10 years ago

Great :+1:

BTW, it works correctly under SlimerJS (introduced in phantomas v1.4.0)

gmetais commented 10 years ago

That's great news!!

macbre commented 10 years ago

Unfortunately, not much can be done on the phantomas side with this issue. The patch from @cvan is kind of hacky :)

Let's keep our fingers crossed for PhantomJS v2.0 (due by September 22, 2014) and use SlimerJS to run phantomas in the meantime.

cvan commented 10 years ago

In my testing, SlimerJS has the same issues

jcleveley-zz commented 9 years ago

Has anyone tested this issue on PhantomJS 2 (preview) ? https://github.com/bprodoehl/phantomjs/releases

macbre commented 9 years ago

@jcleveley, good point. The binaries from the GitHub repository you've mentioned do not run on my Debian box:

./phantomjs: error while loading shared libraries: libicudata.so.50: cannot open shared object file: No such file or directory

Meanwhile, I'm compiling the latest master from the PhantomJS repository on a droplet instance. Stay tuned :)

macbre commented 9 years ago

See #432 for a similar issue: content size reporting for gzipped content

soulgalore commented 9 years ago

hey @macbre, did phantomjs 2 work for you? I haven't tested it yet for sitespeed.io, but I'm planning on checking it out this weekend or early next week.

macbre commented 9 years ago

@soulgalore, just tested phantomas with PhantomJS2 binaries compiled in Oct 2014 and there's no improvement.

For http://code.jquery.com/jquery-2.1.1.js I'm getting the following data:

{"id":3,"url":"http://code.jquery.com/jquery-2.1.1.js","method":"GET","requestHeaders":{"Accept":"*/*","Referer":"http://localhost:8888/jquery-multiple.html","User-Agent":"phantomas/1.8.0 (PhantomJS/1.9.8; linux x64)"},"sendTime":"2014-12-06T11:04:01.175Z","bodySize":1695,"isBlocked":false,"protocol":"http","domain":"code.jquery.com","recvStartTime":"2014-12-06T11:04:01.494Z","timeToFirstByte":319,"recvEndTime":"2014-12-06T11:04:01.663Z","timeToLastByte":488,"receiveTime":169,"type":"js","headers":{"Date":"Sat, 06 Dec 2014 11:04:01 GMT","Content-Type":"application/x-javascript","Transfer-Encoding":"chunked","Connection":"keep-alive","Last-Modified":"Fri, 24 Oct 2014 00:16:07 GMT","Vary":"Accept-Encoding","ETag":"W/\"54499a47-3c637\"","Expires":"Thu, 31 Dec 2037 23:55:55 GMT","Cache-Control":"max-age=315360000, public","Server":"NetDNA-cache/2.2","X-Cache":"HIT","Content-Encoding":"gzip"},"contentType":"application/x-javascript","isJS":true,"gzip":true,"status":200,"statusText":"OK","contentLength":1695}
{"id":3,"url":"http://code.jquery.com/jquery-2.1.1.js","method":"GET","requestHeaders":{"User-Agent":"phantomas/1.8.0 (PhantomJS/1.9.8; linux x64)","Accept":"*/*","Referer":"http://localhost:8888/jquery-multiple.html"},"sendTime":"2014-12-06T11:05:40.010Z","bodySize":45018,"isBlocked":false,"protocol":"http","domain":"code.jquery.com","recvStartTime":"2014-12-06T11:05:40.241Z","timeToFirstByte":231,"recvEndTime":"2014-12-06T11:05:40.365Z","timeToLastByte":355,"receiveTime":124,"type":"js","headers":{"Date":"Sat, 06 Dec 2014 11:05:40 GMT","Content-Type":"application/x-javascript","Transfer-Encoding":"chunked","Connection":"keep-alive","Last-Modified":"Fri, 24 Oct 2014 00:16:07 GMT","Vary":"Accept-Encoding","ETag":"W/\"54499a47-3c637\"","Expires":"Thu, 31 Dec 2037 23:55:55 GMT","Cache-Control":"max-age=315360000, public","Server":"NetDNA-cache/2.2","X-Cache":"HIT","Content-Encoding":"gzip"},"contentType":"application/x-javascript","isJS":true,"gzip":true,"status":200,"statusText":"OK","contentLength":45018}
{"id":3,"url":"http://code.jquery.com/jquery-2.1.1.js","method":"GET","requestHeaders":{"Host":"code.jquery.com","User-Agent":"phantomas/1.8.0 (SlimerJS/0.9.2; linux x64)","Accept":"*/*","Accept-Language":"en-US,en;q=0.5","Accept-Encoding":"gzip, deflate","Referer":"http://localhost:8888/jquery-multiple.html"},"sendTime":"2014-12-06T11:15:24.026Z","bodySize":247351,"isBlocked":false,"protocol":"http","domain":"code.jquery.com","recvStartTime":"2014-12-06T11:15:24.201Z","timeToFirstByte":175,"recvEndTime":"2014-12-06T11:15:24.366Z","timeToLastByte":340,"receiveTime":165,"type":"js","headers":{"Date":"Sat, 06 Dec 2014 11:15:24 GMT","Content-Type":"application/x-javascript","Transfer-Encoding":"chunked","Connection":"keep-alive","Last-Modified":"Fri, 24 Oct 2014 00:16:07 GMT","Vary":"Accept-Encoding","Etag":"W/\"54499a47-3c637\"","Expires":"Thu, 31 Dec 2037 23:55:55 GMT","Cache-Control":"max-age=315360000, public","Server":"NetDNA-cache/2.2","X-Cache":"HIT","Content-Encoding":"gzip"},"contentType":"application/x-javascript","isJS":true,"gzip":true,"status":200,"statusText":"OK","contentLength":247351}

Chrome reports the gzipped content to weigh ~88 kB, and 242 kB uncompressed. The numbers above are not even near the real values ;)
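
One cross-check on these numbers, assuming the CDN emits nginx-style ETags (hex modification time, a dash, then hex content length in bytes). That format is an assumption the thread does not confirm, though the decoded timestamp does line up with the `Last-Modified` header in the responses above. The `parseNginxEtag` helper is hypothetical:

```javascript
// Decode an ETag of the assumed form "<hex mtime>-<hex length>", optionally weak (W/ prefix)
function parseNginxEtag(etag) {
  const match = /^(?:W\/)?"([0-9a-f]+)-([0-9a-f]+)"$/i.exec(etag);
  if (!match) {
    return null; // not in the assumed format
  }
  return {
    lastModified: new Date(parseInt(match[1], 16) * 1000),
    contentLength: parseInt(match[2], 16),
  };
}

const info = parseNginxEtag('W/"54499a47-3c637"');
console.log(info.contentLength);              // 247351
console.log(info.lastModified.toUTCString()); // Fri, 24 Oct 2014 00:16:07 GMT
```

If that reading is right, the file is 247351 bytes uncompressed (about 241.6 kB), which is the `bodySize` SlimerJS reported and close to Chrome's 242 kB, while both PhantomJS values (1695 and 45018) match neither the compressed nor the uncompressed size.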

soulgalore commented 9 years ago

hmm ok, thanks for the bad news :)

fancyoung commented 9 years ago

In PhantomJS v2.0, I loop over the response headers to find Content-Length and use it as the size instead of bodySize.

Something like:

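  // Return the Content-Length header value as a number, or undefined if it is absent
  function getRealSize(response) {
    var header;
    for (var i = 0; i < response.headers.length; i++) {
      header = response.headers[i];
      // HTTP header names are case-insensitive
      if (header.name.toLowerCase() === 'content-length') {
        return parseInt(header.value, 10);
      }
    }
  }

  // Fall back to bodySize when the header is missing
  var size = getRealSize(response) || response.bodySize;

macbre commented 9 years ago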

@fancyoung, PhantomJS v1.x does not provide Content-Length for gzip encoded content. Is it more reliable in PhantomJS 2.0?

fancyoung commented 9 years ago

@macbre I use

page.customHeaders = {
    'Accept-Encoding': 'gzip;q=0'
};

to disable gzip, and it works for me.

djberriman commented 9 years ago

See the patch in https://github.com/ariya/phantomjs/issues/10156, which ensures Content-Length is correctly returned when it is present. You still have to disable gzip, as the Content-Length header appears to be lost by Qt.

macbre commented 8 years ago

No good news here:

phantomas using PhantomJS/2.1.1:

15:07:47.826 contentLength missing: {"url":"http://cdn.macbre.net/elecena/r535/package/fonts,bootstrap,elecena,ui,suggest,home.css","bodySize":13463}

Offenders for gzipRequests (3):
 * http://elecena.pl/ (gzip: 7.15 kB / uncompressed: 7.15 kB)
 * http://cdn.macbre.net/elecena/r535/package/fonts,bootstrap,elecena,ui,suggest,home.css (gzip: 13.15 kB / uncompressed: 13.15 kB)
 * http://cdn.macbre.net/elecena/r535/package/bootstrap,elecena,ui,suggest,home.js (gzip: 4.81 kB / uncompressed: 4.81 kB)

curl:

$ curl 'http://cdn.macbre.net/elecena/r535/package/fonts,bootstrap,elecena,ui,suggest,home.css' --compress -svo /dev/null 2>&1 | grep Content
< Content-Encoding: gzip
< Content-Length: 30702

$ curl 'http://cdn.macbre.net/elecena/r535/package/fonts,bootstrap,elecena,ui,suggest,home.css' -svo /dev/null 2>&1 | grep Content
< Content-Length: 149586

PhantomJS stats are still not reliable.

However, when using SlimerJS we do get correct stats:

Offenders for gzipRequests (3):
 * http://elecena.pl/ (gzip: 11.61 kB / uncompressed: 11.61 kB)
 * http://cdn.macbre.net/elecena/r535/package/bootstrap,elecena,ui,suggest,home.js (gzip: 9.48 kB / uncompressed: 28.24 kB)
 * http://cdn.macbre.net/elecena/r535/package/fonts,bootstrap,elecena,ui,suggest,home.css (gzip: 29.98 kB / uncompressed: 146.08 kB)
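
The SlimerJS figures for the CSS file can be checked against the curl `Content-Length` values above. This small sketch assumes the report uses 1 kB = 1024 bytes rounded to two decimals (inferred from the numbers, not confirmed from the phantomas source); `toKB` is a hypothetical helper:

```javascript
// Format a byte count the way the offenders list appears to (1 kB = 1024 bytes, two decimals)
function toKB(bytes) {
  return (bytes / 1024).toFixed(2);
}

console.log(toKB(30702));  // "29.98"  curl, gzipped     -> SlimerJS "gzip: 29.98 kB"
console.log(toKB(149586)); // "146.08" curl, uncompressed -> SlimerJS "uncompressed: 146.08 kB"
console.log(toKB(13463));  // "13.15"  PhantomJS bodySize, which matches neither
```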

macbre commented 8 years ago

See #614