matomo-org / plugin-QueuedTracking

Scale your large traffic Matomo service by queuing tracking requests (in Redis or MySQL) for better performance.
https://matomo.org
GNU General Public License v3.0
82 stars 35 forks source link

The HTTP Content-Length header being sent is wrong #45

Closed ChristianGfK closed 6 years ago

ChristianGfK commented 8 years ago

The Content-Length header being sent here (C) does not seem to line up with what's actually being sent, resulting in endless messages of this kind on a front-end load balancer (A):

2016/09/28 14:10:57 [warn] 29472#29472: *92625402 upstream (B) sent more data than specified (C) in "Content-Length" header while reading upstream

Removing that line from Response.php (C) makes the backend web server (B) use chunked transfer encoding to send PHP's output to (A) and everything is fine.

I'm not sure if chunked is worse than non-chunked with the correct content length, but I'm fairly sure it's better than sending the wrong content-length.

*) Setup: nginx (A) --> nginx (B) + php-fpm (C), nothing special.

mattab commented 8 years ago

Thanks for the report! A pull request would be very welcome :+1:

ChristianGfK commented 8 years ago

Sadly, I have no idea what that line really does or why low-level stuff like content-length is being handled manually here. I'm not well-informed enough to suggest a proper fix. :zipper_mouth_face:

mattab commented 8 years ago

No problem @ChristianGfK - as we are quite busy and low resourced, we likely won't work on this for a while. if it's a priority for you, feel free to sponsor the development of it by contacting us

MichaelRoosz commented 7 years ago

@ChristianGfK Can you please post such an invalid answer? Knowing what the extra bytes are would help a lot with finding the bug.

MichaelRoosz commented 7 years ago

@ChristianGfK Please check your "output_buffering" php.ini setting (for example with phpinfo()), It should be activated (example: output_buffering=4096). If it is off, this may explain your problem.

ChristianGfK commented 7 years ago

@MichaelHeerklotz I can confirm that output_buffering is set to 4096. I'll see about grabbing such a response, but at the moment I can't remember how I came to that conclusion 8 months ago... :thinking:

ChristianGfK commented 7 years ago

One more artifact I just observed: The workaround (which results in Transfer-Encoding: chunked) I suggested in my initial report makes it clear that the script actually delivers 2 1x1 px GIFs concatenated:

0000   48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b 0d  HTTP/1.1 200 OK.
0010   0a 53 65 72 76 65 72 3a 20 6e 67 69 6e 78 0d 0a  .Server: nginx..
0020   44 61 74 65 3a 20 4d 6f 6e 2c 20 32 39 20 4d 61  Date: Mon, 29 Ma
0030   79 20 32 30 31 37 20 31 35 3a 32 39 3a 34 37 20  y 2017 15:29:47 
0040   47 4d 54 0d 0a 43 6f 6e 74 65 6e 74 2d 54 79 70  GMT..Content-Typ
0050   65 3a 20 69 6d 61 67 65 2f 67 69 66 0d 0a 54 72  e: image/gif..Tr
0060   61 6e 73 66 65 72 2d 45 6e 63 6f 64 69 6e 67 3a  ansfer-Encoding:
0070   20 63 68 75 6e 6b 65 64 0d 0a 43 6f 6e 6e 65 63   chunked..Connec
0080   74 69 6f 6e 3a 20 63 6c 6f 73 65 0d 0a 43 6f 6e  tion: close..Con
0090   74 65 6e 74 2d 45 6e 63 6f 64 69 6e 67 3a 20 6e  tent-Encoding: n
00a0   6f 6e 65 0d 0a 0d 0a 35 36 0d 0a 47 49 46 38 39  one....56..GIF89
00b0   61 01 00 01 00 80 00 00 00 00 00 00 00 00 21 f9  a.............!.
00c0   04 01 00 00 00 00 2c 00 00 00 00 01 00 01 00 00  ......,.........
00d0   02 02 44 01 00 3b 47 49 46 38 39 61 01 00 01 00  ..D..;GIF89a....
00e0   80 00 00 00 00 00 00 00 00 21 f9 04 01 00 00 00  .........!......
00f0   00 2c 00 00 00 00 01 00 01 00 00 02 02 44 01 00  .,...........D..
0100   3b 0d 0a 30 0d 0a 0d 0a                          ;..0....

In regular mode (i.e. with Content-Length header), this leads to the observed mismatch. It reports Content-Length for 1 GIF, but actually delivers 2 GIFs.

frontend access log
[29/May/2017:18:49:34 +0200] 1.2.3.4 - "GET /stats/piwik.php?action_name=xxx&idsite=1&send_image=1 HTTP/1.1" Status: 200 - Time: 0.069 - Bytes: 86

frontend error log
2017/05/29 18:49:34 [warn] 26697#26697: *10777027 upstream sent more data than specified in "Content-Length" header while reading upstream, client: 1.2.3.4, server: www.xxx.com, request
: "GET /stats/piwik.php?action_name=xxx&idsite=1&send_image=1 HTTP/1.1", upstream: "http://5.6.7.8:80/stats/piwik.php?action_name=xxx&idsite=1&send_image=1", host: "www.xxx.com"

backend access log
[29/May/2017:18:49:34 +0200] 213.144.18.1 - "GET /stats/piwik.php?action_name=xxx&idsite=1&send_image=1 HTTP/1.1" Status: 200 - Time: 0.068 - Bytes: 86

0000   48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b 0d  HTTP/1.1 200 OK.
0010   0a 53 65 72 76 65 72 3a 20 6e 67 69 6e 78 0d 0a  .Server: nginx..
0020   44 61 74 65 3a 20 4d 6f 6e 2c 20 32 39 20 4d 61  Date: Mon, 29 Ma
0030   79 20 32 30 31 37 20 31 36 3a 35 35 3a 31 33 20  y 2017 16:55:13 
0040   47 4d 54 0d 0a 43 6f 6e 74 65 6e 74 2d 54 79 70  GMT..Content-Typ
0050   65 3a 20 69 6d 61 67 65 2f 67 69 66 0d 0a 43 6f  e: image/gif..Co
0060   6e 74 65 6e 74 2d 4c 65 6e 67 74 68 3a 20 34 33  ntent-Length: 43
0070   0d 0a 43 6f 6e 6e 65 63 74 69 6f 6e 3a 20 63 6c  ..Connection: cl
0080   6f 73 65 0d 0a 43 6f 6e 74 65 6e 74 2d 45 6e 63  ose..Content-Enc
0090   6f 64 69 6e 67 3a 20 6e 6f 6e 65 0d 0a 0d 0a 47  oding: none....G
00a0   49 46 38 39 61 01 00 01 00 80 00 00 00 00 00 00  IF89a...........
00b0   00 00 21 f9 04 01 00 00 00 00 2c 00 00 00 00 01  ..!.......,.....
00c0   00 01 00 00 02 02 44 01 00 3b 47 49 46 38 39 61  ......D..;GIF89a
00d0   01 00 01 00 80 00 00 00 00 00 00 00 00 21 f9 04  .............!..
00e0   01 00 00 00 00 2c 00 00 00 00 01 00 01 00 00 02  .....,..........
00f0   02 44 01 00 3b                                   .D..;

So the Content-Length header would be correct, if the script wasn't delivering 2 GIFs instead of just 1! Weird!

tsteur commented 7 years ago

I tried to reproduce this but couldn't. output_buffering is set to 4096 as well

ChristianGfK commented 7 years ago

Please keep in mind that this report was initially filed against version 2.16.2 (current at the time). My most recent comment (the one showing it's sending 2 GIFs instead of 1) was based on 2.17.1. I can't easily upgrade to 3.x, so I can't confirm if it still happens there. Considering the source line I linked to in my initial report doesn't seem to be dealing with GIFs at all, I'm inclined to guess that the issue lies somewhere else and is just triggered by this.

I'll make sure to check back in when I can make the necessary changes to upgrade to 3.x. :date:

MichaelRoosz commented 6 years ago

I think I found the problem. When I ran the QueuedTracking tests without curl (so PHP http streams were used), it failed because of the gif getting sent twice.

So I took a look at the call stack for "Response->outputResponse()":

Piwik\Plugins\QueuedTracking\Tracker\Handler->sendResponseNow() ->$response->outputResponse($tracker);

Piwik\Tracker\Handler->finish() ->$this->response->outputResponse($tracker);

So it seems that when "Process during tracking request" is enabled, the gif is sent twice. To avoid this I created a core patch: https://github.com/piwik/piwik/pull/12252

ChristianGfK commented 6 years ago

Awesome! 👍