crowell / modpagespeed_tmp

Automatically exported from code.google.com/p/modpagespeed
Apache License 2.0

mod_pagespeed strips custom HTML headers #367

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
A user has this in his .conf file:
   Header set ServerID  "XXXX"
He reports that the ServerID header is stripped by mod_pagespeed on serving.  
Looking at the code, I think the report is accurate.

In apache/mod_instaweb.cc, function rewrite_html, we have this block of code:

  if (!context->sent_headers()) {
    ResponseHeaders* headers = context->response_headers();
    apr_table_clear(request->headers_out);
    AddResponseHeadersToRequest(*headers, request);
    headers->Clear();
    context->set_sent_headers(true);
  }

That call to apr_table_clear is the culprit.  I think instead we need to 
rewrite the existing headers, replacing only the ones that we need to override.

Original issue reported on code.google.com by jmara...@google.com on 4 Jan 2012 at 3:57
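
A minimal sketch of the "replace only the ones that we need to override" idea. It assumes a plain std::map as a stand-in for request->headers_out (a real apr_table_t is case-insensitive and can hold repeated keys, which is ignored here), and OverrideHeaders is a hypothetical helper, not a function in mod_pagespeed:

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for Apache's request->headers_out (assumption: one value per name).
using HeaderMap = std::map<std::string, std::string>;

// Hypothetical helper: overwrite only the headers that were computed by the
// module and leave every other entry (for example ServerID) untouched.
void OverrideHeaders(const HeaderMap& computed, HeaderMap* headers_out) {
  assert(headers_out != nullptr);
  for (const auto& entry : computed) {
    (*headers_out)[entry.first] = entry.second;
  }
}
```

With headers_out holding ServerID and a Cache-Control value, and computed holding only a new Cache-Control value, the call leaves ServerID in place and replaces Cache-Control.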

GoogleCodeExporter commented 9 years ago

Original comment by jmara...@google.com on 4 Jan 2012 at 3:57

GoogleCodeExporter commented 9 years ago
On Wed, Jan 4, 2012 at 11:06 AM, Jimy Johny <jimyjohny@gmail.com> wrote:
  Neither WordPress nor .htaccess sets any header information. It's added
  directly to the virtualhost configuration. I was also under the assumption
  that mod_headers runs after mod_pagespeed.  In fact it works for images and
  CSS while it fails for HTML.

  Is this a simple issue which can be fixed very soon?  Or is there any way
  I could change the order of Apache module processing?

You can't change the order except by source-code changes.  If you are willing 
to edit C++ code and build yourself, you can do something right now as 
suggested in the thread & I could help you.

If you can update from trunk we could probably fix it this week, assuming we 
can reproduce it.

If you need a binary release it will take longer, as we don't have a release 
scheduled right now.

Original comment by jmara...@google.com on 4 Jan 2012 at 4:25

GoogleCodeExporter commented 9 years ago
I can update from trunk and test it if it's available by next week. At the same 
time, I'd love to edit the code and try it on my side if you can help. 
Just let me know what change needs to be applied.

Original comment by jimyjo...@gmail.com on 4 Jan 2012 at 4:33

GoogleCodeExporter commented 9 years ago
Please try commenting out the call to apr_table_clear in 
net/instaweb/apache/mod_instaweb.cc, function rewrite_html().

Original comment by jmara...@google.com on 4 Jan 2012 at 4:41

GoogleCodeExporter commented 9 years ago
I tried this, but it didn't fix the issue. Duplicate headers were added:

Date    Wed, 04 Jan 2012 17:13:31 GMT
Server  Apache
X-Powered-By    PHP/5.2.17, PHP/5.2.17
X-Pingback  http://www.xxxx.com/xmlrpc.php, http://www.xxxx.com/xmlrpc.php
Content-Type    text/html; charset=UTF-8
X-Mod-Pagespeed 0.10.19.5-1298, 0.10.19.5-1298
Cache-Control   max-age=0, no-cache, no-store, max-age=0, no-cache, no-store
Vary    Accept-Encoding
Content-Encoding    gzip
Keep-Alive  timeout=1, max=100
Connection  Keep-Alive
Transfer-Encoding   chunked

Original comment by jimyjo...@gmail.com on 4 Jan 2012 at 5:17
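
The doubled values above are what one would expect if the captured headers are appended to a table that already holds them. The sketch below only illustrates that effect; HeaderList, AppendAll and JoinedValue are hypothetical names, and it is an assumption that the real code path appends rather than sets:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified header list that allows repeated names.
using HeaderList = std::vector<std::pair<std::string, std::string>>;

// Appends every captured header without clearing the destination first.
void AppendAll(const HeaderList& captured, HeaderList* out) {
  assert(out != nullptr);
  out->insert(out->end(), captured.begin(), captured.end());
}

// Joins all values stored under `name` with ", ", which is how repeated
// headers are commonly displayed.
std::string JoinedValue(const HeaderList& headers, const std::string& name) {
  std::string joined;
  for (const auto& header : headers) {
    if (header.first != name) continue;
    if (!joined.empty()) joined += ", ";
    joined += header.second;
  }
  return joined;
}
```

Appending a captured copy of X-Powered-By to a list that already contains it makes JoinedValue return the value twice, matching the shape of the dump.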

GoogleCodeExporter commented 9 years ago
Thanks for giving this a try.  I can certainly understand why we are doubling 
some of the headers, but I can't think of why we are losing the ServerID.  We 
certainly don't look for that particular header.

I have another thing for you to try.  Go ahead and restore the apr_table_clear 
call.  I don't think that was the problem after all.  Instead try commenting out 
the entire contents of the function DisableDownstreamHeaderFilters in 
net/instaweb/apache/header_util.cc.

This is more likely the culprit.  The reason that we have this call is that we 
are trying to automatically set the cache lifetimes of mod_pagespeed-rewritten 
HTML & resources and we were running into trouble with configuration files 
overriding our settings, eliminating some of the benefit of using mod_pagespeed.

I'm not entirely sure how to work around this in a way that allows 
non-caching-related headers to be added to HTML resources from the .conf file.  
We might need to do this using a mod_pagespeed option to leave HTML headers 
alone.

We happen to be working on such an option already.  But in the meantime please 
give this a try.

Original comment by jmara...@google.com on 4 Jan 2012 at 6:07

GoogleCodeExporter commented 9 years ago
Issue fixed by commenting out DisableDownstreamHeaderFilters. Is this going to 
reduce performance?

Original comment by jimyjo...@gmail.com on 5 Jan 2012 at 4:08

GoogleCodeExporter commented 9 years ago
The benefit provided by DisableDownstreamHeaderFilters is that it allows 
mod_pagespeed to determine caching headers for rewritten files, overriding the 
settings added using "Header Set" and "Expires".  Those settings are needed to 
allow the site owner to control the cache lifetime of origin resources, but we 
don't want them to apply to rewritten resources.

So when you comment out this function, mod_pagespeed's caching control might 
not work.

The proper fix for this is, I think, for mod_pagespeed to offer a 
pagespeed.conf setting which will apply only to HTML files.  If you turn this 
new feature on, mod_pagespeed will not enforce any particular caching policy 
for HTML, and the responsibility for setting appropriate HTML caching headers 
will lie with the site owner.  Because of the way mod_pagespeed extends cache 
lifetime for resources using a content hash in a resource URL, a long HTML 
cache lifetime could result in stale resources (typically CSS files).  This is 
why mod_pagespeed currently takes a conservative approach and disables caching 
of HTML -- to avoid stale cache-extended resources.

Hope that made sense!

Original comment by jmara...@google.com on 5 Jan 2012 at 4:51
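
To make the staleness argument concrete, here is a small sketch of cache extension by content hash. The URL layout and the HashedUrl helper are assumptions for illustration and do not reproduce mod_pagespeed's real naming scheme:

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical naming scheme: embed a hash of the content in the URL so the
// resource itself can be served with a very long cache lifetime.
std::string HashedUrl(const std::string& base, const std::string& content) {
  assert(!base.empty());
  const std::size_t hash = std::hash<std::string>{}(content);
  return base + "." + std::to_string(hash) + ".css";
}
```

When the CSS changes, HashedUrl yields a different URL, but browsers only learn the new URL by re-fetching the HTML. An HTML page cached for a long time keeps pointing at the old hashed URL, which is the stale-resource case described above.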

GoogleCodeExporter commented 9 years ago
That sounds good. Maybe I should wait for the proper fix.

Original comment by jimyjo...@gmail.com on 5 Jan 2012 at 5:41

GoogleCodeExporter commented 9 years ago
I have another thought about a proper fix that might work without requiring a 
new mod_pagespeed option.

For HTML, rather than removing the mod_headers and mod_expires filters, we 
should install a very late fixup handler to put in the caching semantics we've 
computed for HTML.

For resources, we can leave the existing flow as is because we are capturing 
any custom headers when we fetch the origin resource.

Original comment by jmara...@google.com on 6 Jan 2012 at 2:52
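
A rough model of the "very late fixup" idea, assuming header filters can be treated as functions applied in order to one header map. HeaderFilter, SetHeader and LateCachingFixup are hypothetical names; the real change hooks into Apache's filter chain rather than a std::vector:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using HeaderMap = std::map<std::string, std::string>;
using HeaderFilter = std::function<void(HeaderMap*)>;

// Stand-in for a "Header set" directive handled by mod_headers.
HeaderFilter SetHeader(std::string name, std::string value) {
  return [name, value](HeaderMap* headers) { (*headers)[name] = value; };
}

// Hypothetical late fixup: touches only Cache-Control, nothing else.
HeaderFilter LateCachingFixup(std::string cache_control) {
  return SetHeader("Cache-Control", std::move(cache_control));
}

// Runs filters in order; later filters can override earlier results.
void RunFilters(const std::vector<HeaderFilter>& filters, HeaderMap* headers) {
  assert(headers != nullptr);
  for (const auto& filter : filters) filter(headers);
}
```

Because the fixup runs last and writes a single header, a ServerID set by an earlier filter survives while the computed caching value still wins.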

GoogleCodeExporter commented 9 years ago

Original comment by jmara...@google.com on 6 Jan 2012 at 6:07

GoogleCodeExporter commented 9 years ago
OK that last idea worked, and is now committed to trunk.  If you prefer to work 
from a release branch, you can try to patch in the changes attached to the bug 
report.

Original comment by jmara...@google.com on 6 Jan 2012 at 9:48

Attachments:

GoogleCodeExporter commented 9 years ago
Thank you for the fix.

I'll try this. In the meantime, I tried something different. After disabling 
the function, I noticed that it's also able to add Cache-Control headers. Since 
that worked, I tried hosting the HTML via a CDN that caches it for 5 minutes. 
After 5 minutes, the request comes back to my server, which can complete the 
pagespeed processing. Will I get the same functionality with the 
Cache-Control header after I apply this patch?

Original comment by jimyjo...@gmail.com on 7 Jan 2012 at 2:39

GoogleCodeExporter commented 9 years ago
The latest release (0.10.21.2) added a new directive 
ModPagespeedModifyCachingHeaders.
http://code.google.com/speed/page-speed/docs/install.html#ModifyCachingHeaders

Would this address your requirements?

Original comment by matterb...@google.com on 2 Mar 2012 at 3:37

GoogleCodeExporter commented 9 years ago
It provides a solution to an extent. I'm using a proxy as the frontend, so it 
seems I have to set the cache header on all backend servers when the new option 
is enabled. The earlier fix, applied by editing the code, allowed me to add 
the header on the frontend server where pagespeed is installed.

Original comment by jimyjo...@gmail.com on 2 Mar 2012 at 4:52

GoogleCodeExporter commented 9 years ago
One final note:  There is a new option in the most recent binary release that 
provides the behavior you want, explicit control of your HTML caching headers:
   ModPagespeedModifyCachingHeaders off
http://code.google.com/speed/page-speed/docs/install.html#ModifyCachingHeaders

Original comment by jmara...@google.com on 2 Mar 2012 at 10:21

GoogleCodeExporter commented 9 years ago

Original comment by jmara...@google.com on 22 May 2012 at 7:59

GoogleCodeExporter commented 9 years ago
It seems this issue is back with the latest release (0.10.22.4-1648). I 
tried to insert a custom header via the proxy, but it didn't work. This worked 
in an older release (0.10.21.2-1527) with DisableDownstreamHeaderFilters 
commented out, but in the new release that workaround doesn't work at all.

Original comment by jimyjo...@gmail.com on 28 Aug 2012 at 4:31

GoogleCodeExporter commented 9 years ago
Hi -- and sorry for the re-occurrence of this.  Actually I think we never 
really fixed it for you.  Your fix to comment out 
DisableDownstreamHeaderFilters was not incorporated in the new option 
"ModPagespeedModifyCachingHeaders".  Is that what you tried with 0.10.22.4?

Does it still work for you if you again comment out 
DisableDownstreamHeaderFilters?

Original comment by jmara...@google.com on 28 Aug 2012 at 5:19

GoogleCodeExporter commented 9 years ago
Also, could you please try the latest trunk version? It's @1815 at the moment.

Original comment by matterb...@google.com on 28 Aug 2012 at 5:21

GoogleCodeExporter commented 9 years ago
I'm testing that now. But when I tried to recompile I got the following error 
message: "make: *** No rule to make target `linux_package_rpm'.  Stop." Any 
idea what could have stopped me from compiling?

Original comment by jimyjo...@gmail.com on 28 Aug 2012 at 5:25

GoogleCodeExporter commented 9 years ago
It seems the svn checkout didn't complete, and that led to the compilation issue.

Syncing projects:  79% (43/54) 
________ running 'svn checkout 
http://code.opencv.org/svn/opencv/tags/2.3.1/opencv/include@head 
/root/mod_pagespeed/src/third_party/opencv/src/opencv/include --revision head 
--force --ignore-externals' in '/root/mod_pagespeed'
svn: Could not open the requested SVN filesystem
Error: Command svn checkout 
http://code.opencv.org/svn/opencv/tags/2.3.1/opencv/include@head 
/root/mod_pagespeed/src/third_party/opencv/src/opencv/include --revision head 
--force --ignore-externals returned non-zero exit status 1 in 
/root/mod_pagespeed

Original comment by jimyjo...@gmail.com on 28 Aug 2012 at 6:09

GoogleCodeExporter commented 9 years ago
Please use branches/22 rather than tags/2.3.1

Original comment by matterb...@google.com on 28 Aug 2012 at 6:22

GoogleCodeExporter commented 9 years ago
:( ... the opencv svn repository is no longer available, as per 
http://code.opencv.org. They migrated to git. So it looks like pagespeed 
version 0.10.22.4 and lower can no longer be built from source. I'm 
going to try the latest trunk.

Original comment by jimyjo...@gmail.com on 28 Aug 2012 at 6:23

GoogleCodeExporter commented 9 years ago
branches/22 has an issue: the checkout breaks after 50%.

The code from trunk works. Is it the latest one? The version I got 
is 0.10.0.0-1831, so I was not able to upgrade the existing version, as it says 
I'm already using a newer version. So I removed the pagespeed version 0.10.22.4 
rpm and installed the 0.10.0.0-1831 rpm.

After commenting out DisableDownstreamHeaderFilters in the code and disabling 
ModPagespeedModifyCachingHeaders in the configuration file, the custom header 
via the proxy is working again.

Original comment by jimyjo...@gmail.com on 28 Aug 2012 at 7:04

GoogleCodeExporter commented 9 years ago
Good news, but did you try it without commenting out 
DisableDownstreamHeaderFilters?
If not, please do and let us know.

Original comment by matterb...@google.com on 28 Aug 2012 at 7:18

GoogleCodeExporter commented 9 years ago
With DisableDownstreamHeaderFilters left in place (not commented out) and 
ModPagespeedModifyCachingHeaders disabled, custom headers work when sent by the 
backend server. But the proxy (the frontend where pagespeed is installed) is not 
able to rewrite those headers or to insert any new custom headers.

Original comment by jimyjo...@gmail.com on 29 Aug 2012 at 2:05

GoogleCodeExporter commented 9 years ago
Let me restate what I think is the current situation:
* With the latest source:
  a) If you comment out DisableDownstreamHeaderFilters (DDHF), the proxy rewrites/inserts custom headers successfully.
  b) If you leave DDHF alone, but set ModPagespeedModifyCachingHeaders to off in your pagespeed.conf, the proxy does not rewrite/insert custom headers.

If this is the case, then something very strange is going on, because 
'ModPagespeedModifyCachingHeaders off' prevents DDHF from being added to the 
output filter chain so should be EXACTLY the same as commenting it out.

Original comment by matterb...@google.com on 29 Aug 2012 at 1:41
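
The expectation stated above, that turning the option off keeps DDHF out of the output filter chain, can be modelled as a simple gate. Options, InstallOutputFilters and the filter names below are hypothetical and only mirror the described behavior, not the actual mod_pagespeed code:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mirror of the ModPagespeedModifyCachingHeaders setting.
struct Options {
  bool modify_caching_headers = true;
};

// Records which output filters would be installed for a request.
void InstallOutputFilters(const Options& options,
                          std::vector<std::string>* chain) {
  assert(chain != nullptr);
  chain->push_back("rewrite_html");
  if (options.modify_caching_headers) {
    // Only installed when the module is allowed to own caching headers.
    chain->push_back("disable_downstream_header_filters");
  }
}
```

Under this model, "off" and "DDHF commented out" produce the same chain, which is why a difference between cases a) and b) is surprising.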

GoogleCodeExporter commented 9 years ago
Yes, what you said is correct. Here is a little more info regarding case b:

 b1) ModPagespeedModifyCachingHeaders off
 b2) Backend servers already sending a cache header

 Result - Not able to replace the cache header using proxy

Another Test case:
 b1) ModPagespeedModifyCachingHeaders off
 b2) No cache header sent by backend servers

 Result - Proxy is able to insert the header.

Original comment by jimyjo...@gmail.com on 29 Aug 2012 at 4:40

GoogleCodeExporter commented 9 years ago
I'd like to bump this issue.

We have five cookies that we're setting in the HEADER and four of them 
disappear when ModPagespeed is turned on.

We've tried ModPagespeedModifyCachingHeaders off but that doesn't fix our issue.

Original comment by alexel...@gmail.com on 18 Oct 2012 at 11:09

GoogleCodeExporter commented 9 years ago
To clarify, the 5 cookies you are setting in the header are on resources?  Or 
on HTML?

Exactly how are you setting them?

Original comment by jmara...@google.com on 18 Oct 2012 at 11:15

GoogleCodeExporter commented 9 years ago
The 4 cookies that are disappearing in the HEADER are being created via one of 
our web applications running in Tomcat, a java servlet container. These cookies 
are set after the user's initial GET request. The fifth cookie, the one that 
still shows up in the header is being set via javascript.

Original comment by alexel...@gmail.com on 18 Oct 2012 at 11:26

GoogleCodeExporter commented 9 years ago
Observe the effect of turning ModPagespeed off

$> curl -s -i "http://www.qasite.com?ModPagespeed=on&ModPagespeedModifyCachingHeaders=off" | head -20
HTTP/1.1 200 OK
Date: Thu, 18 Oct 2012 23:15:17 GMT
Server: Apache
Vary: User-Agent,Accept-Encoding
Transfer-Encoding: chunked
Content-Type: text/html;charset=UTF-8
Set-Cookie: NSC_vt_rb=ffffffff0909099545525d5f4f58455e445a4a423660;expires=Thu, 18-Oct-2012 23:17:17 GMT;path=/;httponly

$> curl -s -i "http://www.qasite.com?ModPagespeed=off&ModPagespeedModifyCachingHeaders=off" | head -20
HTTP/1.1 200 OK
Date: Thu, 18 Oct 2012 23:17:25 GMT
Server: Apache
Set-Cookie: CTK=179r2lhdsXXXi0i6; Expires=Tue, 23-Oct-2029 18:05:56 GMT; Path=/
Set-Cookie: DCT=1; Expires=Fri, 19-Oct-2012 01:17:25 GMT; Path=/
Set-Cookie: JSESSIONID=CC58E49B1834350B5FCB8D9F3AD06879.us-qa_tst-server1; Path=/
Set-Cookie: QA_CSRF_TOKEN=xeHK2e7NFPHhK2kMYy7Bth8hbbFWnpFY; Path=/
Content-Language: en-US
Vary: User-Agent,Accept-Encoding
Transfer-Encoding: chunked
Content-Type: text/html;charset=UTF-8
Set-Cookie: NSC_vt_rb=ffffffff0909099245525d5f4f58466e445a4a423660;expires=Thu, 18-Oct-2012 23:19:25 GMT;path=/;httponly

Original comment by alexel...@gmail.com on 18 Oct 2012 at 11:30

GoogleCodeExporter commented 9 years ago
ModPagespeedModifyCachingHeaders is also set to off in the pagespeed.conf file, 
just to be doubly sure.

Original comment by alexel...@gmail.com on 18 Oct 2012 at 11:31

GoogleCodeExporter commented 9 years ago
Apache is a reverse proxy for tomcat in this site?  Tomcat puts these into the 
response headers sent to apache?

Original comment by jmara...@gmail.com on 18 Oct 2012 at 11:45

GoogleCodeExporter commented 9 years ago
Oh, as shown above, it also doesn't look like it's passing Content-Language 
either

Original comment by alexel...@gmail.com on 18 Oct 2012 at 11:46

GoogleCodeExporter commented 9 years ago
Correct, we use apache for load balancing between multiple instances of tomcat.

Original comment by alexel...@gmail.com on 18 Oct 2012 at 11:47

GoogleCodeExporter commented 9 years ago
If you set LogLevel debug in your Apache httpd.conf (and restart) you should 
see messages like these in logs/error_log:

Fetching  resource http://localhost:8080/mod_pagespeed_test/...
http://localhost:8080/mod_pagespeed_test/...: Locking (lock ...)
Initiating async fetch for http://localhost:8080/mod_pagespeed_test/...
mod_headers.c(756): headers: ap_headers_output_filter()
http://localhost:8080/mod_pagespeed_test/...: Unlocking lock ...
Fetch complete: http://localhost:8080/mod_pagespeed_test/...
Rewrite http://localhost:8080/mod_pagespeed_test/... failed while fetching 
http://localhost:8080/mod_pagespeed_test/...
Fetch succeeded for http://localhost:8080/mod_pagespeed_test/..., status=200

(although I'd expect the rewrite to succeed not fail).

Could you please:
* Stop Apache and enable logging.
* Clear your pagespeed cache (e.g. rm -rf /usr/local/apache2/pagespeed_cache/*).
* Restart Apache.
* Fetch the resource such that the cookies are lost (so with MPS on but filters 
as you wish).
* Manually fetch the original URL - the one in the log line that starts with 
"Initiating async fetch ...".
  Please be sure to capture the headers for this fetch.
* Extract the relevant messages from the Apache log
  (they should start with same first line and all have the same timestamp).
* Redact them as necessary.
* Post them here.
* Post the result of the manual fetch of the original URL (redacted as 
necessary).

Hopefully this will provide us with information with which we can debug further.

Thanks, m.

Original comment by matterb...@google.com on 19 Oct 2012 at 6:27

GoogleCodeExporter commented 9 years ago
So this is the log output in error_log

[Fri Oct 19 17:29:36 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 
'http://statics.s3.amazonaws.com/s/84f9644b/archiver_all.css'
[Fri Oct 19 17:29:36 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 
'http://statics.s3.amazonaws.com/s/84f9644b/archiver_all.css'
[Fri Oct 19 17:29:36 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] 
http://www.qasite.net/your_query.html:78: Found script with src 
http://statics.s3.amazonaws.com/s/ebcdc76/archiver-all-compiled.js
[Fri Oct 19 17:29:36 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 
'http://statics.s3.amazonaws.com/s/ebcdc76/archiver-all-compiled.js'
[Fri Oct 19 17:29:36 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 
'http://statics.s3.amazonaws.com/s/84f9644b/archiver_all.css'
[Fri Oct 19 17:29:36 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 
'http://statics.s3.amazonaws.com/s/ebcdc76/archiver-all-compiled.js'
[Fri Oct 19 17:29:36 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 
'http://statics.s3.amazonaws.com/s/84f9644b/archiver_all.css'
[Fri Oct 19 17:29:36 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 
'http://statics.s3.amazonaws.com/s/ebcdc76/archiver-all-compiled.js'
[Fri Oct 19 17:29:36 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 
'http://statics.s3.amazonaws.com/s/84f9644b/archiver_all.css'
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 
'http://statics.s3.amazonaws.com/s/84f9644b/archiver_all.css'
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] 
http://www.qasite.net/your_query.html:13: Found script with src 
http://statics.s3.amazonaws.com/s/ebcdc76/archiver-all-compiled.js
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 
'http://statics.s3.amazonaws.com/s/ebcdc76/archiver-all-compiled.js'
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 
'http://statics.s3.amazonaws.com/s/84f9644b/archiver_all.css'
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 
'http://statics.s3.amazonaws.com/s/ebcdc76/archiver-all-compiled.js'
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 
'http://statics.s3.amazonaws.com/s/84f9644b/archiver_all.css'
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 
'http://statics.s3.amazonaws.com/s/16fhccb99/archiver-all-compiled.js'
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] Starting 
to rewrite images in CSS in http://www.qasite.net/your_query.html
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] Found 
image URL /images/clear.png
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] 
http://www.qasite.net/your_query.html:211: CSS parsing error in 
http://www.qasite.net/your_query.html
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] Starting 
to rewrite images in CSS in http://www.qasite.net/your_query.html
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] 
http://www.qasite.net/your_query.html:498: Successfully rewrote CSS file 
http://www.qasite.net/your_query.html saving 219 bytes.
[Fri Oct 19 17:29:37 2012] [warn] [mod_pagespeed 1.0.22.7-2005 @16445] Failed 
to read cache clean timestamp /var/www/mod_pagespeed/cache/!clean!time!.  Doing 
an extra cache clean to be safe.
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] Need to 
check cache size against target 104857600
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] 
http://www.qasite.net/images/clear.png: Locking (lock xyugYzPChM3NH97H60Db.lock)
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] 
Initiating async fetch for http://127.0.0.1/images/clear.png
[Fri Oct 19 17:29:37 2012] [warn] [mod_pagespeed 1.0.22.7-2005 @16445] Failed 
to read cache clean timestamp /var/www/mod_pagespeed/cache/!clean!time!.  Doing 
an extra cache clean to be safe.
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] Need to 
check cache size against target 104857600
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] 
http://www.qasite.net/images/clear.png: Unlocking lock 
/var/www/mod_pagespeed/cache/xyugYzPChM3NH97H60Db.lock with cached=true, 
success=true
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] Fetch 
complete: http://127.0.0.1/images/clear.png
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] Checking 
cache size against target 104857600
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] File 
cache size is 19679; no cleanup needed.
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] 
http://www.qasite.net/your_query.html:191: Successfully rewrote CSS file 
http://www.qasite.net/your_query.html saving 11 bytes.
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] Starting 
to rewrite images in CSS in http://www.qasite.net/your_query.html
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] Found 
image URL /images/clear.png
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] Starting 
to rewrite images in CSS in http://www.qasite.net/your_query.html
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] 
http://www.qasite.net/your_query.html:47: CSS parser increased size of CSS file 
http://www.qasite.net/your_query.html by 0 bytes.
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] Starting 
to rewrite images in CSS in http://www.qasite.net/your_query.html
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] 
http://www.qasite.net/your_query.html:52: CSS parser increased size of CSS file 
http://www.qasite.net/your_query.html by 0 bytes.
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 'http://statics.s3.amazonaws.com/images/free.png'
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] No 
permission to rewrite 'http://statics.s3.amazonaws.com/images/free.png'
[Fri Oct 19 17:29:37 2012] [info] [mod_pagespeed 1.0.22.7-2005 @16445] 
http://www.qasite.net/your_query.html:30: Successfully rewrote CSS file 
http://www.qasite.net/your_query.html saving 11 bytes.
[Fri Oct 19 17:29:48 2012] [debug] net/instaweb/apache/mod_instaweb.cc(385): 
[client 127.0.0.1] ModPagespeed OutputFilter called for request 
/server-status?auto
[Fri Oct 19 17:29:48 2012] [debug] net/instaweb/apache/mod_instaweb.cc(217): 
[client 127.0.0.1] Request not rewritten because: request->content_type does 
not appear to be HTML (was text/plain; charset=ISO-8859-1)

The initial resource request headers looked like

$> curl -s -i http://www.qasite.net/your_request | head -20
HTTP/1.1 200 OK
Date: Fri, 19 Oct 2012 22:29:36 GMT
Server: Apache
Vary: Accept-Encoding,User-Agent
Transfer-Encoding: chunked
Content-Type: text/html;charset=UTF-8
Set-Cookie: NSC_vt_rb=ffffffff0909099245525d5f400000005e445a4a423660;expires=Fri, 19-Oct-2012 22:31:37 GMT;path=/;httponly

but the original URL (the one captured from the log line) returns a response 
like this:

$> curl -s -i http://127.0.0.1/your_request
HTTP/1.1 301 Moved Permanently
Date: Fri, 19 Oct 2012 22:29:12 GMT
Server: Apache
Location: http://www.qasite.net/your_request
Content-Length: 319
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a 
href="http://www.qasite.net/your_request">here</a>.</p>
<hr>
<address>Apache Server at 127.0.0.1 Port 80</address>
</body></html>

Which as you can see, is a redirect back to the first site. I hope you are able 
to find something relevant in the log output. Let me know if there is anything 
else I can do diagnostically.

Original comment by alexel...@gmail.com on 19 Oct 2012 at 10:47

GoogleCodeExporter commented 9 years ago
I've had some trouble reproducing this bug but nevertheless we think we know 
where the bug is.  The problem is that we capture the headers very early in the 
life of a request, copying them into our own structure.  We modify our own 
structure, and later on, we clear out the apache request headers and replace 
them with our own copy.

This is done here: 
http://code.google.com/p/modpagespeed/source/browse/trunk/src/net/instaweb/apache/mod_instaweb.cc#329

I think we should try to reproduce the problem in a test, but we could in any 
case change the strategy to incrementally update the headers with the values in 
our own copy, rather than clearing and replacing.

Original comment by jmara...@google.com on 30 Oct 2012 at 7:22
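
The suspected mechanism can be shown with a toy model of the request lifecycle. RequestModel and its methods are hypothetical, and whether the missing cookies in this thread were really added after the capture point is not established here; the sketch only shows why clear-and-replace drops anything added in between:

```cpp
#include <map>
#include <string>

using HeaderMap = std::map<std::string, std::string>;

// Toy model: headers_out is what Apache will send, captured is the module's
// private copy taken early in the request.
struct RequestModel {
  HeaderMap headers_out;
  HeaderMap captured;

  // Early in the request: snapshot the headers into the private copy.
  void Capture() { captured = headers_out; }

  // Later: clear headers_out and install the (possibly stale) private copy.
  void FlushByReplacing() { headers_out = captured; }
};
```

If another module adds Content-Language to headers_out after Capture(), FlushByReplacing() removes it again, while an incremental update of only the changed entries would keep it.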

GoogleCodeExporter commented 9 years ago
We've also been burned by this, and are fairly certain the issue in our case 
is that any GET which has a Cache-Control header set will cause mod_pagespeed to 
disable later header fixups (i.e. explicit Header directives in the Apache 
configuration). This is done in the DisableDownstreamHeaderFilters function. 
Our apache server is proxying for a JBoss/Tomcat application and we're setting 
"Cache-Control: no-cache" there. Fundamentally I object to the idea that 
mod_pagespeed "knows better" what headers should be set than I do, especially 
when I'm explicitly setting them in the server configuration. I'm prepared to 
create a patch to introduce a configuration option to, basically, stop 
DisableDownstreamHeaderFilters from doing what it's doing. Would such a patch 
be accepted? Is there a better approach?

Original comment by m...@favoritemo.com on 15 Nov 2012 at 8:29

GoogleCodeExporter commented 9 years ago
I should add that this bug is so severe from our perspective, that we have 
essentially rejected ModPageSpeed's promotion for use in our production 
environment and we have disabled it in our QA environment for the foreseeable 
future. I will be following this ticket and may opt for reconsideration of our 
use of ModPageSpeed if this issue is reliably resolved.

Original comment by alexel...@gmail.com on 15 Nov 2012 at 8:41

GoogleCodeExporter commented 9 years ago
We'd really like to fix this bug but were unable to reproduce it.  Can someone 
publish enough detail on what their setup is that we can see it happening 
ourselves?

In general, MPS needs to make caching of HTML more pessimistic than the site 
originally specified, because we have made caching of resources more aggressive, 
if that makes any sense.

Matt&Alexelman: you are both running 1.1.23.1, is that right, and still seeing 
this?  You both have specified
   ModPagespeedModifyCachingHeaders off
and this still is affecting you?

Original comment by jmara...@google.com on 15 Nov 2012 at 11:00

GoogleCodeExporter commented 9 years ago
Yes, I have specified ModPagespeedModifyCachingHeaders off as mentioned 
previously. This is the version we are using:

mod-pagespeed-stable-1.0.22.7-2005

Original comment by alexel...@gmail.com on 15 Nov 2012 at 11:11

GoogleCodeExporter commented 9 years ago
Can you try 1.1.23.1?  One theory was that we had trouble reproducing because 
we'd fixed the problem in that release already.

Original comment by jmara...@google.com on 15 Nov 2012 at 11:14

GoogleCodeExporter commented 9 years ago
We have experimented with several versions, including the latest as of a
few weeks ago. After we observed that the header behavior was different
depending on whether or not your request is a POST or a GET, we decided to
just look at the source. And if you are getting caught in the Cache-Control
trap like we are, I am absolutely certain that behavior has not changed as
of revision 2191.

I think I'll go ahead and patch it and see if that fixes anyone's problem.
I'm unsure what motivated the creation of DisableDownstreamHeaderFilters --
to me it seems like something you would never want; if users were messing
up pagespeed headers with downstream filters, well, they should just stop
doing that (via configuration) rather than having pagespeed ruin _all_
attempts to set output headers on pages it's rewriting. So if it would make
more sense to remove it, I'm happy to do that instead.

-m

Original comment by m...@favoritemo.com on 15 Nov 2012 at 11:45

GoogleCodeExporter commented 9 years ago
Matt,

I think the fear is that, if a user of MPS has an aggressive caching policy, 
this would essentially render many of the features of MPS useless. The 
inspiration behind DDHF is to circumvent these aggressive caching measures and 
for MPS to institute something more useful.

That being said, DDHF seems to be overly assertive in removing headers for some 
reason that is unclear to me. I have not tried 1.1.23.1 but it seems like it 
might be effective as long as we use ModPagespeedModifyCachingHeaders off

Original comment by alexel...@gmail.com on 16 Nov 2012 at 12:26

GoogleCodeExporter commented 9 years ago
I don't have a great repro yet (which is really essential to me delivering a 
fix with confidence).

But I think I have an idea of what the problem might be; it might be some 
convergence in the helper functions used to enforce 1-year TTL for rewritten 
resources (which we definitely want) and the code to enforce cc:nocache for 
HTML (which should be configurable).

Are all the people suffering from this issue using JBoss/Tomcat?  I don't know 
much about either of those.  If detailed configuration information could be 
provided, maybe I could repro.

Original comment by jmara...@google.com on 16 Nov 2012 at 12:36
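
One way to keep the two policies from converging is to compute them in a single place keyed on the kind of response, with only the HTML branch configurable. This is a sketch under that assumption; ResponseKind, CachingOptions and CacheControlFor are hypothetical names, and the header values are taken from this thread (one year expressed in seconds, and the no-cache value seen in the earlier dump):

```cpp
#include <string>

enum class ResponseKind { kHtml, kRewrittenResource };

// Hypothetical option: whether the module may set HTML caching headers.
struct CachingOptions {
  bool modify_html_caching_headers = true;
};

// Returns the Cache-Control value to enforce, or an empty string meaning
// "leave whatever the site owner configured".
std::string CacheControlFor(ResponseKind kind, const CachingOptions& options) {
  if (kind == ResponseKind::kRewrittenResource) {
    return "max-age=31536000";  // 1-year TTL, always enforced.
  }
  return options.modify_html_caching_headers ? "max-age=0, no-cache" : "";
}
```

With the option turned off, HTML headers are left alone while rewritten resources still get the long TTL.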

GoogleCodeExporter commented 9 years ago
We use both Tomcat 5 and Tomcat 7. We see the same issue on applications 
using both 5 and 7.

Original comment by alexel...@gmail.com on 16 Nov 2012 at 12:54

GoogleCodeExporter commented 9 years ago
Ah to clarify we see issues on applications running on tomcat 5 as well as apps 
running on tomcat 7

Original comment by alexel...@gmail.com on 16 Nov 2012 at 12:54