apache / incubator-pagespeed-ngx

Automatic PageSpeed optimization module for Nginx
http://ngxpagespeed.com/
Apache License 2.0
4.37k stars 363 forks source link

Combined css returns truncated, mis-encoded body on cache hit #1556

Open jstangroome opened 6 years ago

jstangroome commented 6 years ago

PageSpeed 1.13.35.2 on nginx 1.13.8 on Ubuntu 14.04 (and on Minikube).

Original HTML contains:

<head>
<link rel="stylesheet" href="/css/store.css" type="text/css" />
<link rel="stylesheet" href="/css/vendor.css" type="text/css" />

Rewritten HTML contains:

<head>
<link rel="stylesheet" href="/css/A.store.css+vendor.css,Mcc.HBFc8IpUqh.css.pagespeed.cf.OBCdvpW_8K.css" type="text/css"/>

The first request for /css/A.store.css+vendor.css,Mcc.HBFc8IpUqh.css.pagespeed.cf.OBCdvpW_8K.css (ie cache miss) with Accept-Encoding: gzip correctly returns the a gzipped response body with the combined contents of the two CSS files. The response header is Content-Encoding: gzip and the body is 137,582 bytes gzipped, and 650,422 bytes after inflating.

The second and subsequent requests (whilst the cache remains valid) still return the Content-Encoding: gzip response header but the response body is not gzipped, just the uncompressed CSS, and the CSS is truncated at 105,759 bytes which matches the Content-Length response header also returned. This length of ~105KiB bears no relationship to either of the two included CSS files, in their original or gzipped forms.

If the pagespeed cache is allowed to expire naturally, or if the cache is explicitly purged with a PURGE HTTP request, the next immediate request will succeed again, but then future requests will return to the truncated and mis-encoded behaviour.

The behaviour is the same regardless of whether PageSpeed is configured to use memcached or file cache. When using file cache, the combined CSS files on the local PageSpeed disk contains the Content-Length: 105759 header in the cache file preamble. The original, successful request either uses Transfer-Encoding: Chunked or Connection: Close and never included this Content-Length response header.

Any requests for the non-gzipped resources (ie by excluding the Accept-Encoding: gzip request header) work correctly regardless of whether the object is already cached or not. Requests for non-gzipped responses also include the correct response header of Content-Length: 650423.

If I modify the combined CSS URL to exclude store.css, ie /css/A.vendor.css,Mcc.HBFc8IpUqh.css.pagespeed.cf.OBCdvpW_8K.css then PageSpeed works correctly, never truncating or mis-encoding the response. However, if I exclude vendor.css instead resulting in /css/A.store.css,Mcc.HBFc8IpUqh.css.pagespeed.cf.OBCdvpW_8K.css then requests for this URL experience the same mis-encoding problem on the cache hits.

The configured pagespeed directives in nginx.conf are:

pagespeed On;
pagespeed EnableCachePurge on;
pagespeed RewriteLevel PassThrough;
pagespeed InPlaceResourceOptimization on;
pagespeed FetchHttps enable;
pagespeed CustomFetchHeader X-Forwarded-Proto https;
pagespeed PreserveUrlRelativity on;
pagespeed EnableFilters "add_head,combine_css,combine_javascript,convert_meta_tags,extend_cache,fallback_rewrite_css_urls,flatten_css_imports,inline_css,inline_import_to_link,inline_javascript,rewrite_css,rewrite_images,rewrite_javascript,rewrite_style_attributes_with_url";

Is this enough information for you to identify the problematic code path in the PageSpeed source? What more do I need to provide to diagnose this bug?

oschaaf commented 6 years ago

That is some really good information. Intuitively I'd think that there is an issue specific to interaction with compression and the combiner(s). One more thing I'd also love to see is the debug level log lines in error.log for a request which ends up with a bad response. (This would require rebuilding ngx_pagespeed --with-debug, see https://docs.nginx.com/nginx/admin-guide/monitoring/debugging/).

jstangroome commented 6 years ago

@oschaaf I have posted the error.log with debug-level logging on nginx built --with-debug as a gist here: https://gist.github.com/jstangroome/83005610f84dcfaae161d54b384a6faa

Since the cached object contains the questionable Content-Length header, I also included the debug logs from the first, successful request too as a separate file (success-debug.log) in the same gist.

jstangroome commented 6 years ago

I've just realised that the failed response is not being truncated by pagespeed. My tests with curl were reading the Content-Length response header and truncating. If I pass --ignore-content-length to curl then the bug is simpler: pagespeed sends the cached object un-gzipped even though the Accept-Encoding: gzip request header is present, the Content-Encoding: gzip response header is returned and the pagespeed file cache object is stored in gzipped form.

jstangroome commented 6 years ago

Is pagespeed expecting nginx to do the gzipping of the response? My nginx.conf does not contain any of the nginx gzip-related directives, meaning the default is gzip off;.

jstangroome commented 6 years ago

Adding gzip on; and gzip_types *; to the http-scope of nginx.conf doesn't change the behaviour.

oschaaf commented 6 years ago

In this scenario, nginx is not expected to play a role in the gzipping.

As a workaround, adding pagespeed HttpCacheCompressionLevel 0; might work around the issue (as pagespeed will stop applying gzip compression itself).

jstangroome commented 6 years ago

@oschaaf Adding pagespeed HttpCacheCompressionLevel 0; does resolve the issue.

The requests, once the resource is cached, now return Transfer-Encoding: chunked and do not include a Content-Length response header at all. Most importantly the response body is gzipped in concordance with the Content-Encoding: gzip response header.

The cached object on disk at /var/ngx_pagespeed_cache/v3/eample.com/http,3A/,2Fwww.example.com/css/store.css+vendor.css.pagespeed.cc.HBFc8IpUqh.css, now contains the non-compressed form of the file, and the headers in the cache file's preamble no longer contain the invalid Content-Length header.

While this workaround solves the immediate issue, I'm wary of the impact it will have on an already large volume of cache space used.

Also, it is rather unintuitive that a cached object, stored in gzipped form, would be served ungzipped, but the cached object when stored in uncompressed form is correctly served gzipped. :S

jstangroome commented 6 years ago

For reference, I've added another gist showing the hex dump of the cached file header and beginning of the body for both HttpCacheCompressionLevel 0 (works) and HttpCacheCompressionLevel 6 (broken).

https://gist.github.com/jstangroome/0d22a0ece1f354a3fd8d78efd433ea69

oschaaf commented 6 years ago

I suspect that impact on required cache space of the workaround may not be that big: images probably make up the majority of that, and are not subject to compression. Having said that, I agree that fixing the root cause would be much nicer. The information in this issue should be very helpful figuring out what the root cause is.

jstangroome commented 6 years ago

I've confirmed that the seemingly unrelated Content-Length: 105759 response header in the failing responses directly corresponds to the resulting size of the response body when pagespeed gzips the content itself to store in the cache.

I assume the reason the length is different from the size of the gzipped cache miss response is due to different compression levels used for direct delivery versus cache storage.

jmarantz commented 6 years ago

See also: https://github.com/apache/incubator-pagespeed-mod/issues/1362 https://github.com/apache/incubator-pagespeed-ngx/issues/1346

It would be really great to solve this one; I suspect it's one bug affecting all three of these issues (this one and the above two).

jstangroome commented 6 years ago

I've done some further investigation.

The URL that is failing doesn't exactly match the URL present in the HTML. The pagespeed hashes in the URL are different. From this I've been able to reproduce the issue more reliably.

Take the URL from the HTML (it must be cc [combine_css] and cf [rewrite_css]), e.g.: /css/A.store.css+vendor.css,Mcc.HBFc8IpUqh.css.pagespeed.cf.OBCdvpW_8K.css

Change the cc hash, changing the first H to a h instead is enough in this example: /css/A.store.css+vendor.css,Mcc.hBFc8IpUqh.css.pagespeed.cf.OBCdvpW_8K.css

Request the URL once with Accept-Encoding: gzip and get a cache miss and a correct response. Request it again and get a cache hit and an incorrectly decompress body.

If only combine_css is used and rewrite_css is removed, it does not occur - it seems to be related to the way the two filters interact. Modifying the cf hash seems to have no impact.

oschaaf commented 6 years ago

@jstangroome are you able to set up a minimal reproduction scenario? I'd love to dig into this one, but so far efforts have failed to get this to happen on my dev machine. If we could just get this to happen while running gdb time to analysis and hopefully fix of the problem will probably greatly speed up.

jstangroome commented 6 years ago

@oschaaf This seems to be enough to reproduce the issue 90% of the time:

https://gist.github.com/jstangroome/b1cdf650fa6c4cc1919c310dcea8dc1a

I attempted to prune each file down to the minimum but I reached a point where if I pruned store.css to be much smaller the reproduction became much less reliable.

free6k commented 6 years ago

I have similar problem

nginx version: nginx/1.14.0
built by gcc 4.9.2 (Debian 4.9.2-10)
built with OpenSSL 1.0.2l  25 May 2017
TLS SNI support enabled
configure arguments: --add-module=/tmp/build-nginx-pagespeed/incubator-pagespeed-ngx-1.13.35.2-stable/ --user=nginx --group=nginx --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --with-http_v2_module --with-http_gzip_static_module --with-http_stub_status_module --with-http_ssl_module --with-pcre --with-file-aio --with-http_realip_module --without-http_scgi_module --without-http_uwsgi_module --with-http_image_filter_module

I disable combine_css, and all work fine... before was 1/30 'error decoding content'

oschaaf commented 6 years ago

@jstangroome awesome, I have it reproduced now!

So one prerequisite that I think exists for the problem to show up is that the first request happens without Accept-Encoding: gzip. Then subsequent requests that do indicate Accept-Encoding: gzip will get the problematic responses.

You also noted that pruning the css-file smaller reduces likelihood of the problem to show up. One thing I can think of is that perhaps nginx is able to correct the content-length if the full response can be fitted into a single output buffer -- masking the problem. But that is speculation.

I'll dive in, and update here when i know more.

jstangroome commented 6 years ago

the first request happens without Accept-Encoding: gzip

@oschaaf all my testing, in Production, and with an isolated local VM, has been with Accept-Encoding: gzip used for every request but I'm sure you'll uncover the source of the issue soon and we'll know for sure.

oschaaf commented 6 years ago

Well actually .. in the script that demonstrates the issue there is:

good_css_url=$(curl -s http://localhost/ | grep -Eo '[^"]+,Mcc\.[^"]+\.pagespeed\.[^"]+')

That omits a-e:gzip as far as I know. But it may be a red herring.

jstangroome commented 6 years ago

True, I had forgotten about that. I guess a non-gzip request for the HTML may have lead to background pre-fetches for the CSS without gzip too.

I just changed that line in my local to the following and the issue is still reproducible so it should rule out that scenario.

good_css_url=$(curl --compressed -s http://localhost/ | grep -Eo '[^"]+,Mcc\.[^"]+\.pagespeed\.[^"]+')
oschaaf commented 6 years ago

So I caught the http cache in gdb writing the bad entry. The C-L is 7411 with C-E gzip, but the body content is uncompressed and a lot larger. I suspect that this problem is masked for small response sizes by recovery mechanisms both at the server and browser level, which is why it isn't showing up reliably.

(gdb) p response_headers->ToString()
$14 = "HTTP/1.1 200 OK\r\nServer: nginx/1.13.4\r\nAccept-Ranges: bytes\r\nContent-Type: text/css\r\nDate: Tue, 15 May 2018 06:42:09 GMT\r\nExpires: Wed, 15 May 2019 06:42:09 GMT\r\nCache-Control: max-age=31536000\r\nEtag: W/\"0\"\r\nLast-Modified: Tue, 15 May 2018 06:42:09 GMT\r\nX-Original-Content-Length: 50878\r\nVary: Accept-Encoding\r\nContent-Encoding: gzip\r\nContent-Length: 7411\r\n\r\n"
(gdb) bt
#0  net_instaweb::HTTPCache::PutInternal (this=0x188efb8, preserve_response_headers=false, key="http://localhost/store.css+vendor.css.pagespeed.cc.iKEaqUIgkT.css", fragment="localhost", 
    start_us=1526366529263953, value=0x7ffff2034c50, response_headers=0x7ffff2034d50, handler=0x1a22d78) at net/instaweb/http/http_cache.cc:457
#1  0x0000000000706979 in net_instaweb::HTTPCache::Put (this=0x188efb8, key="http://localhost/store.css+vendor.css.pagespeed.cc.iKEaqUIgkT.css", fragment="localhost", req_properties=..., 
    http_options=..., value=0x7fffe40079a0, handler=0x1a22d78) at net/instaweb/http/http_cache.cc:495
#2  0x00000000005d4c82 in net_instaweb::RewriteDriver::Write (this=0x7fffe4002f98, inputs=std::vector of length 2, capacity 2 = {...}, contents=..., 
    type=0x17723c0 <net_instaweb::(anonymous namespace)::kTypes+96>, charset=..., output=0x7fffe4007978) at net/instaweb/rewriter/rewrite_driver.cc:3362
#3  0x0000000000b3fb1d in net_instaweb::ResourceCombiner::WriteCombination (this=0x7fffe40084b8, combine_resources=std::vector of length 2, capacity 2 = {...}, combination=..., handler=0x1a22d78)
    at net/instaweb/rewriter/resource_combiner.cc:287
#4  0x000000000066e821 in net_instaweb::CssCombineFilter::CssCombiner::Write (this=0x7fffe40084b8, in=std::vector of length 2, capacity 2 = {...}, out=...)
    at net/instaweb/rewriter/css_combine_filter.cc:123
#5  0x000000000066f0d1 in net_instaweb::CssCombineFilter::Context::Rewrite (this=0x7fffe4008338, partition_index=0, partition=0x7fffe4007ee8, output=...)
    at net/instaweb/rewriter/css_combine_filter.cc:266
#6  0x0000000000b568a8 in net_instaweb::RewriteContext::InvokeRewriteFunction::Run (this=0x7fffe4000c48) at net/instaweb/rewriter/rewrite_context.cc:958
#7  0x0000000000725d8c in net_instaweb::Function::CallRun (this=0x7fffe4000c48) at pagespeed/kernel/base/function.cc:51
#8  0x00000000008a75c8 in net_instaweb::QueuedWorkerPool::Run (this=0x1a06068, sequence=0x7fffe4003ad8, worker=0x1a2e0f8) at pagespeed/kernel/thread/queued_worker_pool.cc:157
#9  0x00000000008ad5cc in net_instaweb::MemberFunction2<net_instaweb::QueuedWorkerPool, net_instaweb::QueuedWorkerPool::Sequence*, net_instaweb::QueuedWorker*>::Run (this=0x7fffdc031ef8)
    at ./pagespeed/kernel/base/function.h:202
#10 0x0000000000725d8c in net_instaweb::Function::CallRun (this=0x7fffdc031ef8) at pagespeed/kernel/base/function.cc:51
#11 0x00000000008b3b64 in net_instaweb::Worker::WorkThread::Run (this=0x1a2e128) at pagespeed/kernel/thread/worker.cc:85
#12 0x00000000008b6ffc in net_instaweb::PthreadThreadImpl::InvokeRun (self_ptr=0x1a2dd08) at pagespeed/kernel/thread/pthread_thread_system.cc:121
#13 0x00007ffff79bd6ba in start_thread (arg=0x7ffff2036700) at pthread_create.c:333
#14 0x00007ffff638341d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

At the surface, it looks like ResourceCombiner::WriteCombination sanitizes the output headers, but not the ones it stores into the cache.

Trying out the following change to see how that turns out:

diff --git a/net/instaweb/rewriter/resource_combiner.cc b/net/instaweb/rewriter/resource_combiner.cc
index e978236..25b971e 100644
--- a/net/instaweb/rewriter/resource_combiner.cc
+++ b/net/instaweb/rewriter/resource_combiner.cc
@@ -278,7 +278,8 @@ bool ResourceCombiner::WriteCombination(
     for (int i = 1, n = combine_resources.size(); i < n; ++i) {
       output_headers->RemoveIfNotIn(*combine_resources[i]->response_headers());
     }
-
+    combination->response_headers()->RemoveAll(HttpAttributes::kContentEncoding);
+    combination->response_headers()->RemoveAll(HttpAttributes::kContentLength);
     // TODO(morlovich): Fix combiners to deal with charsets.
     written =
         rewrite_driver_->Write(
oschaaf commented 6 years ago

The change above doesn't help. It probably wouldn't have been the right fix, but naively I would have expected it to at least influence incidence it.

Hmm. The intermittent nature of the problem makes me wonder if there is something else going on, a race of some sort: no cache flushes are required to excite the bug, and the system should get into a steady state after a couple of requests. Is there some sort of compression/decompression cycle going on where they sometimes step on each other?

morlovich commented 6 years ago

On Tue, May 15, 2018 at 3:53 AM Otto van der Schaaf < notifications@github.com> wrote:

The change above doesn't help. It probably wouldn't have been the right fix, but naively I would have expected it to at least influence incidence it.

That makes it look like the headers are modified to include this stuff in either RewriteDriver::Write or HTTPCache::Put... hmm, which just makes this code look like the likeliest suspect: https://github.com/apache/incubator-pagespeed-mod/blob/master/net/instaweb/http/http_cache.cc#L412 ... but that doesn't look wrong, either.

oschaaf commented 6 years ago

@morlovich Yeah, I have been looking at that too..

What I think is really odd is that this script seems to excite the issue fairly randomly, even with warm caches:

https://gist.githubusercontent.com/jstangroome/b1cdf650fa6c4cc1919c310dcea8dc1a/raw/b084869d8a5a9e09e3612fce690f5050493ba076/test.sh

oschaaf commented 6 years ago

So the following diff helps with the bad content-length and mismatching c-e headers:

diff --git a/net/instaweb/rewriter/css_filter.cc b/net/instaweb/rewriter/css_filter.cc
index 8f653cd..7724d5e 100644
--- a/net/instaweb/rewriter/css_filter.cc
+++ b/net/instaweb/rewriter/css_filter.cc
@@ -253,6 +253,8 @@ bool CssFilter::Context::SendFallbackResponse(
     MessageHandler* handler) {
   // Do not set the content length, since we may need to mutate the
   // content as we stream out the bytes to correct for URL changes.
+  async_fetch->response_headers()->RemoveAll(HttpAttributes::kContentEncoding);
+  async_fetch->response_headers()->RemoveAll(HttpAttributes::kContentLength);
   async_fetch->HeadersComplete();

   DCHECK_EQ(1, num_slots());

But I'm not sure that this is the right thing to do.

Also, while testing this diff, I observed that the combined css response alternates in size. It looks like the minified and non-minified version of the combination alternate randomly when responses are served. This may be a related problem, or a totally different problem, I'm not sure.

morlovich commented 6 years ago

Huh. But that's CSS filter, not combiner, and we shouldn't be copying those sorts of headers from input.... I wonder if we're aliasing something we shouldn't be.

On Tue, May 22, 2018 at 3:53 AM Otto van der Schaaf < notifications@github.com> wrote:

So the following diff helps with the bad content-length and mismatching c-e headers:

diff --git a/net/instaweb/rewriter/css_filter.cc b/net/instaweb/rewriter/css_filter.cc index 8f653cd..7724d5e 100644 --- a/net/instaweb/rewriter/css_filter.cc +++ b/net/instaweb/rewriter/css_filter.cc @@ -253,6 +253,8 @@ bool CssFilter::Context::SendFallbackResponse( MessageHandler* handler) { // Do not set the content length, since we may need to mutate the // content as we stream out the bytes to correct for URL changes.

  • async_fetch->response_headers()->RemoveAll(HttpAttributes::kContentEncoding);
  • async_fetch->response_headers()->RemoveAll(HttpAttributes::kContentLength); async_fetch->HeadersComplete();

    DCHECK_EQ(1, num_slots());

But I'm not sure that this is the right thing to do.

Also, while testing this diff, I observed that the combined css response alternates in size. It looks like the minified and non-minified version of the combination alternate randomly when responses are served. This may be a related problem, or a totally different problem, I'm not sure.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/apache/incubator-pagespeed-ngx/issues/1556#issuecomment-390896994, or mute the thread https://github.com/notifications/unsubscribe-auth/ADl1RIOSY5nc2lTzMbcxtUsYishukXCIks5t08PxgaJpZM4T34G3 .

oschaaf commented 6 years ago

I've been searching for code that messes up consistency, but no luck so far..

You can see the effect of the patched version here on a combination.. http://35.187.191.47/A.store.css+vendor.css,Mcc.00EaqUIgkT.css.pagespeed.cf.tuzrpz_-Py.css (and also observe the resulting alternating response, but at least there is no bad content-length and no content-encoding mismatch).

Also, I had the backtrace for when the content-length 7411 is written to the cache in notepad here. 7411 is interesting because that shows when the combination output is written (uncompressed, with an actual larger body size, and actually not gzipped but the headers do say so).

At the point when it's written to cache it seems consistent and OK to me:

(gdb) bt
#0  net_instaweb::Headers<net_instaweb::HttpResponseHeaders>::Add (this=0x7ffff2034d50, name=..., value=...) at pagespeed/kernel/http/headers.cc:309
#1  0x00000000007ceed4 in net_instaweb::Headers<net_instaweb::HttpResponseHeaders>::Replace (this=0x7ffff2034d50, name=..., value=...) at pagespeed/kernel/http/headers.cc:575
#2  0x00000000007dddb6 in net_instaweb::ResponseHeaders::SetContentLength (this=0x7ffff2034d50, content_length=7411) at pagespeed/kernel/http/response_headers.cc:531
#3  0x000000000070dc6b in net_instaweb::InflatingFetch::GzipValue (compression_level=9, http_value=..., compressed_value=0x7ffff2034c50, headers=0x7ffff2034d50, handler=0x1a22d68) at net/instaweb/http/inflating_fetch.cc:158
#4  0x000000000070639e in net_instaweb::HTTPCache::PutInternal (this=0x188f018, preserve_response_headers=false, key="http://localhost/store.css+vendor.css.pagespeed.cc.iKEaqUIgkT.css", fragment="localhost", start_us=1526930452779889, value=0x7fffe4028e38, 
    response_headers=0x7ffff2034d50, handler=0x1a22d68) at net/instaweb/http/http_cache.cc:433
#5  0x0000000000706979 in net_instaweb::HTTPCache::Put (this=0x188f018, key="http://localhost/store.css+vendor.css.pagespeed.cc.iKEaqUIgkT.css", fragment="localhost", req_properties=..., http_options=..., value=0x7fffe400b550, handler=0x1a22d68)
    at net/instaweb/http/http_cache.cc:495
#6  0x00000000005d4c82 in net_instaweb::RewriteDriver::Write (this=0x1a29708, inputs=std::vector of length 2, capacity 2 = {...}, contents=..., type=0x17723c0 <net_instaweb::(anonymous namespace)::kTypes+96>, charset=..., output=0x7fffe400b528)
    at net/instaweb/rewriter/rewrite_driver.cc:3362
#7  0x0000000000b3fb81 in net_instaweb::ResourceCombiner::WriteCombination (this=0x7fffe4008648, combine_resources=std::vector of length 2, capacity 2 = {...}, combination=..., handler=0x1a22d68) at net/instaweb/rewriter/resource_combiner.cc:288
#8  0x000000000066e821 in net_instaweb::CssCombineFilter::CssCombiner::Write (this=0x7fffe4008648, in=std::vector of length 2, capacity 2 = {...}, out=...) at net/instaweb/rewriter/css_combine_filter.cc:123
#9  0x000000000066f0d1 in net_instaweb::CssCombineFilter::Context::Rewrite (this=0x7fffe40084c8, partition_index=0, partition=0x7fffe4008c28, output=...) at net/instaweb/rewriter/css_combine_filter.cc:266
#10 0x0000000000b5690c in net_instaweb::RewriteContext::InvokeRewriteFunction::Run (this=0x7fffe4008858) at net/instaweb/rewriter/rewrite_context.cc:958
#11 0x0000000000725d8c in net_instaweb::Function::CallRun (this=0x7fffe4008858) at pagespeed/kernel/base/function.cc:51
#12 0x00000000008a75c8 in net_instaweb::QueuedWorkerPool::Run (this=0x1a06058, sequence=0x1a278a8, worker=0x1a2e0e8) at pagespeed/kernel/thread/queued_worker_pool.cc:157
#13 0x00000000008ad5cc in net_instaweb::MemberFunction2<net_instaweb::QueuedWorkerPool, net_instaweb::QueuedWorkerPool::Sequence*, net_instaweb::QueuedWorker*>::Run (this=0x7fffe0031e78) at ./pagespeed/kernel/base/function.h:202
#14 0x0000000000725d8c in net_instaweb::Function::CallRun (this=0x7fffe0031e78) at pagespeed/kernel/base/function.cc:51
#15 0x00000000008b3b64 in net_instaweb::Worker::WorkThread::Run (this=0x1a2e118) at pagespeed/kernel/thread/worker.cc:85
#16 0x00000000008b6ffc in net_instaweb::PthreadThreadImpl::InvokeRun (self_ptr=0x1a2dcf8) at pagespeed/kernel/thread/pthread_thread_system.cc:121
#17 0x00007ffff79bd6ba in start_thread (arg=0x7ffff2036700) at pthread_create.c:333
#18 0x00007ffff638341d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb) f 3
#3  0x000000000070dc6b in net_instaweb::InflatingFetch::GzipValue (compression_level=9, http_value=..., compressed_value=0x7ffff2034c50, headers=0x7ffff2034d50, handler=0x1a22d68) at net/instaweb/http/inflating_fetch.cc:158
158         headers->SetContentLength(deflated.length());
(gdb) p deflated
$4 = "\037\213\b\000\000\000\000\000\002\003\325=is\333F\226\337\371+\260V\251\306Έ0\t\222\272\\q\255\255$\023W\331\023W\234L>\244\262,\220l\211X\203\000\027\000-+\256\374\367\355\v\350\353u\243\001B\231L\016Y\006\032\257\357w\037Ͽ\n~\212W)\252\202\257\236\217\376{\207\066I\034<\335ş\307\367ɦ\332^\a\323\350r\262\377\374,\370\062\n\202\223\262\312\v\364\066_\307\370\317\367\361\035\nN\366\344\347\313 ,\362{\332$\b\366\361f\223dw\327\301$\210\360\227/\350\303U\376y\\&\277\323竼ؠb\214\037\275\b\376\b\376\030\215\236\177\025\274\313WI\212,C\270\\8F\260\235\362~", <incomplete sequence \364>...
(gdb) p deflated.length()
$5 = 7411
morlovich commented 6 years ago

On Tue, May 22, 2018 at 12:01 PM Otto van der Schaaf < notifications@github.com> wrote:

I've been searching for code that messes up consistency, but no luck so far..

You can see the effect of the patched version here on a combination.. http://35.187.191.47/A.store.css+vendor.css,Mcc.00EaqUIgkT.css.pagespeed.cf.tuzrpz_-Py.css (and also observe the resulting alternating response, but at least there is no bad content-length and no content-encoding mismatch).

Also, I had the backtrace for when the content-length 7411 is written to the cache in notepad here. 7411 is interesting because that shows when the combination output is written (uncompressed, with an actual larger body size, and actually not gzipped but the headers do say so).

At the point when it's written to cache it seems consistent and OK to me:

(gdb) bt

0 net_instaweb::Headers::Add (this=0x7ffff2034d50, name=..., value=...) at pagespeed/kernel/http/headers.cc:309

1 0x00000000007ceed4 in net_instaweb::Headers::Replace (this=0x7ffff2034d50, name=..., value=...) at pagespeed/kernel/http/headers.cc:575

2 0x00000000007dddb6 in net_instaweb::ResponseHeaders::SetContentLength (this=0x7ffff2034d50, content_length=7411) at pagespeed/kernel/http/response_headers.cc:531

3 0x000000000070dc6b in net_instaweb::InflatingFetch::GzipValue (compression_level=9, http_value=..., compressed_value=0x7ffff2034c50, headers=0x7ffff2034d50, handler=0x1a22d68) at net/instaweb/http/inflating_fetch.cc:158

4 0x000000000070639e in net_instaweb::HTTPCache::PutInternal (this=0x188f018, preserve_response_headers=false, key="http://localhost/store.css+vendor.css.pagespeed.cc.iKEaqUIgkT.css", fragment="localhost", start_us=1526930452779889, value=0x7fffe4028e38,

response_headers=0x7ffff2034d50, handler=0x1a22d68) at net/instaweb/http/http_cache.cc:433

5 0x0000000000706979 in net_instaweb::HTTPCache::Put (this=0x188f018, key="http://localhost/store.css+vendor.css.pagespeed.cc.iKEaqUIgkT.css", fragment="localhost", req_properties=..., http_options=..., value=0x7fffe400b550, handler=0x1a22d68)

at net/instaweb/http/http_cache.cc:495

6 0x00000000005d4c82 in net_instaweb::RewriteDriver::Write (this=0x1a29708, inputs=std::vector of length 2, capacity 2 = {...}, contents=..., type=0x17723c0 <net_instaweb::(anonymous namespace)::kTypes+96>, charset=..., output=0x7fffe400b528)

at net/instaweb/rewriter/rewrite_driver.cc:3362

7 0x0000000000b3fb81 in net_instaweb::ResourceCombiner::WriteCombination (this=0x7fffe4008648, combine_resources=std::vector of length 2, capacity 2 = {...}, combination=..., handler=0x1a22d68) at net/instaweb/rewriter/resource_combiner.cc:288

8 0x000000000066e821 in net_instaweb::CssCombineFilter::CssCombiner::Write (this=0x7fffe4008648, in=std::vector of length 2, capacity 2 = {...}, out=...) at net/instaweb/rewriter/css_combine_filter.cc:123

9 0x000000000066f0d1 in net_instaweb::CssCombineFilter::Context::Rewrite (this=0x7fffe40084c8, partition_index=0, partition=0x7fffe4008c28, output=...) at net/instaweb/rewriter/css_combine_filter.cc:266

10 0x0000000000b5690c in net_instaweb::RewriteContext::InvokeRewriteFunction::Run (this=0x7fffe4008858) at net/instaweb/rewriter/rewrite_context.cc:958

11 0x0000000000725d8c in net_instaweb::Function::CallRun (this=0x7fffe4008858) at pagespeed/kernel/base/function.cc:51

12 0x00000000008a75c8 in net_instaweb::QueuedWorkerPool::Run (this=0x1a06058, sequence=0x1a278a8, worker=0x1a2e0e8) at pagespeed/kernel/thread/queued_worker_pool.cc:157

13 0x00000000008ad5cc in net_instaweb::MemberFunction2<net_instaweb::QueuedWorkerPool, net_instaweb::QueuedWorkerPool::Sequence, net_instaweb::QueuedWorker>::Run (this=0x7fffe0031e78) at ./pagespeed/kernel/base/function.h:202

14 0x0000000000725d8c in net_instaweb::Function::CallRun (this=0x7fffe0031e78) at pagespeed/kernel/base/function.cc:51

15 0x00000000008b3b64 in net_instaweb::Worker::WorkThread::Run (this=0x1a2e118) at pagespeed/kernel/thread/worker.cc:85

16 0x00000000008b6ffc in net_instaweb::PthreadThreadImpl::InvokeRun (self_ptr=0x1a2dcf8) at pagespeed/kernel/thread/pthread_thread_system.cc:121

17 0x00007ffff79bd6ba in start_thread (arg=0x7ffff2036700) at pthread_create.c:333

18 0x00007ffff638341d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

(gdb) f 3

3 0x000000000070dc6b in net_instaweb::InflatingFetch::GzipValue (compression_level=9, http_value=..., compressed_value=0x7ffff2034c50, headers=0x7ffff2034d50, handler=0x1a22d68) at net/instaweb/http/inflating_fetch.cc:158

158 headers->SetContentLength(deflated.length()); (gdb) p deflated $4 = "\037\213\b\000\000\000\000\000\002\003\325=is\333F\226\337\371+\260V\251\306Έ0\t\222\272\q\255\255$\023W\331\023W\234L>\244\262,\220l\211X\203\000\027\000-+\256\374\367\355\v\350\353u\243\001B\231L\016Y\006\032\257\357w\037Ͽ\n~\212W)\252\202\257\236\217\376{\207\066I\034<\335ş\307\367ɦ\332^\a\323\350r\262\377\374,\370\062\n\202\223\262\312\v\364\066_\307\370\317\367\361\035\nN\366\344\347\313 ,\362{\332$\b\366\361f\223dw\327\301$\210\360\227/\350\303U\376y\&\277\323竼ؠb\214\037\275\b\376\b\376\030\215\236\177\025\274\313WI\212,C\270\8F\260\235\362~", <incomplete sequence \364>... (gdb) p deflated.length() $5 = 7411

That's basically the same backtrace as the one where the value is uncompressed? What exactly did you mean by "body content being larger" there --- e.g. which objects did you look at?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/apache/incubator-pagespeed-ngx/issues/1556#issuecomment-391046316, or mute the thread https://github.com/notifications/unsubscribe-auth/ADl1RIcR0E9JgFHJWCiAmhFgmfdqOvQkks5t1DZUgaJpZM4T34G3 .

oschaaf commented 6 years ago

Re: "What exactly did you mean by "body content being larger":

So sometimes (without the patch above) http://35.187.191.47/A.store.css+vendor.css,Mcc.00EaqUIgkT.css.pagespeed.cf.tuzrpz_-Py.css responds with 'Content-Length: 7411' and 'Content-Encoding: gzip', which both are lies. The actual content that gets put on the wire has a lot more bytes and is not gzip-compressed.

The backtrace above is something I captured to see when Content-Length:7411 gets added to the response headers, to see if that would provide clues.. (Frame number 4 indicates that the cache key involved is http://localhost/store.css+vendor.css.pagespeed.cc.iKEaqUIgkT.css)

morlovich commented 6 years ago

WRT to backtrace, I meant one in https://github.com/apache/incubator-pagespeed-ngx/issues/1556#issuecomment-389066123 vs. one you added today --- I am wondering why in the old one you thought that was the case when things were going wrong.

oschaaf commented 6 years ago

@morlovich Ah..I forgot about that backtrace.. To be honest my memory of that is bit blurry right now because I've been looking at so many things between then and now. Maybe back then the content associated to the headers looked inconsistent there? Unfortunately, I didn't capture that in the comment :(

Looking at the first backtrace in https://github.com/apache/incubator-pagespeed-ngx/issues/1556#issuecomment-389066123 again, what strikes me is that the response headers already have the gzip header set in that trace, while ResourceCombiner::WriteCombination works with uncompressed contents right? In the second backtrace, InflatingFetch::GzipValue gets invoked to add it (and perform actual compression).

morlovich commented 6 years ago

If line numbers match up, that was after the InflatingFetch::GzipValue call, and since preserve_response_headers = false, response_headers would have been modified by GzipValue...

morlovich commented 6 years ago

Unit test on https://github.com/apache/incubator-pagespeed-mod/tree/morlovich-fix-encoding-mismatch ... pondering how to structure the fix.

moorthi-n commented 5 years ago

I have reproduced a similar issue in modpagespeed latest build with rewrite_css turned on. It is intermittent and I noticed an inflated gziped content-length when the issue was reproduced. Not able to reproduce it again. Was there any progress on this issue so far?. Will be able to provide additional details and logs if required.

oschaaf commented 5 years ago

@moorthi-n So when you say latest build, do you mean you build from master?

danidorado commented 4 years ago

Hello, I'm experiencing same issue ERR_CONTENT_DECODING_FAILED with minified and combined css files.

setting up this directive is not helping and from time to time issue is still there making the frontend to fail breaking functionalities. ModPagespeedHttpCacheCompressionLevel 0

I'd love to share access and debug this issue with anyone on the team as i'm able to reproduce the issue easily.

Looking forward for someone who is willing to fix this long term issue.

Thanks in advance