apache / incubator-pagespeed-ngx

Automatic PageSpeed optimization module for Nginx
http://ngxpagespeed.com/
Apache License 2.0
4.37k stars 365 forks source link

ModifyCachingHeaders should only apply to html responses #1535

Open tobsch opened 6 years ago

tobsch commented 6 years ago

I am using ngx pagespeed (x-page-speed: 1.12.34.3-0) on the google cloud (k8s in alpine, +downstream proxy/cache). I want the pages itself to be cached, as I can't afford the load to hit the backends.

As the documentation states for this case, I use the following config:

pagespeed ModifyCachingHeaders Off;

# do not allow beaconing as we are behind a reverse proxy
pagespeed CriticalImagesBeaconEnabled false;
pagespeed DisableFilters prioritize_critical_css;
...

Normal pages are cached as expected (I set the caching headers in the nginx config), all the static assets that pagespeed handles though do not have a caching header (even if I use "extend_cache").

Is there any way to get around pagespeed behaviour here or how is best practice for that?

oschaaf commented 6 years ago

Did you see https://www.modpagespeed.com/doc/downstream-caching ?

tobsch commented 6 years ago

@oschaaf Sure. And I am behind a "Dumb" CDN (google) and this is why I disabled the beaconing and such (see above). Question is why the caching headers are missing. Did I miss something in the doc?

oschaaf commented 6 years ago

If you query the origin directly, do the caching headers show? If not -- if you disable pagespeed and query the origin, do they show?

Otto

On Fri, Feb 9, 2018 at 3:18 PM Tobias Schlottke notifications@github.com wrote:

@oschaaf https://github.com/oschaaf Sure. And I am behind a "Dumb" CDN (google) and this is why I disabled the beaconing and such (see above). Question is why the caching headers are missing. Did I miss something in the doc?

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/apache/incubator-pagespeed-ngx/issues/1535#issuecomment-364445715, or mute the thread https://github.com/notifications/unsubscribe-auth/ACIsRFpP70mjCxpg9rEeoi596RM7alDRks5tTFO1gaJpZM4R-HVL .

tobsch commented 6 years ago

Hi Otto,

I have got this configuration set up for caching of .pagespeed. resources:

# Ensure requests for pagespeed optimized resources go to the pagespeed handler
# and no extraneous headers get set.
location ~ "\.pagespeed\." {
  add_header "Cache-Control: public, max-age=86400";
}

If I disable pagespeed, there are no .pagespeed. resources but if I use the same config for all static resources, it obviously works.

I now played with both, ModifyCachingHeaders and extend_cache with the following results:

pagespeed ModifyCachingHeaders Off, pagespeed EnableFilters extend_cache:

HTTP/1.1 200 OK
Date: Tue, 13 Feb 2018 06:31:29 GMT
Content-Type: text/css
Connection: keep-alive
Server: nginx
Accept-Ranges: bytes
X-Original-Content-Length: 32893
Vary: Accept-Encoding
Content-Length: 25405
X-Page-Speed: 1.12.34.3-0

pagespeed ModifyCachingHeaders On, pagespeed EnableFilters extend_cache:

HTTP/1.1 200 OK
Content-Type: text/css
Connection: keep-alive
Server: nginx
Accept-Ranges: bytes
Date: Thu, 08 Feb 2018 22:06:44 GMT
Expires: Fri, 08 Feb 2019 22:06:44 GMT
ETag: W/"0"
Last-Modified: Thu, 08 Feb 2018 22:06:44 GMT
Cache-Control: max-age=31536000, public
X-Original-Content-Length: 32893
Vary: Accept-Encoding
Content-Length: 25405
X-Page-Speed: 1.12.34.3-

Disabling extend_cache has the same result as the first example.

Conclusion: ModifyCachingHeaders On and extend_cache in combination are closest to the expected result. I cannot turn ModifyCachingHeaders to On as caching HTML won't work anymore (A case which @igrigorik states in his great design doc: https://github.com/apache/incubator-pagespeed-mod/wiki/Design-Doc:-HTML-Caching-plus-PageSpeed) I also can't modify the headers manually as pagespeed suppresses this. Also, the ETag: W/"0" will lead to minus points when checking with Pagespeed insights.

What can I do? Why can't I override caching for those resources, if I need to?

Tobias

oschaaf commented 6 years ago

Not sure if this works for cache directives but it is worth a try.. Could you try if you can use https://www.modpagespeed.com/doc/configuration#add-resource-header to massage the final resource response headers into what you need?

Otto On Tue, 13 Feb 2018 at 07:54, Tobias Schlottke notifications@github.com wrote:

Hi Otto,

I have got this configuration set up for caching of .pagespeed. resources:

Ensure requests for pagespeed optimized resources go to the pagespeed handler

and no extraneous headers get set.

location ~ ".pagespeed." { add_header "Cache-Control: public, max-age=86400"; }

If I disable pagespeed, there are no .pagespeed. resources but if I use the same config for all static resources, it obviously works.

I now played with both, ModifyCachingHeaders and extend_cache with the following results:

pagespeed ModifyCachingHeaders Off, pagespeed EnableFilters extend_cache:

HTTP/1.1 200 OK Date: Tue, 13 Feb 2018 06:31:29 GMT Content-Type: text/css Connection: keep-alive Server: nginx Accept-Ranges: bytes X-Original-Content-Length: 32893 Vary: Accept-Encoding Content-Length: 25405 X-Page-Speed: 1.12.34.3-0

pagespeed ModifyCachingHeaders On, pagespeed EnableFilters extend_cache:

HTTP/1.1 200 OK Content-Type: text/css Connection: keep-alive Server: nginx Accept-Ranges: bytes Date: Thu, 08 Feb 2018 22:06:44 GMT Expires: Fri, 08 Feb 2019 22:06:44 GMT ETag: W/"0" Last-Modified: Thu, 08 Feb 2018 22:06:44 GMT Cache-Control: max-age=31536000, public X-Original-Content-Length: 32893 Vary: Accept-Encoding Content-Length: 25405 X-Page-Speed: 1.12.34.3-

Disabling extend_cache has the same result as the first example.

Conclusion: ModifyCachingHeaders On and extend_cache in combination are closest to the expected result. I cannot turn ModifyCachingHeaders to On as caching HTML won't work anymore. I also can't modify the headers manually as pagespeed suppresses this. Also, the ETag: W/"0" will lead to minus points when checking with Pagespeed insights.

What can I do? Why can't I override caching for those resources, if I need to?

Tobias

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/apache/incubator-pagespeed-ngx/issues/1535#issuecomment-365168160, or mute the thread https://github.com/notifications/unsubscribe-auth/ACIsRNqOBG-lvmeNkQx5q4lxYBQSp6Osks5tUTGjgaJpZM4R-HVL .

tobsch commented 6 years ago

Hi Otto,

I already tried this, nginx rejects to start if I set Cache-Control:

nginx_1         | 2018/02/13 10:02:34 [emerg] 72#72: "pagespeed" directive "AddResourceHeader Cache-Control public, max-age=600, stale-while-revalidate=86400, stale-if-error=86400" 

Rejecting header 'Cache-Control ' in /var/www/html/conf/nginx/pagespeed.conf:114

For all other headers it works:

X-Foo: public, max-age=600, stale-while-revalidate=86400, stale-if-error=86400

From my perspective, this is a bug and ModifyCacheHeaders should be independent from extend_cache. If it is disabled, I still want to be able to use extend_cache for the .pagespeed. resources.

If I am expert, I also want to be able to set caching headers for the .pagespeed. - resources manually?

What do you think?

oschaaf commented 6 years ago

Well, first I thought that ModifyCachingHeaders pretty much does what it says on the tin when you turn it off. But after reading https://www.modpagespeed.com/doc/configuration#ModifyCachingHeaders it seems that it should be only applying to html responses. Assuming that is correct, that's a bug, yes.

oschaaf commented 6 years ago

Updated the title to describe the bug more accurately

oschaaf commented 6 years ago

The relevant code is over at https://github.com/apache/incubator-pagespeed-ngx/blob/master/src/ngx_pagespeed.cc#L1892

AFAICT mod_pagespeed's build_context_for_request() is only called in the html filter. At a glance it seems the code linked above needs to moved so it only executes in the html rewriting flow.

tobsch commented 6 years ago

I have to admit it is a little misleading but you want to be able to use them individually. so who can do this then :-) ?

oschaaf commented 6 years ago

Well, it may take me a while before I get around to this, there are some other things in the project that are having higher priority at the moment.

tobsch commented 6 years ago

hey @oschaaf - a short reminder. I am totally keen to try pagespeed "in the wild" but currently can't...

Best wishes & many thanks,

Tobias

tobsch commented 6 years ago

Hey @oschaaf,

any chance to get this through the door somehow? I would love to finally use pagespeed in production.

Tobias

oschaaf commented 6 years ago

@tobsch I haven't forgotten, but unfortunately I simply haven't got a chance to get around to it yet, sorry!

tobsch commented 6 years ago

@oschaaf Took a look at the code and found out that setting DownstreamCachePurgeLocationPrefix helps to prevent the unwanted behaviour.

pagespeed DownstreamCachePurgeLocationPrefix http://localhost:80;

But I think there should be a more explicit way to do so, this feels hacky. My best idea would be that you add an explicit EnableDownstreamCaching option which is implicitly being set if someone uses DownstreamCachePurgeLocationPrefix.

What do you think?

stufan commented 6 years ago

@tobsch I tried your approach but if doesn't work for me. Should the prefix be a real URL i.e. returning 200? Here is a link of my findings: https://github.com/apache/incubator-pagespeed-ngx/issues/1322#issuecomment-397905894

XVII commented 4 years ago

I'm getting this too after using ModifyCachingHeaders off. Watching issue.