apache / incubator-pagespeed-mod

Apache module for rewriting web pages to reduce latency and bandwidth.
http://modpagespeed.com
Apache License 2.0

Resources not rewritten because of restrictive cache-control almost 100% and cache misses over 70% #2034

Closed: ajconejo closed this issue 4 years ago

ajconejo commented 4 years ago

Hi

I have installed pagespeed 1.13.35.2-0 with the following configuration, and the console shows that resources are not rewritten because of restrictive cache-control and that there is a high rate of cache misses.

Message history does not show any special message about cache misses.

The Apache logs do not show any special message either.

I have some warnings like: RateController: drop deferred fetch WARNING:serf_url_async_fetcher.cc(1222)

What could be the cause? Is there a way to see what content is being missed? I tried to enable the debug filter, but the Apache error/access logs show nothing.

Thanks

I'm on Ubuntu 18.04

pagespeed info

Version: 14: on

Filters
gp Convert Gif to Png
jp Convert Jpeg to Progressive
jw Convert Jpeg To Webp
pj Convert Png to Jpeg
db Debug
di Delay Images
hw Flushes html
ji Inline Javascript
io In-place optimize for browser
js Jpeg Subsampling
co Outline Css
rj Recompress Jpeg
rp Recompress Png
rw Recompress Webp
cf Rewrite Css
jm Rewrite External Javascript
jj Rewrite Inline Javascript
cp Strip Image Color Profiles
md Strip Image Meta Data

Options
DefaultSharedMemoryCacheKB (dsmc) 50000
EnableRewriting (e) 1
FetchWithGzip (afg) True
FileCacheCleanIntervalMs (afcci) 3600000
FileCacheInodeLimit (afcl) 500000
FileCachePath (afcp) /var/cache/mod_pagespeed/
FileCacheSizeKb (afc) 409600
HonorCsp (hcsp) True
ImageMaxRewritesAtOnce (im) 12
InPlaceResourceOptimization (ipro) True
LoadFromFileCacheTtlMs (lfct) 60000000
LogDir (ald) /var/log/pagespeed
LRUCacheByteLimit (alcb) 16384
LRUCacheKbPerProcess (alcp) 1024
MaxCacheableContentLength (rcl) 16777216
MemcachedServers (ams) localhost:11211
PreserveUrlRelativity (pur) True
RespectVary (rv) True
RewriteLevel (l) Optimize For Bandwidth
ShmMetadataCacheCheckpointIntervalSec (smci) 300
SslCertDirectory (assld) /etc/ssl/certs
StatisticsLogging (asle) True

Domain Lawyer
http://*.celacp.org/ Auth
http://*.googletagmanager.com/ Auth
http://*.ssl-images-amazon.com/ Auth
http://45.79.198.124/ Auth
https://*.celacp.org/ Auth
https://*.googletagmanager.com/ Auth
https://*.ssl-images-amazon.com/ Auth
https://biblioteca-admin.celacp.org/ Auth
https://biblioteca.celacp.org/ Auth
https://upload.wikimedia.org/ Auth

Invalidation Timestamp: (none)

The headers of my main page are

Server: Apache
X-Frame-Options: SAMEORIGIN SAMEORIGIN
Content-Security.Policy: default-src https: 'upgrade-insecure-requests' 'unsafe-eval' 'unsafe-inline'; object-src 'none'
X-Content-Security.Policy: default-src https: 'upgrade-insecure-requests' 'unsafe-eval' 'unsafe-inline'; object-src 'none'
Strict-Transport-Security: max-age=31536000; includeSubDomains max-age=3153600
Cache-Control: max-age=0, no-cache
Pragma: no-cache
Content-Script-Type: text/javascript
Content-Style-Type: text/css
X-Mod-Pagespeed: 1.13.35.2-0
Vary: Accept-Encoding,User-Agent
Set-Cookie: CGISESSID=23302c50afaebc5c8b9f777556a72297; path=/; HttpOnly;HttpOnly;Secure
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Content-Length: 219221
Content-Type: text/html; charset=UTF-8

Lofesa commented 4 years ago

First, you can't authorize any domain that doesn't run the pagespeed module, so many of the domains you have authorized are wrong. For example, I'm sure Wikimedia is not running the pagespeed module. What happens when you authorize these domains? The pagespeed module rewrites the resource URL (to something like https://upload.wikimedia.org/some-path/some-file.jpg.ic.pagespeed.SOMEHASH.webp), but that rewritten resource doesn't exist, so you get a 404 in return.

Next, you have http://*.celacp.org/ Auth but later you have https://biblioteca-admin.celacp.org/ Auth. This makes me think your own domain is celacp.org and is served over https, yet you have authorized http. Try http*://*.celacp.org instead. This way you authorize both the http and https versions and all subdomains of celacp.org, so you don't need separate directives for the biblioteca and biblioteca-admin subdomains.

You have set SslCertDirectory (assld) /etc/ssl/certs but have not enabled https fetching with ModPagespeedFetchHttps enable. You should review the docs about fetching https directly here: https://www.modpagespeed.com/doc/https_support#https_fetch .

You have set LoadFromFileCacheTtlMs (lfct) 60000000, but you don't use LoadFromFile at all. This directive is an alternative way to load resources from disk instead of fetching them with an http request. If you want to use LoadFromFile, you should read this doc: https://www.modpagespeed.com/doc/https_support#load_from_file and then enable selective file loading: https://www.modpagespeed.com/doc/domains#ModPagespeedLoadFromFile.
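To illustrate the wildcard advice above, here is a minimal Python sketch that uses shell-style matching from the standard library as a stand-in for the domain matcher. It is only an approximation: how mod_pagespeed itself evaluates wildcards is not shown in this thread, and the helper name `is_authorized` is made up for the example.

```python
from fnmatch import fnmatchcase

def is_authorized(origin, patterns):
    """Return True if any wildcard pattern matches the origin.

    Assumption: shell-style '*' matching approximates the wildcard
    behaviour described in the comment; this is not mod_pagespeed code.
    """
    return any(fnmatchcase(origin, pattern) for pattern in patterns)

origins = [
    "http://biblioteca.celacp.org/",
    "https://biblioteca.celacp.org/",
    "https://biblioteca-admin.celacp.org/",
]

# One pattern with a wildcard scheme and a wildcard subdomain.
combined = ["http*://*.celacp.org/"]
# The http-only pattern from the original configuration.
http_only = ["http://*.celacp.org/"]

covered_by_combined = [o for o in origins if is_authorized(o, combined)]
covered_by_http_only = [o for o in origins if is_authorized(o, http_only)]

print(covered_by_combined)
print(covered_by_http_only)
```

Under this approximation the single combined pattern covers all three origins, while the http-only pattern covers just the first one.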

Anyway... the message you get about restrictive cache headers appears because the resources pagespeed would rewrite (and optimize) have a Cache-Control header that is not publicly cacheable. They must have something like Cache-Control: max-age=some value, public. Your resources don't have any Cache-Control header, and the main request has Cache-Control: post-check=0, pre-check=0. I think this header makes the request not publicly cacheable.
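The rule of thumb in the paragraph above can be written down as a small check. This is a rough heuristic sketch in Python, assuming that "publicly cacheable" means an explicit positive max-age and none of no-cache, no-store or private. It is not the logic mod_pagespeed actually uses, and both function names are invented for the example.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip() or None
    return directives

def looks_publicly_cacheable(value):
    """Heuristic only: explicit positive max-age and no restrictive directive."""
    directives = parse_cache_control(value)
    if any(name in directives for name in ("no-cache", "no-store", "private")):
        return False
    max_age = directives.get("max-age")
    return max_age is not None and max_age.isdigit() and int(max_age) > 0

# Values taken from headers quoted in this thread.
for header in ("max-age=0, no-cache", "post-check=0, pre-check=0", "public, max-age=600"):
    print(header, "->", looks_publicly_cacheable(header))
```

Running a check like this against the headers of the individual css, js and image responses (not only the main page) shows which resources would be flagged.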

When you use the debug filter, the debug messages are NOT logged in any log file, neither Apache's nor pagespeed's. The debug messages are in the html code of the page. To read them you need to view the page source.
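Since the debug output lives in the page source, one way to collect it is to pull the HTML comments out of a saved copy of the page. The Python sketch below does that with the standard library parser. The sample markup and the comment texts are hypothetical, and it assumes the debug messages appear as ordinary HTML comments.

```python
from html.parser import HTMLParser

class CommentCollector(HTMLParser):
    """Collects the text of every HTML comment it sees."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        self.comments.append(data.strip())

def extract_comments(html):
    collector = CommentCollector()
    collector.feed(html)
    collector.close()
    return collector.comments

# Hypothetical page fragment; real debug comments will read differently.
sample = """<html><body>
<img src="a.png"><!-- hypothetical debug note about a.png -->
<p>text</p><!--second note-->
</body></html>"""

notes = extract_comments(sample)
for note in notes:
    print(note)
```

Feeding it the saved source of the real page lists every comment in document order, which is easier to scan than the raw HTML.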

ajconejo commented 4 years ago

Hi Lofesa

Thanks for your time with a detailed answer.

I've edited the config file and removed the domains as shown below. As per the docs, FetchHttps is enabled by default. I've manually enabled it though. LoadFromFile is enabled like this:

ModPagespeedLoadFromFile "https://biblioteca.celacp.org/plugin/" "/var/lib/koha/biblioteca/plugins/"
ModPagespeedLoadFromFile "https://biblioteca.celacp.org/opac-tmpl/" "/usr/share/koha/opac/htdocs/opac-tmpl/"
ModPagespeedLoadFromFile "https://biblioteca-admin.celacp.org/intranet-tmpl/" "/usr/share/koha/intranet/htdocs/intranet-tmpl/"
ModPagespeedLoadFromFileRuleMatch disallow .*
ModPagespeedLoadFromFileRuleMatch allow .(ico|pdf|swf|eot|woff|ttf|otf|css|js|jpeg|jpg|png|gif|svg|svgz|mpg|mpeg|mp3|m4a|m4v|mp4|ogg|wmv|mov|mng|3gpp|3gp|webp|webm|flv|avi|asx|asf)$

About the Cache-Control... I have no idea where it comes from. I use mod_expires, and I have also added this header in the Apache main config file:

Header append Cache-Control "public, must-revalidate "

Even with these modifications, I still have a lot of misses and cache errors.

Any other ideas?

This is the new config file:

Version: 14: on

Filters
gp Convert Gif to Png
jp Convert Jpeg to Progressive
jw Convert Jpeg To Webp
pj Convert Png to Jpeg
db Debug
di Delay Images
ec Cache Extend Css
ei Cache Extend Images
es Cache Extend Scripts
hw Flushes html
ji Inline Javascript
io In-place optimize for browser
js Jpeg Subsampling
co Outline Css
rj Recompress Jpeg
rp Recompress Png
rw Recompress Webp
cf Rewrite Css
jm Rewrite External Javascript
jj Rewrite Inline Javascript
cp Strip Image Color Profiles
md Strip Image Meta Data

Options
DefaultSharedMemoryCacheKB (dsmc) 50000
EnableRewriting (e) 1
FetchHttps (fhs) enable
FetchWithGzip (afg) True
FileCacheCleanIntervalMs (afcci) 3600000
FileCacheInodeLimit (afcl) 500000
FileCachePath (afcp) /var/cache/mod_pagespeed/
FileCacheSizeKb (afc) 409600
HonorCsp (hcsp) True
ImageMaxRewritesAtOnce (im) 12
InPlaceResourceOptimization (ipro) True
LoadFromFileCacheTtlMs (lfct) 60000000
LogDir (ald) /var/log/pagespeed
LRUCacheByteLimit (alcb) 16384
LRUCacheKbPerProcess (alcp) 1024
MaxCacheableContentLength (rcl) 16777216
MemcachedServers (ams) localhost:11211
PreserveUrlRelativity (pur) True
RespectVary (rv) True
RewriteLevel (l) Optimize For Bandwidth
ShmMetadataCacheCheckpointIntervalSec (smci) 300
SslCertDirectory (assld) /etc/ssl/certs
StatisticsLogging (asle) True

Domain Lawyer
http://.celacp.org/ Auth
http*://45.79.198.124:8080/ Auth

Invalidation Timestamp: (none)

Lofesa commented 4 years ago

Hi. At first glance:

http*://.celacp.org/ Auth: this doesn't work. You have missed a *. It should be http*://*.celacp.org/

The Google Tag Manager domain is still authorized.

The css and js files are served with Cache-Control: max-age=43200, public, proxy-revalidate, must-revalidate. Try removing proxy-revalidate and must-revalidate.

Most of the images are in wdp format. These images can't be optimized by pagespeed; only png, jpg or gif images can be optimized by pagespeed. Can you change them to jpg format?

You have a carousel of images where the src is a script, like src="/cgi-bin/koha/opac-image.pl?biblionumber=244612". Images loaded by a script are not optimized by pagespeed. Pagespeed only optimizes images that are referenced in the html code and are in the png, jpg or gif formats. Can you replace the script with an image URL?

Most of the png images can't be converted to jpg/webp because they are sensitive to compression noise or have transparent pixels... Can you change them to jpg?

ajconejo commented 4 years ago

Hi, this is driving me nuts.

I've corrected the http://.celacp.org entry. I've also corrected the cache-control, and the full headers are:

GENERAL INFO
Request URL: https://biblioteca.celacp.org/opac-tmpl/bootstrap/lib/bootstrap/js/bootstrap.min_19.1108000.js
Request Method: GET
Status Code: 304 Not Modified
Remote Address: 45.79.198.124:443
Referrer Policy: no-referrer-when-downgrade

RESPONSE HEADERS
Cache-Control: public, max-age=600
Connection: Keep-Alive
Date: Wed, 26 Aug 2020 03:04:20 GMT
Expires: Thu, 26 Aug 2021 03:04:20 GMT
Keep-Alive: timeout=5, max=98
Server: Apache
Vary: User-Agent

REQUEST HEADERS
Accept: */*
Accept-Encoding: gzip, deflate, br
Accept-Language: es-ES,es;q=0.9,en;q=0.8,fr;q=0.7
Connection: keep-alive
Cookie: _ga=GA1.2.33496248.1546890164; __utmz=190918968.1595452432.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); _gid=GA1.2.1996162780.1598227122; KohaOpacLanguage=en; CGISESSID=6afc6b9dcda9bad747188c36b1f4acfb
DNT: 1
Host: biblioteca.celacp.org
If-Modified-Since: Fri, 24 Jul 2020 06:41:48 GMT
Referer: https://biblioteca.celacp.org/cgi-bin/koha/opac-main.pl
Sec-Fetch-Dest: script
Sec-Fetch-Mode: no-cors
Sec-Fetch-Site: same-origin
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36

And I'm still getting almost 100% of resources not rewritten due to the cache-control header restriction.
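One detail in the response headers above is that Cache-Control says max-age=600 while Expires points a full year ahead. Under standard HTTP caching rules, max-age takes precedence over Expires when both are present. The Python sketch below computes the resulting freshness lifetime for these headers. It only illustrates the HTTP rule and makes no claim about what mod_pagespeed does internally; the function name is made up.

```python
import re
from email.utils import parsedate_to_datetime

def freshness_lifetime(headers):
    """Freshness lifetime in seconds: max-age wins, else Expires minus Date."""
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    if match:
        return int(match.group(1))
    if "Expires" in headers and "Date" in headers:
        delta = parsedate_to_datetime(headers["Expires"]) - parsedate_to_datetime(headers["Date"])
        return int(delta.total_seconds())
    return None

# Response headers as quoted in the comment above.
response = {
    "Cache-Control": "public, max-age=600",
    "Date": "Wed, 26 Aug 2020 03:04:20 GMT",
    "Expires": "Thu, 26 Aug 2021 03:04:20 GMT",
}

with_max_age = freshness_lifetime(response)
without_cache_control = freshness_lifetime(
    {k: v for k, v in response.items() if k != "Cache-Control"}
)
print(with_max_age, without_cache_control)
```

So a cache following the HTTP rules treats this script as fresh for ten minutes, not for a year, despite the Expires header.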

I've added googletag again since I got a lot of "errors" on that domain, which complicates reading the log. I will remove it once I get this to work.

Most images are png. I have only 6 wdp.

About the carousel, I cannot change it. Isn't IPRO supposed to handle those types of images? Also, the log shows several lines similar to:
Cache entry is expired: https://biblioteca.celacp.org/cgi-bin/koha/opac-image.pl?biblionumber=244592&thumbnail=1 (fragment=celacp.org)
and later...
HTTPCache key=https://biblioteca.celacp.org/cgi-bin/koha/opac-image.pl?biblionumber=244592&thumbnail=1 fragment=celacp.org: remembering recent failure for 296 seconds.

How come an image that was in the cache fails to re-cache? I get this for almost all of these resources.

I've enabled ForceCache and the extend_cache filter but see no changes.

This is my current config:

Version: 14: on

Filters
ij Canonicalize Javascript library URLs
gp Convert Gif to Png
jp Convert Jpeg to Progressive
jw Convert Jpeg To Webp
pj Convert Png to Jpeg
db Debug
di Delay Images
ec Cache Extend Css
ei Cache Extend Images
es Cache Extend Scripts
hw Flushes html
ji Inline Javascript
io In-place optimize for browser
js Jpeg Subsampling
co Outline Css
rj Recompress Jpeg
rp Recompress Png
rw Recompress Webp
cf Rewrite Css
jm Rewrite External Javascript
jj Rewrite Inline Javascript
cp Strip Image Color Profiles
md Strip Image Meta Data

Options
DefaultSharedMemoryCacheKB (dsmc) 50000
EnableRewriting (e) 1
FetchHttps (fhs) enable
FetchWithGzip (afg) True
FileCacheCleanIntervalMs (afcci) 3600000
FileCacheInodeLimit (afcl) 500000
FileCachePath (afcp) /var/cache/mod_pagespeed/
FileCacheSizeKb (afc) 409600
HonorCsp (hcsp) True
ImageMaxRewritesAtOnce (im) 12
InPlaceResourceOptimization (ipro) True
LoadFromFileCacheTtlMs (lfct) 60000000
LogDir (ald) /var/log/pagespeed
LRUCacheByteLimit (alcb) 16384
LRUCacheKbPerProcess (alcp) 1024
MaxCacheableContentLength (rcl) 16777216
MemcachedServers (ams) localhost:11211
PreserveUrlRelativity (pur) True
RespectVary (rv) True
RewriteLevel (l) Optimize For Bandwidth
ShmMetadataCacheCheckpointIntervalSec (smci) 300
SslCertDirectory (assld) /etc/ssl/certs
StatisticsLogging (asle) True

Domain Lawyer
http://.celacp.org/ Auth
http://45.79.198.124:8080/ Auth
https://.googletagmanager.com/ Auth
https://*.ssl-images-amazon.com/ Auth

Invalidation Timestamp: (none)

Any help is really appreciated.

Lofesa commented 4 years ago

Well... the first step in debugging is putting everything in the right place. The png images are still not fully optimized because they have transparent pixels or are sensitive to compression noise, but they are optimized by IPRO. You can see it from the Etag header: Etag: W/"PSA-aj-nbqpw_qrXQ-br". This Etag header is set by IPRO. You can also see that Content-Length and X-Original-Content-Length mismatch, which is the result of partial optimization. IPRO doesn't rewrite the URL.

About the js and css files, can you try setting: ModPagespeedModifyCachingHeaders on

And find where this header comes from (see the image). There is a no-cache header. [Image: Captura]

ajconejo commented 4 years ago

Hi,

I have restarted the mod_pagespeed config from scratch and found that the header issue and the cache misses come from LoadFromFile... might it be a bug?

I didn't change anything in the cache-control headers on my pages, and when replacing:

ModPagespeedLoadFromFile "https://biblioteca.celacp.org/opac-tmpl/bootstrap/images/" "/usr/share/koha/opac/htdocs/opac-tmpl/bootstrap/images/"
ModPagespeedLoadFromFile "https://biblioteca.celacp.org/opac-tmpl/bootstrap/itemtypeimg/" "/usr/share/koha/opac/htdocs/opac-tmpl/bootstrap/itemtypeimg/"

With

ModPagespeedLoadFromFile "https://biblioteca.celacp.org/opac-tmpl/bootstrap/" "/usr/share/koha/opac/htdocs/opac-tmpl/bootstrap/"

I start getting the cache-control errors and cache misses.
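To see what changes between the two variants, it helps to model LoadFromFile as a plain URL-prefix to directory-prefix mapping. The Python sketch below is a conceptual model only, with a made-up helper name and a first-match rule chosen for simplicity. It shows which URLs each variant would read from disk, and does not explain why the console counters react the way they do.

```python
def map_url_to_file(url, mappings):
    """Return the filesystem path for url, or None if no prefix matches.

    Assumption: the first matching prefix wins; this is a simplification,
    not mod_pagespeed's implementation.
    """
    for url_prefix, dir_prefix in mappings:
        if url.startswith(url_prefix):
            return dir_prefix + url[len(url_prefix):]
    return None

# The two narrow mappings and the single broad mapping from this comment.
narrow = [
    ("https://biblioteca.celacp.org/opac-tmpl/bootstrap/images/",
     "/usr/share/koha/opac/htdocs/opac-tmpl/bootstrap/images/"),
    ("https://biblioteca.celacp.org/opac-tmpl/bootstrap/itemtypeimg/",
     "/usr/share/koha/opac/htdocs/opac-tmpl/bootstrap/itemtypeimg/"),
]
broad = [
    ("https://biblioteca.celacp.org/opac-tmpl/bootstrap/",
     "/usr/share/koha/opac/htdocs/opac-tmpl/bootstrap/"),
]

script_url = "https://biblioteca.celacp.org/opac-tmpl/bootstrap/js/script_19.1108000.js"
print(map_url_to_file(script_url, narrow))
print(map_url_to_file(script_url, broad))
```

With the narrow mappings the script is still fetched over HTTP, while the broad mapping pulls it (and everything else under bootstrap/) into file loading.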

In both cases I have:

ModPagespeedInPlaceResourceOptimization on
ModPagespeedLoadFromFileRuleMatch disallow .*
ModPagespeedLoadFromFileRuleMatch allow .(ico|pdf|swf|eot|woff|ttf|otf|css|js|jpeg|jpg|png|gif|svg|svgz|mpg|mpeg|mp3|m4a|m4v|mp4|ogg|wmv|mov|mng|3gpp|3gp|webp|webm|flv|avi|asx|asf)$
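The disallow/allow pair above can be tried out offline. The Python sketch below assumes that the rules are unanchored regular expressions tested in order, with a later matching rule overriding an earlier one. That ordering is an assumption here, and Python's `re` stands in for whatever regex engine the module uses. Note that, as written in this thread, the leading `.` of the allow pattern is unescaped, so it matches any character. A backslash may have been lost when the comment was rendered, but that is a guess.

```python
import re

# Patterns copied from the comment above, in the same order.
RULES = [
    ("disallow", r".*"),
    ("allow", r".(ico|pdf|swf|eot|woff|ttf|otf|css|js|jpeg|jpg|png|gif|svg|svgz"
              r"|mpg|mpeg|mp3|m4a|m4v|mp4|ogg|wmv|mov|mng|3gpp|3gp|webp|webm|flv"
              r"|avi|asx|asf)$"),
]

def file_load_allowed(path, rules=RULES):
    """Apply the rules in order; the last matching rule decides (assumed)."""
    allowed = True  # assumed default when no rule matches
    for action, pattern in rules:
        if re.search(pattern, path):
            allowed = (action == "allow")
    return allowed

base = "/usr/share/koha/opac/htdocs/opac-tmpl/bootstrap/"
for path in (base + "js/script_19.1108000.js", base + "README", "/tmp/notajs"):
    print(path, "->", file_load_allowed(path))
```

The last path shows the effect of the unescaped dot: a file whose name merely ends in "js" is allowed even though it has no ".js" extension.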

This is my current full config:

Version: 14: on

Filters
ah Add Head
ij Canonicalize Javascript library URLs
cc Combine Css
jc Combine Javascript
gp Convert Gif to Png
jp Convert Jpeg to Progressive
jw Convert Jpeg To Webp
mc Convert Meta Tags
pj Convert Png to Jpeg
ws When converting images to WebP, prefer lossless conversions
db Debug
di Delay Images
ec Cache Extend Css
ei Cache Extend Images
es Cache Extend Scripts
fc Fallback Rewrite Css
if Flatten CSS Imports
hw Flushes html
ci Inline Css
ii Inline Images
il Inline @import to Link
ji Inline Javascript
io In-place optimize for browser
js Jpeg Subsampling
rj Recompress Jpeg
rp Recompress Png
rw Recompress Webp
ri Resize Images
cf Rewrite Css
jm Rewrite External Javascript
jj Rewrite Inline Javascript
cu Rewrite Style Attributes With Url
cp Strip Image Color Profiles
md Strip Image Meta Data

Options
CacheFlushPollIntervalSec (acfpi) 1296000
EnableCachePurge (euci) True
EnableRewriting (e) 1
FetchWithGzip (afg) True
FileCacheCleanIntervalMs (afcci) 36000000
FileCacheInodeLimit (afcl) 500000
FileCachePath (afcp) /var/cache/mod_pagespeed/
FileCacheSizeKb (afc) 409600
HttpCacheCompressionLevel (hccl) 9
InPlaceResourceOptimization (ipro) True
LoadFromFileCacheTtlMs (lfct) 600000
LogDir (ald) /var/log/pagespeed
LRUCacheByteLimit (alcb) 16384
LRUCacheKbPerProcess (alcp) 1024
MaxCacheableContentLength (rcl) 20000000
MaxCombinedCssBytes (xcc) -1
MaxCombinedJsBytes (xcj) -1
MemcachedServers (ams) localhost:11211
PrivateNotVaryForIE (pnvie) True
PurgeMethod (pm) PURGE
RateLimitBackgroundFetches (rlbf) False
RewriteRandomDropPercentage (rrdp) -1
ShmMetadataCacheCheckpointIntervalSec (smci) 300
SslCertDirectory (assld) /etc/ssl/certs
StatisticsLogging (asle) True

Domain Lawyer
http://.celacp.org/ Auth
http://.googletagmanager.com/ Auth
http://.ssl-images-amazon.com/ Auth
http*://45.79.198.124/ Auth

Invalidation Timestamp: (none)

Lofesa commented 4 years ago

But now you have it working... Some js files are not rewritten because they are loaded by another js file. For example, https://biblioteca.celacp.org/opac-tmpl/bootstrap/js/script_19.1108000.js is loaded by https://biblioteca.celacp.org/opac-tmpl/bootstrap/lib/modernizr.min_19.1108000.js.pagespeed.jm.Yq80J6k3xh.js but it still gets some optimization, as shown by its Etag: W/"PSA-aj-6kR3KjB27V". The PSA denotes IPRO's work. The same goes for the css file https://biblioteca.celacp.org/plugin/Koha/Plugin/Com/ByWaterSolutions/CoverFlow/bower_components/jquery-flipster/dist/jquery.flipster.min.css

Some images, like wikipedia-16.png, are in a <picture> srcset element. Pagespeed won't rewrite these until a later version, when PR #1929 gets merged. With images loaded by a .pl script, some optimization is done, as far as I can see from the W/"PSA-aj-bpsLvimpe6-br" Etag and Content-Length: 4951 X-Original-Content-Length: 49569 for https://biblioteca.celacp.org/cgi-bin/koha/opac-image.pl?biblionumber=238112&thumbnail=1. Here the PSA in the Etag header is set by IPRO.
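The header check described above is easy to script. The Python sketch below follows the convention stated in this thread, that an Etag starting with W/"PSA- marks a response touched by IPRO, and computes the size reduction from Content-Length and X-Original-Content-Length. The function name is made up, and the header values are the ones quoted for the opac-image.pl thumbnail.

```python
def ipro_summary(headers):
    """Return (touched_by_ipro, fraction_saved) for one response.

    touched_by_ipro follows the 'PSA' Etag convention described in the thread;
    fraction_saved is None when either length header is missing.
    """
    etag = headers.get("Etag", "")
    touched = etag.startswith('W/"PSA-')
    original = headers.get("X-Original-Content-Length")
    served = headers.get("Content-Length")
    saved = None
    if original and served and int(original) > 0:
        saved = 1 - int(served) / int(original)
    return touched, saved

thumbnail_headers = {
    "Etag": 'W/"PSA-aj-bpsLvimpe6-br"',
    "Content-Length": "4951",
    "X-Original-Content-Length": "49569",
}

touched, saved = ipro_summary(thumbnail_headers)
print(touched, f"{saved:.0%}")
```

For the thumbnail above this reports an IPRO-optimized response that is roughly 90% smaller than the original.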

ajconejo commented 4 years ago

I've tuned the LoadFromFile paths and that seems to have fixed the issue. Thanks to all who helped.