apache / incubator-pagespeed-ngx

Automatic PageSpeed optimization module for Nginx
http://ngxpagespeed.com/
Apache License 2.0

Varnish with PageSpeed module #1674

Open michael-rubel opened 4 years ago

michael-rubel commented 4 years ago

With the stack Nginx (SSL, 443) -> Varnish (6081) -> Nginx with PageSpeed (8080), how do I serve the rewritten .pagespeed. files? Varnish returns the page with rewritten URLs, but the rewritten files themselves are not found. It looks like they stay on the 8080 virtual host, which is not reachable from the public network. Is there a workaround?

Lofesa commented 4 years ago

Have you configured PageSpeed with a downstream cache? Here is a config for Varnish 3 and 4, and for nginx acting as a proxy server too.

I have never used this config, so my advice may not be accurate. I see one main issue: most of the optimizations depend on the browser's User-Agent (UA), and part of the UA string is in the PageSpeed cache key, so you need to ensure that the end user's UA traverses the whole chain and reaches PageSpeed.

Varnish also needs to store all the optimized variants of a resource. I found a Varnish config file, but I think it is for an nginx (SSL 443) + PageSpeed -> Varnish -> nginx setup.

Hope this helps.
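For reference, a minimal sketch of the PageSpeed side of a downstream cache setup, placed in the backend vhost. The three directives are the documented downstream caching options; the Varnish address is taken from this thread, and the threshold value is only an assumption to tune. The matching Varnish VCL (PURGE handling, cache key) is not shown here.

```nginx
# Backend vhost (the one running PageSpeed behind Varnish).
# Where PageSpeed should send purge requests: the Varnish listener from this thread.
pagespeed DownstreamCachePurgeLocationPrefix http://127.0.0.1:6081;
# HTTP method used for the purge request; Varnish must be configured to accept it.
pagespeed DownstreamCachePurgeMethod PURGE;
# Purge the cached HTML only if less than this share of it was optimized (assumed value).
pagespeed DownstreamCacheRewrittenPercentageThreshold 95;
```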

michael-rubel commented 4 years ago

I tried this config, but it does not help in my case. Everything from the downstream caching docs is already done, but the problem with the rewritten assets remains. Maybe something is configured wrong.

But as you described the flow, maybe the only possible way is to put PageSpeed in front of Varnish (i.e. on the SSL 443 virtual host)? I tried this and it works perfectly, but it costs too much. Varnish without PageSpeed handles 2400 RPS in my project, but with PageSpeed turned on in front of it, only about 110 RPS. That's why I asked about running PageSpeed behind Varnish (on the 8080 host).

Lofesa commented 4 years ago

Maybe #787 can help. In that thread you can find a Varnish config, an update to the downstream cache config that is not in the docs, and a suggestion to use LoadFromFile in the PageSpeed conf file.

EDIT: You can find more info by googling "mod_pagespeed varnish".
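A minimal sketch of the LoadFromFile suggestion, assuming the static assets live on the same machine as the PageSpeed vhost. The host name comes from this thread; the filesystem path is hypothetical and must be replaced with the real document root.

```nginx
# Read origin resources under /static/ straight from disk instead of fetching them over HTTP.
# First argument: URL prefix; second argument: directory it maps to (hypothetical path).
pagespeed LoadFromFile "http://test.onlinemagento.com/static/" "/var/www/magento/pub/static/";
```

This only changes how PageSpeed reads the original files. It does not by itself make the rewritten .pagespeed. URLs reachable through another vhost.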

michael-rubel commented 4 years ago

I already tried all of those things, but I'm still receiving rewritten .pagespeed. assets that are unavailable to the client. LoadFromFile does not help in this case.

[screen]

I tried PreserveURLs, but it doesn't seem to work: [link]

Lofesa commented 4 years ago

Can you get the file without the rewrite? Is the server that hosts the file running PageSpeed? I mean, many people set pagespeed Domain Other.domain.com, but no PageSpeed module is running on that domain.

For example, suppose your domain is mydomain.com and the HTML is served from there. In PageSpeed you have set: pagespeed Domain mydomain.com; (this one runs PageSpeed) pagespeed Domain otherdomain.com; (this one does not)

and in the HTML you include a resource from otherdomain.com. This resource gets rewritten because its domain is authorized, but the rewritten resource doesn't exist, because otherdomain.com is not running PageSpeed.

michael-rubel commented 4 years ago

So, are you saying that the only way to run PageSpeed is to put it on the SSL termination vhost (443), before Varnish? I enabled it only on the non-SSL vhost (8080), thinking that Varnish can cache assets the same way it caches the main page. Or do I misunderstand something? If you mean it could run on both sides, I can't imagine that: the overhead would be enormous and it would be pointless.

Lofesa commented 4 years ago

Can you get the file without the rewrite?

With this I'm asking whether you are able to request https://somedomain.com/media/magiccart/magicslider/p/o/pomyslnaprezent.png directly. That request bypasses all the PageSpeed processing. If it fails, PageSpeed is not the problem.

Is the server that hosts the file running PageSpeed? I mean, many people set pagespeed Domain Other.domain.com, but no PageSpeed module is running on that domain.

For example, suppose your domain is mydomain.com and the HTML is served from there. In PageSpeed you have set: pagespeed Domain mydomain.com; (this one runs PageSpeed) pagespeed Domain otherdomain.com; (this one does not)

and in the HTML you include a resource from otherdomain.com. This resource gets rewritten because its domain is authorized, but the rewritten resource doesn't exist, because otherdomain.com is not running PageSpeed.

None of this is related to the stack you are using. Many people misconfigure PageSpeed and put, for example:

pagespeed Domain mydomain.com; pagespeed Domain youtube.com;

Then a YouTube URL in the HTML, say in index.html, is rewritten to something like https://www.youtube.com/someimage.jpg.pagespeed.ce.HASH.webp, but the YouTube site responds with a 404. So I'm asking whether the image that gets the 404 is on a domain other than the one serving the main HTML and, if so, whether PageSpeed is running on that domain.
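The point above can be sketched as a config fragment. The domain names are the placeholders used in this comment, not real hosts:

```nginx
# Authorize only origins that can serve .pagespeed. URLs themselves,
# i.e. origins where PageSpeed is actually running.
pagespeed Domain mydomain.com;

# Leave third-party origins unauthorized: their resources would be rewritten
# to .pagespeed. URLs on a host that cannot serve them, which ends in a 404.
# pagespeed Domain youtube.com;
```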

michael-rubel commented 4 years ago

The image is on the domain I'm using, but PageSpeed is turned off on the 443 vhost and turned on on the 8080 one.

https://example.com:443 (public SSL, proxies to Varnish)
http://127.0.0.1:6081 (Varnish, forwards to the 8080 backend)
http://example.com:8080 (local backend with PageSpeed)

Lofesa commented 4 years ago

And are you able to request the image url? https://somedomain.com/media/magiccart/magicslider/p/o/pomyslnaprezent.png

michael-rubel commented 4 years ago

Yes. [link]

Lofesa commented 4 years ago

Well, I see this is a test subdomain, so not in production, right? I mean, you can make changes without disrupting users. If I were you, I would divide the problem into smaller ones.

1.- Make the site on port 8080 public, so that you can send requests to it as if you were an end user, or find another way to send requests to it, such as wget or curl. This is the nginx + PageSpeed part, and in principle it should work without problems.
2.- If that step is OK, introduce Varnish: Varnish -> nginx (8080) + PageSpeed, and see what happens.
3.- When the previous step is OK, introduce the SSL termination.

EDIT: When a request has .pagespeed. in the URL, what path does it take through nginx? I mean, which locations can capture this request? For a brief moment I saw a PHP file error when I tried to open a rewritten image.

EDIT 2: What happens if you don't have Varnish in the middle? nginx SSL termination -> nginx 8080 + PageSpeed. Does it work?

michael-rubel commented 4 years ago

It works on [link], but in [link] it requests the 443 domain, not 8080. Is there a workaround to have it under a public domain?

michael-rubel commented 4 years ago

So, on 8080 it works: [link] Directly on Varnish: [link] it says the domain (test.onlinemagento.com) is not authorized, even though pagespeed Domain test.onlinemagento.com; is set. On 443 it's not found: [link]

Lofesa commented 4 years ago

Hmm, when you add pagespeed Domain yourdomain.com, it defaults to port 80, and if you use https (pagespeed Domain https://mydomain.com) it defaults to port 443. Maybe adding the port is needed. I don't know, because I have never used ports other than the defaults, but something like:

pagespeed Domain http://*.domain.com:*;

I have found this: https://github.com/apache/incubator-pagespeed-mod/blob/b4bf44cc56d8bbf17494c540dfb6ef20dfcf5073/html/doc/announce-sec-update-201603.html#L105

so the * is permitted in the port part.

Make the change and try the three options again.

michael-rubel commented 4 years ago

Thank you, now the domain is authorized, but the Varnish side (:6081) is requesting some assets over HTTPS.

UPD: It looks like [link] is working, but on 443 still not found.

Lofesa commented 4 years ago

Thank you, now the domain is authorized, but the Varnish side (:6081) is requesting some assets over HTTPS.

Maybe these assets are bypassing Varnish? I mean, requests that Varnish sends to the backend without any change.

UPD: It looks like http://test.onlinemagento.com:6081/?PageSpeedFilters=+debug is working, but on 443 still not found.

So now it works if you request Varnish directly, right? Then the next step is to debug the communication between the nginx SSL vhost and Varnish. What happens if the nginx SSL vhost becomes non-SSL? For testing purposes.

Hmm, I tried to fetch this: https://test.onlinemagento.com/static/frontend/Sm/ogalo/pl_PL/css/A.print.css.pagespeed.cf.gWpod0Sgg1.css

and it seems that it is Magento that tries to look up the file:

Unable to resolve the source file for 'frontend/Sm/ogalo/pl_PL/css/A.print.css.pagespeed.cf.gWpod0Sgg1.css'
<pre>#1 Magento\Framework\App\StaticResource->launch() called at [vendor/magento/framework/App/Bootstrap.php:257]
#2 Magento\Framework\App\Bootstrap->run() called at [pub/static.php:13]
</pre>

P.S.: I'm new to this. I have never tried this config or used Varnish, so I'm working blind on this issue and may be proposing things that don't make sense. Sorry for that.

michael-rubel commented 4 years ago

I think Magento can't find it simply because Varnish didn't cache these assets to hand them back to the SSL vhost.

Lofesa commented 4 years ago

But when you hit Varnish the asset is there, so Varnish cached it... and when you hit nginx 8080 it manages to resolve the .pagespeed. URL, with no Magento involved in returning the asset. To recap:
1.- The client sends an HTTPS request to the nginx SSL vhost.
2.- The nginx SSL vhost makes an HTTP request to Varnish (is it a redirect or a proxy_pass?).
3.- Varnish makes an HTTP request to nginx 8080.

You have tried:
1.- Hitting nginx 8080, and it works without issues.
2.- Hitting Varnish, and it works without issues (is that right?).
3.- Hitting the nginx SSL vhost, which doesn't work.

So the questions I see here:
1.- Is the nginx SSL vhost making an HTTP request to Varnish? It seems so, because otherwise the non-rewritten resources would return a 404 too.
2.- So why do the .pagespeed. URLs get the 404? Maybe these requests carry https when they go to Varnish, and/or Varnish sends them untouched to nginx 8080?
3.- Why does Magento try to find a file with .pagespeed. in its name? This request should be caught by the PageSpeed module, which should then look it up in the PageSpeed cache.

Have you tried my last suggestion?

What happens if the nginx SSL vhost becomes non-SSL? For testing purposes.

Make the nginx SSL vhost respond on HTTP port 80, without redirecting to 443, send the request on to Varnish, and see what happens.

EDIT: Take a look at varnishlog. This command outputs the Varnish log. You must run it as root (sudo), and it shows what Varnish is doing. With -w it stores the log in a file.
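On question 3, a minimal sketch of the location blocks the ngx_pagespeed documentation recommends so that requests for optimized resources reach the PageSpeed handler. They would go in the vhost that runs PageSpeed (8080 here). Whether a Magento static-file location takes these requests first in this particular setup is an assumption, not something this thread has established.

```nginx
# Ensure requests for PageSpeed-optimized resources go to the PageSpeed handler
# and no extraneous headers get set. The regex matches names such as
# A.print.css.pagespeed.cf.gWpod0Sgg1.css (two-letter filter id, 10-character hash).
location ~ "\.pagespeed\.([a-z]\.)?[a-z]{2}\.[^.]{10}\.[^.]+" {
  add_header "" "";
}
location ~ "^/pagespeed_static/" { }
location ~ "^/ngx_pagespeed_beacon$" { }
```

Regex locations at server level are checked before nginx falls back to a plain prefix location such as /static/, unless that prefix location uses the ^~ modifier.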

michael-rubel commented 4 years ago

The SSL host is using proxy_pass. These are my vhosts:

[config.zip]

michael-rubel commented 4 years ago

The problem is forwarding the assets from Varnish to the SSL vhost. It looks like Varnish returns only the page you requested, but not the static files.

This one works: [link]

But this one does not: [link]

Lofesa commented 4 years ago

Maybe you know this already, but... I have read that Varnish doesn't store responses that carry cookies, and your static assets are served with cookies. I also found a site you may want to look at: https://www.getpagespeed.com/ Their blog has articles about Varnish, such as: https://www.getpagespeed.com/server-setup/varnish-static-files-cache

The problem is forwarding the assets from Varnish to the SSL vhost. It looks like Varnish returns only the page you requested, but not the static files.

So if Varnish hasn't stored the asset, it must forward the request to the backend server, which must then reply. But it seems that when a .pagespeed. request comes from Varnish to the backend, Magento tries to find it in the filesystem before PageSpeed catches it.

michael-rubel commented 4 years ago

But it seems that when a .pagespeed. request comes from Varnish to the backend, Magento tries to find it in the filesystem before PageSpeed catches it.

These assets do work with Magento if I turn on PageSpeed on the front SSL vhost, for example, and they work here, as you can see: [link]

It's still Magento.

michael-rubel commented 4 years ago

Their blog has articles about Varnish, such as: https://www.getpagespeed.com/server-setup/varnish-static-files-cache

This article is about caching static files on the hard drive instead of in RAM, but not about how to get them cached in the first place. In theory, assets should be cached in RAM automatically by default.

Lofesa commented 4 years ago

These assets do work with Magento if I turn on PageSpeed on the front SSL vhost, for example, and they work here, as you can see: http://test.onlinemagento.com:6081/static/frontend/Sm/ogalo/pl_PL/css/A.print.css.pagespeed.cf.gWpod0Sgg1.css

I can't see it, because you have a redirect from HTTP to HTTPS.

These assets do work with Magento if I turn on PageSpeed on the front SSL vhost, for example, and they work here, as you can see: http://test.onlinemagento.com:6081/static/frontend/Sm/ogalo/pl_PL/css/A.print.css.pagespeed.cf.gWpod0Sgg1.css

Yes, but you can see there that the cookies are removed. The static assets were served with cookies; are those added by the nginx SSL vhost?

Have you tried removing SSL? I mean, keep the nginx server in front of Varnish, but without SSL, to rule out SSL itself as the problem.

EDIT: The rationale for trying this (no SSL) is that I think the protocol (http/https) is part of the key that the PageSpeed cache uses.

michael-rubel commented 4 years ago

I can't see it, because you have a redirect from HTTP to HTTPS.

Open it in an incognito window.

EDIT: The rationale for trying this (no SSL) is that I think the protocol (http/https) is part of the key that the PageSpeed cache uses.

I'll try.

Lofesa commented 4 years ago

As I suspected... Have you configured this: pagespeed RespectXForwardedProto on;
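A minimal sketch of how that directive is usually paired with the SSL-terminating proxy, using the ports from this thread. The server names, the omitted certificate settings, and the cache path are assumptions for illustration, not the poster's actual config:

```nginx
# Front vhost: terminates SSL, runs no PageSpeed, proxies to Varnish.
server {
    listen 443 ssl;
    server_name example.com;
    # ssl_certificate and ssl_certificate_key omitted in this sketch.

    location / {
        proxy_pass http://127.0.0.1:6081;
        proxy_set_header Host $host;
        # Tell the backend which scheme the client originally used.
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

# Backend vhost: sits behind Varnish and runs PageSpeed.
server {
    listen 8080;
    server_name example.com;

    pagespeed on;
    pagespeed FileCachePath /var/cache/ngx_pagespeed;  # hypothetical path
    # Use X-Forwarded-Proto to decide whether rewritten URLs are http or https.
    pagespeed RespectXForwardedProto on;
}
```

Varnish has to pass the X-Forwarded-Proto header through unchanged for this to have any effect.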

michael-rubel commented 4 years ago

As I suspected... Have you configured this: pagespeed RespectXForwardedProto on;

I removed that; now the SSL vhost is trying to request the backend directly on port 8080 and over HTTPS to get these assets.

michael-rubel commented 4 years ago

See, this guy tried to do the same thing as me and failed at the same problem, then just enabled it on the front SSL vhost: https://github.com/apache/incubator-pagespeed-ngx/issues/1547

But at the front, using PageSpeed with Varnish is nearly useless due to CPU overhead.

Lofesa commented 4 years ago

My bet is on a protocol issue. Have you tried doing everything over HTTP only?

michael-rubel commented 4 years ago

My bet is on a protocol issue. Have you tried doing everything over HTTP only?

Yeah, I tried this. Got the same result over HTTP.