apache / incubator-pagespeed-mod

Apache module for rewriting web pages to reduce latency and bandwidth.
http://modpagespeed.com
Apache License 2.0

Customize which filters get applied in which contexts to which resources #689

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
There are times when we want a filter enabled in general, but not applied in a 
specific situation.

Of course we can exclude resources via Disallow, but this applies only to 
resource rewriting, and we don't consult the Disallow list in every case 
where it might be needed, such as trim_urls.  Further, we might want to avoid 
applying filter A to resource X, while still applying filter A to other 
resources and still applying filter B to X.

Finally, we might wish to customize the contexts in which a resource is 
optimized.  E.g. we might wish to suppress trim_urls for <form> actions.
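
To make the request concrete, here is a minimal Python sketch of the decision being asked for. It is not mod_pagespeed code or configuration: the `FilterRule` type, the `filter_allowed` function, and the "form.action" style context names are hypothetical, invented only to show a rule keyed on filter, URL wildcard, and context.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class FilterRule:
    """Hypothetical rule: block one filter for matching URLs in one context."""
    filter_name: str               # e.g. "trim_urls"
    url_pattern: str               # shell-style wildcard, e.g. "*/x.js"
    context: Optional[str] = None  # e.g. "form.action"; None means any context


def filter_allowed(rules, filter_name, url, context):
    """Return False if any rule blocks this filter for this URL and context."""
    for rule in rules:
        if rule.filter_name != filter_name:
            continue  # rule is about a different filter
        if rule.context is not None and rule.context != context:
            continue  # rule is restricted to a different context
        if fnmatchcase(url, rule.url_pattern):
            return False
    return True


rules = [
    # Suppress trim_urls for every <form action=...>, whatever the URL.
    FilterRule("trim_urls", "*", context="form.action"),
    # Keep (placeholder) filter_a away from resource x.js only.
    FilterRule("filter_a", "*/x.js"),
]
```

With these rules, filter_a is skipped for x.js but still runs on other scripts, filter_b still runs on x.js, and trim_urls is skipped only inside form actions.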

Original issue reported on code.google.com by jmara...@google.com on 8 May 2013 at 12:52

GoogleCodeExporter commented 9 years ago

Original comment by jmara...@google.com on 8 May 2013 at 12:52

GoogleCodeExporter commented 9 years ago
Issue 694 has been merged into this issue.

Original comment by jmara...@google.com on 13 May 2013 at 2:26

GoogleCodeExporter commented 9 years ago
From Issue 694:

Many sites use cookie-based authentication in a dynamic script to generate 
in-page content for logged-in users.

For example,

<img src="/dynamicscript.php?foo=bar" />

Where dynamicscript.php checks for an authentication cookie before returning 
the image in question. Enabling MapRewriteDomain breaks this, as replacing the 
above with something like

<img src="http://mycdn.com/dynamicscript.php?foo=bar" />

will not necessarily result in the same behavior, since the browser typically 
does not send the site's authentication cookie to a different domain such as 
mycdn.com. If I'm not mistaken, there is currently no way to prevent this 
without explicitly enumerating all possible configurations except for 
/dynamicscript.php as separate MapRewriteDomain rules.
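
As a rough illustration of the behavior being requested (not how MapRewriteDomain is implemented), the Python sketch below maps same-host URLs to a CDN host unless the path matches an exclusion wildcard. The `map_rewrite_domain` function, its `exclude_paths` parameter, and the www.example.com origin are assumptions made for the example.

```python
from fnmatch import fnmatchcase
from urllib.parse import urljoin, urlsplit, urlunsplit


def map_rewrite_domain(src, page_url, cdn_host, exclude_paths=()):
    """Point a same-host URL at cdn_host unless its path is excluded.

    Hypothetical helper: returns src untouched when it lives on another
    host or when its path matches one of the exclude_paths wildcards.
    """
    parts = urlsplit(urljoin(page_url, src))  # resolve relative URLs
    if parts.netloc != urlsplit(page_url).netloc:
        return src
    if any(fnmatchcase(parts.path, pattern) for pattern in exclude_paths):
        return src
    return urlunsplit(parts._replace(netloc=cdn_host))


PAGE = "http://www.example.com/index.html"  # assumed origin page
EXCLUDE = ["/dynamicscript.php"]            # cookie-dependent script

static_url = map_rewrite_domain("/images/logo.png", PAGE, "mycdn.com", EXCLUDE)
dynamic_url = map_rewrite_domain(
    "/dynamicscript.php?foo=bar", PAGE, "mycdn.com", EXCLUDE)
```

Here the static image is pointed at mycdn.com, while the cookie-checked script keeps its original URL and so still receives the authentication cookie.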

I understand it is not desirable to switch everything over to re2 because 
there's a lot more cognitive overhead involved versus simple wildcards, but 
perhaps there is another solution (maybe MapRewriteDomainRe2)?

Thanks.

Original comment by jmara...@google.com on 13 May 2013 at 2:26

GoogleCodeExporter commented 9 years ago

Original comment by jmara...@google.com on 13 May 2013 at 2:27

GoogleCodeExporter commented 9 years ago
I have brought up a similar issue around the scope of lazyload_images and other 
resource filters. I think one of the issues is that all the scoping is done 
at the HTML page level and not at the resource level.

In the case of lazyload_images for example we can see how a site may find this 
useful for some images (for example thumbnail enumeration) but not _all_ images 
(when it would often be a nuisance).

Having a VHost Directory/Location scope for setting lazyload_images, or perhaps 
a way to supply regex filters when enabling lazyloading, would be great.
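
A minimal Python sketch of the regex idea, assuming an opt-in list of patterns: only images whose URL matches one of them would be lazy-loaded. The `should_lazyload` function and the thumbnail patterns are hypothetical and are not part of mod_pagespeed.

```python
import re

# Hypothetical opt-in patterns: only thumbnail-like images are lazy-loaded.
LAZYLOAD_PATTERNS = [
    re.compile(r"/thumbs/"),
    re.compile(r"_thumb\.(?:jpe?g|png)$"),
]


def should_lazyload(img_src, patterns=LAZYLOAD_PATTERNS):
    """Return True if the image URL matches any opt-in regex."""
    return any(pattern.search(img_src) for pattern in patterns)
```

Under this scheme a gallery thumbnail such as /thumbs/1.jpg would be deferred, while a large image such as /img/hero.jpg would load normally.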

I also believe that resource filter scoping would be beneficial for CSS, 
JS and other resources.

I have not looked at how mod_pagespeed hooks into Apache/nginx to see how easy 
it is to implement this at the VHost location level. However, I would imagine 
a global setting using a regex when enabling the filter would be reasonably 
easy to implement.

One other note on lazyload_images: there is a theoretical issue in the current 
implementation with search engine crawlers, which do not actually get to fetch 
the images from the img tags. This is obviously not great. I have started a 
conversation about this that offers an alternative mechanism, which you may 
find useful:
https://groups.google.com/forum/?hl=en#!searchin/mod-pagespeed-discuss/lazyload$20luci/mod-pagespeed-discuss/S_8U_cD5X44/-kxUkGBDypEJ

Original comment by l...@aura.travel on 23 Jul 2013 at 12:11

GoogleCodeExporter commented 9 years ago
Issue 1047 has been merged into this issue.

Original comment by sligocki@google.com on 5 Feb 2015 at 4:35

GoogleCodeExporter commented 9 years ago
Copying comment from Issue 1047:

It would be useful to allow/disallow wildcard patterns on a per-filter basis.

Example from mod-pagespeed-discuss:

I am using mod_pagespeed on Apache 2.4.7, everything is working fine but I 
don't know how to configure it to do what I want.

I have one domain, www.mydomain.com, that I am sharding across 
static1.mydomain.com, static2.mydomain.com and static3.mydomain.com, and on 
which I have rewrite_images enabled.

I have another domain from a CDN provider, www.example-cdn.com, that I use 
on my site at www.mydomain.com. For this CDN domain I am using 
ModPagespeedDisallow to keep pagespeed from rewriting the images coming from 
that CDN.

All of this is working fine. The problem is: how can I shard 
www.example-cdn.com in my HTML if I don't want to rewrite images coming from 
the CDN, but I do want rewrite_images on my own domain?

The problem I find is that if I use ModPagespeedDisallow www.example-cdn.com 
I can't shard access to it, and if I remove this Disallow, all images coming 
from the CDN will be rewritten whenever pagespeed can improve them.

See a similar issue for defer_javascript in Issue 481.

Original comment by jmara...@google.com on 5 Feb 2015 at 4:38