apache / incubator-pagespeed-mod

Apache module for rewriting web pages to reduce latency and bandwidth.
http://modpagespeed.com
Apache License 2.0
696 stars 158 forks source link

Content Security Policy support. #876

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Hello, lovely PageSpeed folks.

We're in the process of pushing CSP 1.1[1] through to CR. I think it has a few 
features which would be quite usefully injected via PageSpeed. For example, 
hashes and nonces would allow script/style to be inlined for performance while 
maintaining protections against XSS.

More generally, PageSpeed understands the resources on a page, and could fairly 
easily construct a CSP to deliver along with the page to protect against 
DOM-based XSS attacks.

I'd love to chat with you folks about these options. :)

-mike

[1]: 
http://w3c.github.io/webappsec/specs/content-security-policy/csp-specification.d
ev.html

Original issue reported on code.google.com by mkwst@chromium.org on 29 Jan 2014 at 6:43

GoogleCodeExporter commented 9 years ago

Original comment by igrigo...@google.com on 29 Jan 2014 at 6:52

GoogleCodeExporter commented 9 years ago
Assigning to Jan for now.

Original comment by jmara...@google.com on 10 Feb 2014 at 9:02

GoogleCodeExporter commented 9 years ago
Some notes for reference:
* We could have a mode where we blacklist all inline scripts we see in the 
page, and whitelist any scripts we choose to inline.
* We have a way for content owners to black / whitelist particular domains for 
rewriting; this is a starting point for the sites we trust in a CSP, but might 
omit stuff like social buttons that we legitimately want to stay on the third 
party site where they're already hosted.
* The nonce and hash specs for inline JavaScript are very very relevant here: 
http://w3c.github.io/webappsec/specs/content-security-policy/csp-specification.d
ev.html#nonce-usage-for-script-elements

Basically, we can use either of these on the incoming response to whitelist 
scripts, and we can use a nonce on the outgoing response to indicate which 
inline scripts we're blessing (eg because we inlined or combined them).

We'll also want to hook the pingback to the site to catch and report errors, 
too.  See 
http://www.html5rocks.com/en/tutorials/security/content-security-policy/

Original comment by jmaes...@google.com on 10 Feb 2014 at 10:30

GoogleCodeExporter commented 9 years ago
Mike: is the Nonce CSP support built into Chrome yet?

Original comment by jmara...@google.com on 4 Aug 2014 at 2:47

GoogleCodeExporter commented 9 years ago
Yes. I'm fairly sure both nonces and hashes are out with Chrome 36 in stable 
right now.

Original comment by mkwst@chromium.org on 4 Aug 2014 at 5:19

GoogleCodeExporter commented 9 years ago
Ping. :)

Original comment by mkwst@google.com on 20 Sep 2014 at 6:11

GoogleCodeExporter commented 9 years ago
More votes for this bug:
https://plus.sandbox.google.com/114552443805676710515/posts/E9rbCiH3bmW

Original comment by jmara...@google.com on 24 Sep 2014 at 12:24

GoogleCodeExporter commented 9 years ago

Original comment by jmara...@google.com on 24 Sep 2014 at 9:18

GoogleCodeExporter commented 9 years ago
Brain dump on some requirements of adding nonce for inline scripts only:

- Should not be done unless there is an existing security policy, as otherwise 
it will break sites. This is actually annoying since it seems permitted in 
<meta> since we are parsing incrementally. Joy. (Fun cases: inline script 
before a meta that sets a more restrictive script policy; spec seems to imply 
it would be permitted before the meta but not after).
- Needs to understand the relation between default-src and script-src at least.
- If I am reading the spec correctly, may need it for any srcless <script> tags 
we dynamically construct. Also can't use any .on attributes. 

For other features:
- CombineJS won't work w/o unsafe-eval
- Any path whitelists may need to be altered, with a thread of expanding, which 
we really shouldn't do automatically; this includes both stuff like widening 
directories when doing CombineAccrossPaths, to much nastier cases when the 
whitelist is exact URLs --- we would have to generate authorization for 
.pagespeed. URLs, which we can only really predict up to directory, which is 
suboptimal. Actually, I guess we can't predict cross-path combinations at all, 
so they'll have to be forced off.
- Obviously any mapping directives may matter, too.
- If we optimize an inline script with hashes, may need to change the hash, 
except those may have already been sent off. Joy. Subtleties in whitespace 
handling in HTML may also matter for hash computation, but minification may 
make that moot.
- Don't think we can inline images w/o completely permitting data: for them. 
Probably shouldn't do that by default.
(This may also require copy of stuff default-src => image-src)
- base-uri restrictions can affect URI resolution.

Original comment by morlov...@google.com on 2 Oct 2014 at 7:52

GoogleCodeExporter commented 9 years ago
.on: the lazyload script will require some changes, since right now we include 
it as an onload per img.  Arguably we should be doing this globally anyway, 
except that it's hard / impossible to ensure prompt handling if you do that.

Original comment by jmaes...@google.com on 2 Oct 2014 at 10:22

jeffkaufman commented 8 years ago

Could we do combine_js with data urls instead of an inline script? Still need eval though.

morlovich commented 8 years ago

I don't think that should change how CSP applies. If it does, it's a bug in CSP spec.

On Mon, Nov 30, 2015 at 9:24 AM, Jeff Kaufman notifications@github.com wrote:

Could we do combine_js with data urls instead of an inline script? Still need eval though.

— Reply to this email directly or view it on GitHub https://github.com/pagespeed/mod_pagespeed/issues/876#issuecomment-160643980 .

jmarantz commented 8 years ago

We need eval to:

  1. run each script in the same DOM context as the original script tag.
  2. ensure that file2.js runs even if file1.js throws an exception. There may be other reasons for using eval; I don't remember.

OTOH it seems possible to implement a restricted form of combine_js which will pull together adjacent script tags by simply concatenating the files, and accepting the drawback that file2.js won't run if file1.js throws. Maybe as an option? This is possible but we don't have anyone available to work on this, and there are other issues with CSP as mentioned in the comment on "2 Oct 2014 at 7:52"

In the meantime, I think you can't have both combine_js and eval-protection; you have to choose.

moloch-- commented 8 years ago

CSP Level 2's nonce-src may be a potential middle ground, however as far as I know nonce-srcs can only be used to restrict unsafe-inline and there is no way to restrict unsafe-eval to only these blocks. Also of note, Pagespeed will also break Subresource Integrity unless it updates the corresponding hashes.

jeffkaufman commented 8 years ago

Pagespeed will also break Subresource Integrity unless it updates the corresponding hashes.

Resources with SRI should set Cache-Control: no-transform, right? In which case PageSpeed will leave them alone.

jeffkaufman commented 8 years ago

nonce-srcs can only be used to restrict unsafe-inline and there is no way to restrict unsafe-eval to only these blocks

Is this something we should ask to be added to the spec?

morlovich commented 7 years ago

Some design notes: https://github.com/pagespeed/mod_pagespeed/wiki/Design-Doc:-Brainstorming-PageSpeed-Optimization-Products-and-Content-Security-Policy

To highlight, the worst case for us is when CSS is affected by policy in HTML--- that sort of thing is really hard for us since: 1) It prevents in-place rewriting of CSS --- and we may not know when looking at CSS, if the CSP info is in HTML only. That means we can't really have out-of-the-box support for both default-on IPRO and CSP w/o major architectural innovations. 2) For from-HTML use, we need to encode the plicy info into the .pagespeed. URL. This one isn't so bad, but we need to be careful that our approximation of the policy is conservative.

There is a bit of a pragmatic upside, too: the actual CSP use in the wild seems to be considerably less tricky than it's capable off; most policies seem to be host-level, sometimes directory level. (Well, not so much of an upside since a lot of websites use unsafe-eval and the like).

stampycode commented 7 years ago

the actual CSP use in the wild

Is that statistic for mod_pagespeed users? Because I disabled mod_pagespeed as it compromised the security of my CSP. I expect the uptake of this mod by users who want to use CSP properly, will be significantly lower.

It stands to reason that most mod_pagespeed w/CSP users will have the unsafe-eval option enabled, otherwise mod_pagespeed will break their code...

morlovich commented 7 years ago

On Tue, Jan 31, 2017 at 8:22 AM, Tim Stamp notifications@github.com wrote:

the actual CSP use in the wild Is that statistic for mod_pagespeed users? Because I disabled mod_pagespeed as it compromised the security of my CSP. I expect the uptake of this mod by users who want to use CSP properly, will be significantly lower.

It stands to reason that most mod_pagespeed w/CSP users will have the unsafe-eval option enabled, otherwise mod_pagespeed will break their code...

No, it's based on top sites in HTTP archive. It's possible that (actual/potential) mod_pagespeed users actually have tighter policies, though, since the big sites tend to be less agile.

Dreamsorcerer commented 6 years ago

Looks like there is also a 'strict-dynamic' option, which may or may not be useful as an alternative method.

AngelDeaD commented 6 years ago

Any update on this? I have problem with lazy load option in pagespeed. It adds static js code with "onload=" which CSP does not allow, and I can not edit it because CSP adds it, so I can not add nonce or sha to it. It would be good, if authors can add web server variables like: webpagespeed-nonce and webpagespeed-sha

If both of them are empty add js code that is added now, if one of them is empty, append another that is not empty to js code and then add it to page source.

oschaaf commented 6 years ago

@AngelDeaD could you add a concrete example of how the module would rewrite its input to illustrate your goal?

AngelDeaD commented 6 years ago

This is line that I have problem with: <p><img class="posted_image zoomable" src="https://www.XXXX.XXX/XXX.jpg" srcset="https://www.XXXX.XXX/XXX.jpg 828w, https://www.XXXX.XXX/XXX.jpg 625w, https://www.XXXX.XXX/XXX.jpg 414w, https://www.XXXX.XXX/XXX.jpg 1000w" sizes="(max-width: 30em) 414px, (max-width: 65em) 744px, 607px" onclick="PopEx(this,null,null,0,0,50,'pb-img-on shrinkable');" data-pagespeed-url-hash="204305278" onload="pagespeed.CriticalImages.checkImageForCriticality(this);"></p> It's piece of source code of an web page. This part of line: data-pagespeed-url-hash="204305278" onload="pagespeed.CriticalImages.checkImageForCriticality(this);" Is added staticly by pagespeed. (I mean there is no way to edit it or add something to it).

Regarding my suggestion: If ( $webpagespeed-nonce != "" && $webpagespeed-sha == "" ) then <p><img class="posted_image zoomable" src="https://www.XXXX.XXX/XXX.jpg" srcset="https://www.XXXX.XXX/XXX.jpg 828w, https://www.XXXX.XXX/XXX.jpg 625w, https://www.XXXX.XXX/XXX.jpg 414w, https://www.XXXX.XXX/XXX.jpg 1000w" sizes="(max-width: 30em) 414px, (max-width: 65em) 744px, 607px" onclick="PopEx(this,null,null,0,0,50,'pb-img-on shrinkable');" nonce="YYY" data-pagespeed-url-hash="204305278" onload="pagespeed.CriticalImages.checkImageForCriticality(this);"></p>

pagespeed static js code would be: nonce="YYY" data-pagespeed-url-hash="204305278" onload="pagespeed.CriticalImages.checkImageForCriticality(this);"

Of course, if both of webpagespeed-nonce and webpagespeed-sha are empty. First static would be added (as it's right now).

AngelDeaD commented 5 years ago

Ping ?

sgammon commented 5 years ago

hey @oschaaf,

i wanted to just check in on this issue really quick. we are noticing some major challenges with Pagespeed and CSP. in particular:

1) PS does not seem to honor nonces issued via the origin, and embedded in the page 2) PS does not seem to understand CSP2.0/3.0, and consequently embeds violating scripts

since PS is acting on behalf of the origin in a trusted manner, why can't it just inject nonces that match any present nonce value, for scripts it adds to the page?

in any case, this is enough of a challenge for us right now that it's pagespeed or CSP, which is a lame decision to have to make. if i can help with diagnosis or PRs at all, i would be happy to contribute.

jmarantz commented 5 years ago

PRs definitely welcome!

MrPropre commented 4 years ago

Same problem here. I can't put a nonce attribute on the PS script, so the script is blocked because it violates the CSP header.

xcellenceit commented 4 years ago

Then how do we resolve this? What's the best practice?

Dreamsorcerer commented 4 years ago

I've just disabled most of the filters on affected sites. Not much else we can do, until someone has a chance to implement proper support.

jmarantz commented 4 years ago

Did you check out https://www.modpagespeed.com/doc/configuration#honor-csp ?

On Tue, Mar 31, 2020 at 12:41 PM Sam Bull notifications@github.com wrote:

I've just disabled most of the filters on affected sites. Not much else we can do, until someone has a chance to implement proper support.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/apache/incubator-pagespeed-mod/issues/876#issuecomment-606741367, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAO2IPNM2AXLAKWQHNGP2D3RKIMKDANCNFSM4BVN3RSQ .

MrPropre commented 4 years ago

@jmarantz ModPagespeedHonorCsp on directive does not do anything.

Dreamsorcerer commented 4 years ago

@jmarantz As per the discussion above, it breaks the CSP when using nonces/hashes etc.

sgammon commented 4 years ago

@Dreamsorcerer / @jmarantz unfortunately that kind of PR, at least from scratch without pointers, is outside the scope of my skillset. what i mean to say is, i'm happy to help test PRs or otherwise diagnose issues or provide help as much as i can to the relevant teams and contributors.

pauldorman commented 3 years ago

Seven years, folks. Just sayin'

We cannot use the add_instrumentation filter because of this. I think Pagespeed is amazing, and I'm really surprised that the project is short of contributors, especially for something like this.

I'd have thought that CSP would be a great opportunity for Pagespeed to show off its capabilities. It could be leading the way by allowing CSP's to be automatically inserted into content, not crippled by it!

Firegarden commented 3 years ago

I am applying a nonce to my inline javascript but that is being lost when PageSpeed rewrites the scripts. Can someone code a change to just preserve the existing nonce when it does the rewrite? Similar to how PS add the data-pagespeed-no-defer attribute.

A simple NGINX CSP implementation uses the nginx sub_filter module to replace a constant string with the current nginx request id - this works well as a nonce until pagespeed breaks it by refactoring scripts without keeping the nonce

How to CSP in NGINX Details

  1. Add the header add_header Content-Security-Policy "default-src 'self'; style-src 'nonce-$request_id'; script-src 'nonce-$request_id'

  2. Replace CSP_NONCE constant using Nginx sub module

sub_filter_once off; sub_filter CSP_NONCE $request_id;

  1. Reference constant inside the html files

<script nonce="CSP_NONCE"

Problem

Pagespeed does not respect the nonce that is being set on the scripts that it is rewriting.

<script data-pagespeed-no-defer>(function(){

We can not fully use Pagespeed today now due to it's lack of support for even maintaining a CSP. This problem is not going away.

Proposed Solution

Can someone code the changes to ensure Pagespeed will start to respect and reuse any existing nonce="" attributes being set?

e.g. just carry over the nonce from the source scripts

<script nonce="CSP_NONCE" data-pagespeed-no-defer>(function(){

Any takers?

rwasef1830 commented 3 years ago

There's lot of problems I discovered with pagespeed which made me pretty much abandon it entirely except for the webp conversion part and even that I am considering just doing it in the application layer directly. It seems this project was given to the Apache foundation to be its "grave".

  1. the image resizing feature doesn't consider aspect ratio (the resize to rendered dimensions filter).
  2. the css inlining doesn't work at all if you use any advanced css features (variables, keyframes), the css parser is outdated.
  3. it doesn't do anything with csp except ignore some filters when you set honor csp on. it can't even read the current nonce from http header and apply it on inline scripts it writes. and to make things worse, the inline script it writes contains an ever changing embedded part (server generated nonce), so you can't even try to work around this by using unsafe-hashes and hashing the content of the inline script yourself.

About the only useful feature it has right now maybe to recompress images to be more efficient and resize images to tag specified width and length and conversion to webp. That's it in my opinion.

I tried to modify / fork it but the build system is complex, and the projects are not isolated, you are forced to compile an apache and an nginx and both modules and even chromium just to build it, and the scripts are already broken on newer distros. I gave up on trying, and just implemented some of those features in my own apps by hand.

Firegarden commented 3 years ago

Thank you very much for sharing these points. I am going to abandon using pagespeed module given your points are valid.

I was blindly trusting this source code https://github.com/apache/incubator-pagespeed-ngx/archive/v1.14.33.1-RC1.zip

Given the webp support is still lacking aspect ratio, and to be honest I have had some poor quality image renders

I am curious to talk shop with you about a leading edge web application stack if you want to email me at Firegarden I will follow up.

Thank you