jonathanKingston / ember-cli-sri

Generation of SRI hashes for ember
MIT License
48 stars 15 forks source link

Script tags not receiving integrity attributes #20

Closed mwpastore closed 8 years ago

mwpastore commented 8 years ago

Hi folks. Sorry for the question-like issue report but not sure where else to ask. I'm just getting started with SRI. My link tags are receiving an integrity attribute after installing and enabling ember-cli-sri but my script tags are not. I thought maybe it was due to the paranoiaCheck but it looks like it is disabled by default the current version of broccoli-sri-hash. Any suggestions?

jonathanKingston commented 8 years ago

Hey!

mwpastore commented 8 years ago

Hi Jonathan, thank you for the quick response!

Thank you in advance!

jonathanKingston commented 8 years ago

Very curious certainly. The code checks a few things:

So yeah other than those, perhaps there is a bug in how the paranoiaCheck happens but I dunno why that would happen.

Is there anything different about the location of the script files or paths on disk?

mwpastore commented 8 years ago

Oh, right, that <base tag. :stuck_out_tongue: I think that gets set because of the baseURL setting in config/environment.js; removing it removes the <base tag, but causes my site to not load, so that's not good!

I don't think so. I'm using the standard index.html and asset layout without any weird customization or anything. I'm not smart enough to be doing anything fancy! The fact that it can find and correctly handle the .css files seems to confirm that, no? I'm using a fairly up-to-date ember-cli (1.13.14) and I always do an ember init and walk through the changes.

I am going to work for a bit and refer back to this post to see if I can figure out which of these steps is failing. I also might try creating a clean Ember project to see if I can reproduce this issue in the simplest case. If not then I know I must have some weirdness somewhere.

jonathanKingston commented 8 years ago

Manually editing your node_modules/ember-cli-eslint/node_modules/broccoli-sri-hash/index.js to have console.logs in at build time is a simple way to debug when the tests are failing.

Yeah I'm just happy to believe there are enough moving parts to sometimes fail these checks so happy to help debug this through, you know it could even be the regex or whatever.

Sure thank you for walking through the issue, without a simpler reproduction it's pretty difficult to help however happy to help as much as possible.

mwpastore commented 8 years ago

Sorry for spamming this issue report. I can be found on the EmberJS Community Slack if you'd rather chat.

I created a brand-spanking-new project with ember-cli 1.13.14 and installed ember-cli-sri (2.0.0). With the default settings, everything works fine, but with SRI.crossorigin = 'anonymous' and fingerprint.prepend set to an absolute URL, I observe this (repeatable) behavior:

$ rm -rf dist
$ ember build --environment production
version: 1.13.14
Built project successfully. Stored in "dist/".
$ grep integrity dist/index.html 
    <link rel="stylesheet" href="http://ember-cli-sri-troubleshoot-2.example.com/assets/vendor-d41d8cd98f00b204e9800998ecf8427e.css" integrity="sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU= sha512-z4PhNX7vuL3xVChQ1m2AB9Yg5AULVxXcg/SpIdNs6c5H0NE8XYXysP+DGNKHfuwvY7kxvUdBeoGlODJ6+SfaPg==" crossorigin="anonymous"  >
    <link rel="stylesheet" href="http://ember-cli-sri-troubleshoot-2.example.com/assets/ember-cli-sri-troubleshoot-2-d41d8cd98f00b204e9800998ecf8427e.css" integrity="sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU= sha512-z4PhNX7vuL3xVChQ1m2AB9Yg5AULVxXcg/SpIdNs6c5H0NE8XYXysP+DGNKHfuwvY7kxvUdBeoGlODJ6+SfaPg==" crossorigin="anonymous"  >
    <script src="http://ember-cli-sri-troubleshoot-2.example.com/assets/ember-cli-sri-troubleshoot-2-20900e6612456ee9874f8e346b1dbe88.js" integrity="sha256-tbueETztFH9+e5vRfLanFvjOMwW+vnoeQatBsUiOboE= sha512-xSZ3Ryzlqpuoo6+uB0yMj5ILUqV72ouGlY4TKq/G5M5MSdTKaJ67nibjgjLH47/sM7rgGl3p0byE/kc4vvDLsg==" crossorigin="anonymous"  ></script>

Wait, where'd vendor.js go?

$ grep vendor..*js dist/index.html 
    <script src="http://ember-cli-sri-troubleshoot-2.example.com/assets/vendor-c38f19f6a489002c6b2517845513e2f1.js"></script>

It's not clear that this is the same exact issue I'm having in my project, but vendor.js should be getting tagged and is not!

UPDATE: no change after upgrading to ember-cli 1.13.15

UPDATE: the md5sums match for the other three files, but not for vendor.js. moreover, the md5sum computed within brocolli-sri-hash appears to be wrong.

UPDATE: So it looks like broccoli-asset-rev does the md5sum on a Buffer, but broccoli-sri-hash does the md5sum on Buffer#toString(), and this appears to be the source of the discrepancy. Buffer#toString() assumes an encoding of utf8 by default; telling Hash#update that the data is encoded in utf8 (instead of treating it as binary data) appears to be one potential solution. Another approach would be to use Buffers instead of Strings to store the asset contents while working with them in brocolli-sri-hash. I hope to submit a PR shortly!

jonathanKingston commented 8 years ago

:+1: looking good, will bump the broccoli version and close this after publishing and confirmation it solves your issue with the published version.

jonathanKingston commented 8 years ago

@mwpastore published, thanks a lot for working through this! If you could let me know if that solves your issues that would be great. I had not actually uses the extension much since the unicode was enabled so thanks for checking that through.

mwpastore commented 8 years ago

Short answer: Yes, it solves the problem!

Longer answer: Yes, it solves the artificially-created problem, but it only fixes one of the two .js files in my real-world project. Either broccoli-asset-rev needs a tweak as well, or I found a bug in Node. I'll let you know.

mwpastore commented 8 years ago

Well, this has been a day. I found the problem and you're not going to like it. :sweat_smile:

The long and the short of it is that we cannot use the md5sum embedded in the filename as any sort of check or control. We must recompute the md5sum when generating the integrity. The reason for this is quite simple, but of course it took a long walk for me to come to this conclusion.

ember-cli generates the fingerprints of the assets before ember-cli-sri generates the integrity. Obviously. But then it goes through each asset and rewrites any links to other assets to include 1. the prepend (if present) and 2. the fingerprint. Therefore, any asset that has links to other assets will always fail the md5sum comparison, because the file contents change in-between the time it's originally fingerprinted and the time it's reanalyzed for subresource integrity.

We can't modify ember-cli to wait to take the fingerprints of the assets until after they've been rewritten, because it needs the fingerprints to know how to rewrite the links (Catch 22). We could possibly modify ember-cli to 1. take the fingerprints of everything but the top-level css and js files (which will never include each other), 2. rewrite only the top-level files, and only then 3. fingerprint them, but that seems like it would be... somewhat difficult.

At the risk of belaboring the point, and only because I've been working on this for about eight hours now, here's an example of how things can break down:

Stage 1:

Stage 2:

Stage 3:

We could disable fingerprinting for images, which would solve this specific problem, but if there's a prepend it will fail anyway. We could re-rewrite the files to strip out the prepend and any embedded fingerprints before taking the md5sum, and then compare that to the fingerprint, but that seems error-prone and fraught with peril. (It looks like whatever is doing that "Stage 2" rewrite subtly modifies the file in other ways, so it might not work regardless.)

At the end of the day, I think the simplest solution is to rip out the md5Matches logic and just take the md5sum of the assets as-is. At the very least, maybe we introduce a new option to enable or disable checkFingerprint. Whatever you decide I'm happy to start chipping away at a PR... just let me know.

Thank you!

jonathanKingston commented 8 years ago

Sorry been a little busy, yeah the md5 checking still is needed to make sure the file has one, however it isn't required to be the same as the file really (I mean ember could solve this with a build tree but this wouldn't be a quick fix).

I can remove the check or you can?

Thank you so much for working this through!

mwpastore commented 8 years ago

No worries! Yes, I'm happy to do the legwork. Do you think it makes sense to leave the functionality in broccoli-sri-hash, but offer a way to disable it, and perhaps disable it by default in ember-cli-sri? Or should we just rip it out altogether?

mwpastore commented 8 years ago

Hi Jonathan, I have a PR branch ready to go for ember-cli-sri, I just need to know the version of broccoli-sri-hash that includes my two fixes (jonathanKingston/broccoli-sri-hash#17 and jonathanKingston/broccoli-sri-hash#19) and of course any guidance or feedback from you on the implementation details. Thanks in advance for your time!

jonathanKingston commented 8 years ago

2.1.0 should be the release it is in :D thank you for your time also! The implementation of the broccoli changes looked great so no comments there.

In the PR you have you mention the check is there for extra security for a little extra security which is not the case, the check was there to prevent the incorrect files from being hashed which obviously would cause issues when it went live.

mwpastore commented 8 years ago

You're quite welcome! I reworded that statement; I'll open up a PR here so you can take a look at it more easily.