rails / importmap-rails

Use ESM with importmap to manage modern JavaScript in Rails without transpiling or bundling.
MIT License

Security: verify integrity of pinned resources? #122

Closed botandrose closed 1 year ago

botandrose commented 2 years ago

Hey folks, I've been using importmap-rails in some new projects, and I love it! Recently, a colleague brought up a good point regarding the lack of checksum integrity verification. Looking around a bit, it seems the import-maps spec authors are well aware of the issue, but are still working towards consensus on a design:

https://github.com/WICG/import-maps/issues/174
https://github.com/WICG/import-maps/issues/221
https://github.com/guybedford/import-maps-extensions#integrity
https://github.com/tc39/proposal-import-assertions/issues/113

In the meantime, the shim already included in this project contains its own (optional) integrity verification implementation: https://github.com/guybedford/es-module-shims#enforce-integrity

And so, I'm opening this issue to see if there is interest in a PR to leverage this existing implementation. I'm imagining something like:

./bin/importmap pin md5 --integrity

This option would do the work of calculating the checksum, and add it to config/importmap.rb:

pin "md5", to: "https://cdn.jsdelivr.net/npm/md5@2.3.0/md5.js", integrity: "sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8wC"
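For the checksum step, the command could compute a standard SRI value over the fetched file. Here is a minimal sketch in Ruby; the `sri_digest` helper name is hypothetical, not part of importmap-rails:

```ruby
require "digest"
require "base64"

# Hypothetical helper: compute a Subresource Integrity value in the
# "sha384-<base64 digest>" form expected by the integrity= attribute.
def sri_digest(content)
  "sha384-" + Base64.strict_encode64(Digest::SHA384.digest(content))
end

# The pin command would fetch the package source and digest it, e.g.:
js = "console.log('hello')" # stands in for the downloaded file body
puts sri_digest(js)         # => "sha384-..." ready for config/importmap.rb
```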

Which would then add it to the importmap, with two ESMS options enabled:

<script type="esms-options">{ "polyfillEnable": true, "enforceIntegrity": true }</script>
<script type="importmap">...</script>
<link rel="modulepreload" href="https://cdn.jsdelivr.net/npm/md5@2.3.0/md5.js" integrity="sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8wC" />

Note that this would require the polyfillEnable: true option to be turned on so that the shim is used for ALL clients. Otherwise, this integrity checking would be bypassed when using native implementations, e.g. Chrome.

Should I proceed with a PR?

dhh commented 2 years ago

Turning polyfill on for everything would be a performance hit for Chrome etc. But I'll delegate to @guybedford on that.

botandrose commented 2 years ago

@dhh @guybedford yeah, for sure. I'm imagining that the people who are interested in this feature would be willing to trade performance for security. Although, that's assuming that the polyfill's performance hit on Chrome is acceptable.

guybedford commented 2 years ago

On average, enforcing polyfill / shim mode results in about a 1.6x slowdown for a well optimized app (https://github.com/guybedford/es-module-shims/tree/main/bench#native-v-polyfill-performance). If that slowdown is considered worth the security benefits that's definitely a decision that could be made via an opt-in. @botandrose these workflows are relatively new - just let me know if you run into any issues at all with the approach.

junket commented 1 year ago

+1 for this PR. Using importmaps with CDN'd resources has been such a joy compared to bundling JS. Thanks all around to the folks here for making that a reality.

@botandrose did you end up working on a PR? I could take a stab. I have a few background questions, sorry if they are dumb:

Thanks again, all!

botandrose commented 1 year ago

@junket I never did! I started a spike, but stopped when I hit some kind of wall. I don't remember what it was exactly... probably nothing showstopping.

I just decided that I'm personally more than happy to take these risks in exchange for ditching node, webpacker, and yarn dependencies, so it was mostly just a curiosity after my friend mentioned his concerns. I'd love to see this cross the finish line, though!

dhh commented 1 year ago

My recent thinking is that I don't think it makes sense to depend on a CDN dynamically. We should actually flip the default so import map downloads the file instead and you check it in. That's how we run in production at HEY.

botandrose commented 1 year ago

@dhh Interesting. This is something I've also been thinking about lately.

Issues with status quo:

I've been using the default settings (without --download), and I see the following drawbacks:

  1. The app requires an internet connection to run system tests, and even to develop in dev. Browser caches alleviate this somewhat, but not entirely. I've been travelling through countries with shit-quality internet and expensive metered bandwidth, so I've been especially feeling this pain. For example, a test will time out and fail, and I waste 10 minutes figuring out that it was a CDN HTTP connection hanging due to flaky internet.
  2. As discussed in this issue, it's vulnerable to CDN takeover and other types of content poisoning.

Pinning with --download addresses these issues (well, perhaps not 2 entirely, but reduces the impact significantly), but has the following tradeoffs:

  1. I feel "yucky" checking every single JS dep directly into the project repo... imagine checking in node_modules or vendor/bundle... you know what I mean?
  2. We lose the benefits of pulling from a CDN, like minimal geographical latency. This is solvable, of course, but would be extra work to set up.

Solution?

That said, I've been thinking about a third path... some way to alleviate all these issues at once. I've been imagining some kind of server that runs in the Rails process, somewhat similar to Sprockets or Propshaft. It would have the following characteristics:

  1. When pinning a dependency, it would follow the import tree and download the files into an internal cache somewhere in tmp/, and also calculate and save the appropriate resource integrity SHAs, probably somewhere in config/importmap.rb. This is essentially my initial --integrity idea above. So a subsequent git commit would only include the new pin and its integrity SHA, and not the entire tree of JS files. At this point, one would presumably vet the dependency to make sure it's legit and not compromised.
  2. In development and test mode, it would serve the files directly from its cache. If the cache files are missing, it would transparently download them from the source, verifying them with the saved integrity SHA. This scenario would happen when a second dev machine pulls a new commit that includes new pins, for example.
  3. In production, the import map would point directly to the CDN, with the integrity attribute included.

I think this approach could give us the best of both worlds. I would be happy to throw some time at this, if you think it sounds good. Thoughts?
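A rough sketch of the cache-or-verified-download behavior in step 2 might look like this (all names hypothetical; `fetch_from_source` stands in for the actual HTTP request):

```ruby
require "digest"
require "base64"

# Hypothetical sketch: serve a pinned file from a local cache, downloading
# from the source only when the cached copy is missing or fails its
# recorded integrity check.
def serve_pinned(cache_path, expected_integrity, &fetch_from_source)
  if File.exist?(cache_path) && sri(File.binread(cache_path)) == expected_integrity
    return File.binread(cache_path) # cache hit, integrity verified
  end

  content = fetch_from_source.call
  unless sri(content) == expected_integrity
    raise "Integrity mismatch for #{cache_path}" # refuse poisoned content
  end
  File.binwrite(cache_path, content)
  content
end

def sri(bytes)
  "sha384-" + Base64.strict_encode64(Digest::SHA384.digest(bytes))
end
```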

dhh commented 1 year ago

I don't mind an exploration towards integrity going forward, but the solution as you propose, in terms of a default, doesn't pass the proportionality test for me vs just checking in your dependencies. I understand the slight ick about that, but the importmaps approach is explicitly NOT meant to produce node_module-type situations with hundreds or thousands of dependencies. All the import map dependencies are prelinked and precompiled. So for, say, HEY, I think we have like 10 vendored libs? Maybe even less. That scope warrants neither depending on a secondary CDN nor a more complicated build process.

botandrose commented 1 year ago

@dhh I can understand that perspective. Perhaps I am being too princess-and-the-pea about simply checking in some deps. So I gave --download a go on two of my non-trivial current projects to see what the actual reality is, vs my gut feeling.

This first one is 10 years old and evolved through sprockets->webpacker->importmaps over the years. Here's the situation after trying --download:

$ tree vendor/javascript/
vendor/javascript/
├── autocompleter.js
├── buffer.js
├── events.js
├── file-type.js
├── form-serialize.js
├── @hotwired--stimulus.js
├── ieee754.js
├── inherits.js
├── is-blank.js
├── is-empty.js
├── is-present.js
├── is-whitespace.js
├── js-cookie.js
├── @kollegorna--cocoon-vanilla-js.js
├── #lib--internal--streams--from.js.js
├── #lib--internal--streams--stream.js.js
├── node:buffer.js
├── node:stream.js
├── object-traversal.js
├── peek-readable.js
├── process.js
├── @rails--activestorage.js
├── @rails--request.js.js
├── rails-request-json.js
├── @rails--ujs.js
├── readable-stream.js
├── readable-web-to-node-stream.js
├── safe-buffer.js
├── sortablejs.js
├── stimulus-use-actions.js
├── string_decoder.js
├── strtok3--core.js
├── taggle.js
├── token-types.js
├── util-deprecate.js
└── util.js
0 directories, 36 files

$ du -ch vendor/javascript/
356K    vendor/javascript/

This second one is less than a year old, and was greenfield, using importmaps from the beginning:

$ tree vendor/javascript/
vendor/javascript/
├── crypto.js
├── current-device.js
├── fetch-progress.js
├── form-serialize.js
├── hotkeys-js.js
├── @hotwired--stimulus.js
├── @hotwired--turbo.js
├── @hotwired--turbo-rails.js
├── lit--directives--style-map.js.js
├── lit-element--lit-element.js.js
├── lit-html--directives--style-map.js.js
├── lit-html--is-server.js.js
├── lit-html.js
├── lit.js
├── @lit--reactive-element.js
├── navigo.js
├── openseadragon.js
├── @rails--actioncable--src.js
├── @rails--activestorage.js
├── seedrandom.js
├── shuffle-seed.js
├── sortablejs.js
├── spinning-globe.js
├── stimulus-use-actions.js
├── stimulus-use.js
└── three.js
0 directories, 26 files

$ du -ch vendor/javascript/
3.2M    vendor/javascript/

I have several other apps that are closer to HEY, in terms of these metrics. But many are like these, too.

So in my subjective opinion, it does indeed pass the proportionality test. I imagine there are much more pathological cases out there than mine. Maybe webpacker would simply be a better choice for these projects! But I'm going to try to make importmaps work for projects like these, because to be honest, I really don't want to go back! So, thank you for making this Rails + Import Maps marriage not only possible, but so enjoyable.

In any event, I'll continue pushing ahead on this to scratch my own itch. I expect an integrity PR will fall out of it at the very least. If there's no appetite for anything more, I'll go ahead and release a gem that wraps this one.

Thanks for your time and attention!

dhh commented 1 year ago

I think the better metric to look at is churn. Are these volatile dependencies that you frequently swap in and out? Constantly updating versions? If so, yeah, I can see how the git noise would be a bit annoying. If not, I'd just check it in and forget about it 😄. But also fine to tickle the curiosity and see what a different solution might look like!

botandrose commented 1 year ago

That's a really good point. One and done wouldn't be so bad, even if the commit is rather large.

Personally, I try to bump all my deps every six months or so to stay on top of the ecosystem and security updates. I also bump individual packages, like for example when I want to use some newly-released features in stimulus. So for me I would be feeling the "yuck" somewhat often.

But yeah, I'll try to get a prototype going, and we'll see what happens from that. Thanks again!