privacycg / proposals

New proposals in the Privacy Community Group
https://privacycg.github.io

JS Isolation via Origin Labels and Membranes #3

Open pes10k opened 4 years ago

pes10k commented 4 years ago

This proposal has moved into its own repository. Please file issues there.

https://docs.google.com/document/d/1GFWONU2lq9ukQoj6dIGudOO4P3op7a1xt75Gb_jAA1c/edit#heading=h.8blcqbqrr76o

The above is an early-stage Brave proposal for how to constrain scripts from the document, using a programmatic, runtime approach.

The idea has similarities to the COWL design that didn’t quite gain traction in WebAppSec a few years back, but with several significant differences:

  1. By design, intended to not require rewriting existing code
  2. API designed to allow both (i) the page to protect itself from 3p scripts, and (ii) the browser (and extensions) to protect browser state from 1p and 3p scripts (e.g. privacy protections)
  3. Designed to allow filter-list-style curating and sharing of isolation policies
  4. Designed to use APIs and interfaces already in the browser (i.e. it looks much more Web-like than previous suggestions)

In general, the idea is part of trying to tackle a larger category of problems we don’t have a standards-based approach to:

  1. How to protect the 1p context (especially storage)
  2. How to allow the browser or page to treat different scripts w/ different levels of trust / privilege

And especially to do so in a way that doesn’t break existing sites / legacy code that’ll never get rewritten.

hober commented 4 years ago

This is really interesting, thanks!

What is the relationship between this and tc39/proposal-realms?

pes10k commented 4 years ago

They're related, and you could use a lot of the same machinery, but realms would require existing code to be updated, to change some of its programming model.

Realms are about "isolated worlds", or ways of really knowing some code can't modify the external environment; this proposal is (effectively) a way for privileged code to impose access controls over a single environment.

jakearchibald commented 4 years ago

Interesting! How do you decide which origin accessed the object?

// First party script /index.js
function runCallback(callback) {
  return callback();
}

function getLocation() {
  return window.location.href;
}

function getProperty(target, key) {
  return target[key];
}

// Third party script evil.com/index.js
// Who accesses the location in these cases?
runCallback(() => window.location.href);
getLocation();
getProperty(window, 'location');

Also, is there a security issue here? Currently you can't access the source of a third party script (although it's at risk due to Spectre/Meltdown).

// Third party script
function whatever() {
  const str = `Your script can't see this`;
  return str.slice(0, 1);
}

Will the access of str.slice(0, 1) give you a callback that lets you see str? You're accessing String.prototype.slice so it counts as something global.

pes10k commented 4 years ago

@jakearchibald thanks for the questions! We're really eager for this kind of feedback, so it's great to think through these kinds of questions, and see where the proposal could be tightened / corrected (or if we're going in some goofy direction).

As written, the spec would say that it's the 3p script, the 1p script, and the 1p script doing those accesses, respectively. So, effectively, if the 1p script had a footgun like getProperty(), the 1p could pretty quickly shoot itself in the foot.

But in each case, here are the places where the membrane proposal would give you hooks (to prevent this kind of creep):

runCallback case

  1. get: hook in a proxy targeting evil.com would fire with window, undefined and evil.com
  2. get: hook in a proxy targeting evil.com would fire with window, location and evil.com
  3. get: hook in a proxy targeting evil.com would fire with window.location, href and evil.com

getLocation case

  1. get: hook in a proxy targeting first-party would fire with window, undefined and 1p origin (or some 1p indicating value)
  2. get: hook in a proxy targeting first-party would fire with window, location and 1p origin
  3. get: hook in a proxy targeting first-party would fire with window.location, href and 1p origin

getProperty case

  1. get: hook in a proxy targeting evil.com would fire with window, undefined and evil.com
  2. get: hook in a proxy targeting first-party would fire with window, location and 1p origin

(note this is in the maximal, I want to mediate access to everything case)
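
To make the attribution above concrete, here is a minimal simulation in plain JavaScript. It is only a sketch: registerMembraneProxy is mocked, fakeWindow stands in for window, and the labeledGet helper takes the accessing origin as an explicit argument, whereas in the proposal the browser would derive that label from the script doing the access. All of these names and the "hide href" policy are assumptions for illustration.

```javascript
// Mock registry: origin label -> list of handlers. Hypothetical stand-in
// for the proposal's window.registerMembraneProxy.
const registry = new Map();
function registerMembraneProxy(origins, handler) {
  for (const origin of origins) {
    if (!registry.has(origin)) registry.set(origin, []);
    registry.get(origin).push(handler);
  }
}

// Simulates a labeled property read. In the proposal the engine would
// supply `origin`; here the caller passes it explicitly.
function labeledGet(target, prop, origin) {
  let value = Reflect.get(target, prop);
  for (const handler of registry.get(origin) ?? []) {
    value = handler.get(target, prop, { origin });
  }
  return value;
}

const fakeWindow = { location: { href: "https://site.example/" } };
const hookLog = [];

registerMembraneProxy(["evil.com"], {
  get(target, prop, scriptInfo) {
    hookLog.push(`${scriptInfo.origin} read ${String(prop)}`);
    // Policy for this sketch only: hide `href` from the labeled script.
    return prop === "href" ? undefined : Reflect.get(target, prop);
  },
});

// runCallback case: the arrow function was authored by evil.com, so the
// reads inside it carry the evil.com label and hit the hook.
const fromThirdParty = labeledGet(
  labeledGet(fakeWindow, "location", "evil.com"), "href", "evil.com");

// getLocation case: the function body was authored by the first party,
// so no evil.com hook fires.
const fromFirstParty = labeledGet(
  labeledGet(fakeWindow, "location", "first-party"), "href", "first-party");

console.log(fromThirdParty, fromFirstParty, hookLog);
```

The getProperty case is the footgun: the read of target[key] happens inside first-party code, so in this model it would be labeled first-party even though evil.com chose the arguments.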

jakearchibald commented 4 years ago

Ohh, so you wouldn't see the accesses of runCallback, getLocation and getProperty? Is this assuming they're in the global module scope, rather than window.runCallback etc?

I guess if you wanted to do something about the other cases, you could provide a set of origins representing everything on the stack. Although that would be broken by anything async.
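
The stack idea, and why asynchronous code breaks it, can be sketched as follows. This is a hypothetical model rather than anything in the proposal: runAs, originsOnStack and the deferred queue are made-up names, and the queue is a stand-in for a timer or promise callback.

```javascript
// Hypothetical bookkeeping: the origins of every script frame currently
// executing, outermost first.
const originStack = [];

function runAs(origin, fn) {
  originStack.push(origin);
  try {
    return fn();
  } finally {
    originStack.pop();
  }
}

const originsOnStack = () => new Set(originStack);

// Stand-in for setTimeout / promise callbacks: work queued now, run later.
const deferred = [];

let syncView;
let asyncView;

runAs("first-party", () => {
  runAs("evil.com", () => {
    // Synchronous access: both origins are visible on the stack.
    syncView = originsOnStack();
    // Deferred access: runs only after both frames have returned.
    deferred.push(() => { asyncView = originsOnStack(); });
  });
});

// Later, the "event loop" drains the queue with an empty stack, so the
// deferred access can no longer be attributed to evil.com.
deferred.forEach((task) => task());

console.log([...syncView], [...asyncView]);
```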

pes10k commented 4 years ago

Re the whatever() case (the last thing): JS built-in globals are a really interesting question (grateful for the feedback already!)

We wouldn't want to target them (or might want some "yes, and the JS builtins too" opt-in). But if someone were to try to leverage this to get around things, e.g.:

const origSlice = String.prototype.slice;
let stolenDocRef;
String.prototype.slice = function (...args) {
  stolenDocRef = window.document;
  return origSlice.apply(this, args);
};

That'd trigger 3p-tainted hooks for (window, undefined), (window, document), etc.

jakearchibald commented 4 years ago

Ah, ignore my comment about exposing the string. Since you can overwrite prototype methods you can already get access to str. Sorry for the noise.

pes10k commented 4 years ago

Ohh, so you wouldn't see the accesses of runCallback, getLocation and getProperty? Is this assuming they're in the global module scope, rather than window.runCallback etc?

Yep, though (simplifying the straw proposal) it might be enough to say "3p can access window, but once it starts trying to grab references to anything document / Web API related, then hooks kick in." I.e., first parties should not use this as a replacement for protecting 1p JS-defined state / structures; there are already ways of doing that (modules, closures, etc.).

So if the first party defines window.getCookies = () => document.cookie, that's not a problem the proposal is trying to solve. There are already ways of solving that problem. The proposal is just for mediating access to document, etc., which doesn't have a good solution currently.
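
As a small illustration of the "modules, closures" point, here is a sketch of the difference between exposing first-party state on the global object and keeping it in a closure. The names globalScope, sessionToken, getToken and session are invented for this example, and globalScope is a plain object standing in for window so the snippet runs outside a browser.

```javascript
// Stand-in for the page's global object.
const globalScope = {};

// Footgun: first-party state and a helper exposed on the global object.
globalScope.sessionToken = "abc123";
globalScope.getToken = () => globalScope.sessionToken;

// Safer: keep the state in a closure and export only the capability that
// other scripts actually need.
const session = (() => {
  const token = "abc123"; // never reachable from outside this closure
  return Object.freeze({
    isLoggedIn: () => token.length > 0,
  });
})();
globalScope.session = session;

// What a third-party script sharing the global object could do:
const leaked = globalScope.getToken();
const exposedKeys = Object.keys(globalScope.session);
console.log(leaked, exposedKeys);
```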

jakearchibald commented 4 years ago

Is there a way to define which property accesses this would protect, or would it just be a manually maintained list?

pes10k commented 4 years ago

It’d need to be manually maintained, but maybe indirectly would be good enough. Something like “anything defined in WebIDL is wrapped” would cover what’s intended

On Jan 23, 2020, at 11:52, Jake Archibald notifications@github.com wrote:

 Is there a way to define which property accesses this would protect, or would it just be a manually maintained list?


pes10k commented 4 years ago

I've updated the spec with the following changes / clarifications:

pes10k commented 4 years ago

added further explainer text in FAQ section detailing how this proposal is fundamentally different from SES and similar "isolated worlds" approaches

jackfrankland commented 4 years ago

I really like this proposal. I have some questions if I may:

  1. It seems like there's a scenario where two membrane proxies can be registered for the same origin. Would one have a preference over the other, or do they "layer" up in a particular order?
  2. I believe a motivating use case should be to allow third parties full access to a particular DOM element for the purpose of embedding content, without giving access to the rest of the DOM and limiting access to the global scope. I'm having a hard time seeing how to effectively achieve this using this proposal. Let me know if you'd like me to be more specific, in case you think this proposal should cover this use case.
  3. The document does mention performance, but if we're talking about hundreds or thousands of DOM nodes within a single-page application - by having JS around the native HTMLElement functions, it feels like there may be a significant performance consequence.

Cheers.

jumde commented 4 years ago

@privacycg/chairs : This proposal is getting sufficient feedback from the community. We'd like to move this to a dedicated repository within @privacycg to get more feedback before we start working on the implementation. Explainer here: https://github.com/brave-experiments/js-membranes/

Let us (@snyderp and I) know if you have any questions or concerns.

hober commented 4 years ago

Adding to the agenda for our next call per offline conversation with @snyderp.

pes10k commented 4 years ago

Hi @jackfrankland

Apologies for letting this drop, I missed a ping somewhere. Thanks for your questions / thoughts!

two membrane proxies can be registered for the same origin

I think you'd need multiple proxies being able to interact / stack here, in case (say) an extension and the site both wanted to restrict scripts.
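
One way to picture stacking is to nest ordinary Proxy objects, so that each party's get trap runs in turn and a property is readable only if every layer allows it. This is a sketch under assumptions: the proposal does not settle an ordering, and layer, sitePolicy, extensionPolicy and the doc object are hypothetical names used only here.

```javascript
// Wrap `target` so that `allow(prop)` must return true for a read to succeed.
function layer(target, name, allow, trace) {
  return new Proxy(target, {
    get(obj, prop) {
      trace.push(`${name}:${String(prop)}`);
      return allow(prop) ? Reflect.get(obj, prop) : undefined;
    },
  });
}

const trace = [];
const doc = { cookie: "id=1", title: "Home", referrer: "https://a.example/" };

// The site hides cookies; the extension hides the referrer.
const sitePolicy = (prop) => prop !== "cookie";
const extensionPolicy = (prop) => prop !== "referrer";

// Inner layer: the site. Outer layer: the extension (an arbitrary choice).
const guarded = layer(
  layer(doc, "site", sitePolicy, trace),
  "extension", extensionPolicy, trace);

const title = guarded.title;       // allowed by both layers
const cookie = guarded.cookie;     // blocked by the inner (site) layer
const referrer = guarded.referrer; // blocked by the outer (extension) layer

console.log(title, cookie, referrer, trace);
```

Note that the outer layer short-circuits: when the extension denies a read, the site's trap never runs, which is one reason the ordering question matters.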

allow third parties full access to a particular DOM element for the purpose of embedding content, without giving access to the rest of the DOM

There are a lot of ways you might enforce this. But here's a silly toy example:

    // Page
    <div>
        <!-- secret stuff -->
        <section>
            <input type="text" name="ccn">
        </section>
        <!-- the script should be able to see this stuff -->
        <section id="only-for-the-pw-lib">
            <div id="password-strength-feedback-response"></div>
            <input type="password" name="password">
        </section>
    </div>
    <script src="https://example.org/js/pw-strength.js"></script>

    // Membrane
    const htmlElmProto = window.HTMLElement.prototype;
    let trustedElementRoot;
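    const isParentOf = (parentElm, possibleChildElm) => {
        // Walk up the tree from possibleChildElm and return true if we
        // reach parentElm (i.e. possibleChildElm is in the tree below
        // parentElm), and otherwise false.
        let node = possibleChildElm.parentNode;
        while (node) {
            if (node === parentElm) {
                return true;
            }
            node = node.parentNode;
        }
        return false;
    };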

    window.registerMembraneProxy(["example.org"], {
        get: (target, prop, scriptInfo) => {
            if (trustedElementRoot === undefined) {
                trustedElementRoot = document.getElementById("only-for-the-pw-lib");
            }

            // We only want the script to touch html elements…
            if (htmlElmProto.isPrototypeOf(target) === false) {
                return null;
            }

            // and only HTML elements in the part of the document we want it to
            // access.
            if (isParentOf(trustedElementRoot, target) === false) {
                return null;
            }

            // Otherwise do the thing…
            return Reflect.get(target, prop);
        }
    })

The above example is a toy, and isn't meant to be bulletproof, just concise enough to demonstrate one possible way you'd use the machinery. Tooling would make all this much nicer, but I think the capabilities in the spec are what's needed to build on.

performance consequence

That's def possible. Proxies are (really!) surprisingly fast, but I'm sure the overhead wouldn't be zero. Good tooling to help folks build tight policies would help, but I take your point that this is def not a zero-cost proposal.
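
For anyone who wants a rough feel for the overhead, here is a small micro-benchmark sketch comparing direct property reads with reads through a forwarding Proxy. It measures plain objects only, not DOM nodes or the proposed membrane hooks, and the iteration count and names are arbitrary choices; timings will vary by engine and machine.

```javascript
// Rough micro-benchmark: direct property reads vs. reads through a Proxy
// whose `get` trap simply forwards to the target.
const ITERATIONS = 1000000;
const plain = { value: 1 };
const proxied = new Proxy(plain, {
  get: (target, prop) => Reflect.get(target, prop),
});

function timeReads(obj) {
  const start = Date.now();
  let sum = 0;
  for (let i = 0; i < ITERATIONS; i += 1) {
    sum += obj.value;
  }
  return { sum, ms: Date.now() - start };
}

const direct = timeReads(plain);
const viaProxy = timeReads(proxied);
console.log(`direct: ${direct.ms} ms, proxy: ${viaProxy.ms} ms`);
```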

hober commented 4 years ago

@jumde wrote:

@privacycg/chairs : This proposal is getting sufficient feedback from the community. We'd like to move this to a dedicated repository within @privacycg to get more feedback before we start working on the implementation. Explainer here: https://github.com/brave-experiments/js-membranes/

Let us (@snyderp and I) know if you have any questions or concerns.

We've spun up a repo for you: https://github.com/privacycg/js-membranes

Could you and @pes10k double check you have read-write access to it? I think you may need to accept your invitation to join the Privacy CG GitHub org first, @jumde. Pete already has. Go here to do so: https://github.com/privacycg