whatwg / html

HTML Standard
https://html.spec.whatwg.org/multipage/

Include details of SVG-as-Image and Canvas origin-clean #10641

Open schenney-chromium opened 4 days ago

schenney-chromium commented 4 days ago

The canvas cross-origin tainting behavior of SVG-as-image content varies across browsers, and is not specified in any way. The incompatibility arises from the treatment of SVG foreignObject elements in SVG content used as a source for HTML or SVG objects.

The canvas spec currently does not specifically address SVG content as an image source; it just says an HTMLOrSVGImageElement is not origin-clean when the "image's current request's image data is CORS-cross-origin".
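As a minimal sketch of how the origin-clean flag is observed from script: once a canvas is not origin-clean, readback calls such as `getImageData()` throw a `SecurityError`. The helper name `isTainted` is hypothetical, and the two stub contexts are assumptions standing in for a real `CanvasRenderingContext2D` so the snippet runs outside a browser.

```javascript
// Hypothetical helper: reports whether pixel readback from a 2D context is
// blocked, which is how a not-origin-clean canvas shows up to script.
function isTainted(ctx) {
  try {
    ctx.getImageData(0, 0, 1, 1); // throws SecurityError when not origin-clean
    return false;
  } catch (err) {
    if (err && err.name === "SecurityError") return true;
    throw err; // anything else is a genuine failure, not tainting
  }
}

// Stub contexts (assumptions for illustration only, not real canvases).
const cleanCtx = { getImageData: () => ({ data: new Uint8ClampedArray(4) }) };
const taintedCtx = {
  getImageData: () => {
    const e = new Error("The canvas has been tainted by cross-origin data.");
    e.name = "SecurityError";
    throw e;
  },
};
```

In a page, one would call `ctx.drawImage(img, 0, 0)` with the SVG image first and then pass the real context to `isTainted`.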

Going over to the images portion of the spec we have the statement about CORS cross-origin being important to canvas, and we also have "The src attribute must be present, and must contain a valid non-empty URL ... referencing a non-interactive, optionally animated, image resource that is neither paged nor scripted." and a note that explicitly calls out SVG and restricts HTML.

Finally we have the SVG spec saying

image references:
    An SVG embedded within an image element must be processed in *secure animated mode* if the
    embedding document supports declarative animation or in *secure static mode* otherwise.

    The same processing modes are expected to be used for other cases where SVG is used in place of a raster image,
    such as an HTML ‘img’ element or in any CSS property that takes an <image> data type. This is consistent with
    HTML's requirement that image sources must reference "a non-interactive, optionally animated, image resource
    that is neither paged nor scripted"

Browser behavior is in conformance with all this spec text thanks to limitations on scripting and external links in SVG foreignObject sub-trees and when SVG is used as an image source.

I would like to put up a PR to explicitly discuss the SVG-as-image case in the canvas image sources section on setting the cross-origin status of image content. Any objections?

Secondly, do we want to specify the origin-clean behavior as implemented by browsers for SVG with foreignObject content? That is, the dependency on how the image src is provided.

Finally, do we want to specify a common behavior for the blob case, or leave that up to browsers?

@domenic

Related issue for VideoSource: #10489

Kaiido commented 4 days ago
  • All browsers consider SVG-as-image to be not origin-clean when the image source is a regular url pointed to SVG with a foreignObject tag, such as src="svg-with-foreign-object.svg".

This doesn't seem to be true. As I already pointed out in the related chromium issue, Firefox does not taint same-origin SVG images at all. I share the test here too: https://cyclic-concise-litter.glitch.me/ (source)

Currently, Chromium and WebKit consider an SVG src given as a blob in Javascript, [...] to be not origin-clean if it contains a foreignObject tag or has not origin-clean images.

How can it have not origin-clean images? From the specs that you mentioned after, and on which all UAs agree, in secure static mode no external request can be made from the image. i.e. all its resources (images, filters, fonts, stylesheets, etc.) have to be included in the file itself (be it as data: URL).

I would like to put up a PR to explicitly discuss the SVG-as-image case in the canvas image sources section on setting the cross-origin status of image content. Any objections?

No objection on my side to add more details in the specs, but I wonder why Chrome and WebKit had this restriction to begin with. IIUC the original reason was that this could expose privacy content like :visited, user's themes, etc. Some of these can already be exposed even without a foreignObject, so it's not clear if the specs should call it out explicitly or just be evasive and add a note like "in case the UA thinks drawing an image could expose sensitive information, it is allowed to mark it as not origin-clean". Similarly, it's unclear if SVG will always be the only image format that can expose such privacy/security related risks. Currently the section about marking image sources as tainting the canvas only handles the source objects, so in this case <img> or <image> elements. It doesn't really consider what these point to. So I wonder if this shouldn't be added to the "updating the image data" side rather than in the "canvas image sources section".

Secondly, do we want to specify the origin-clean behavior as implemented by browsers for SVG with foreignObject content? That is, the dependency on how the image src is provided.

I believe Firefox's behavior is the most sensible and corresponds to the current specs: if the resource is served as same-origin, it's origin-clean. blob: is, I believe, always same-origin, and data: URLs, while opaque, are treated as origin-clean in this context (same as for any other image format). This dependency on how the image src is provided seems like browser bugs to me. It's been a long time, but from my recollection WebKit actually used to taint even data: URLs that contained <foreignObject>. Not sure when it was relaxed, but I have trouble seeing what would make the data: case safer than the blob: case or even than the same-origin http: one.


cc @whatwg/canvas

schenney-chromium commented 4 days ago
  • All browsers consider SVG-as-image to be not origin-clean when the image source is a regular url pointed to SVG with a foreignObject tag, such as src="svg-with-foreign-object.svg".

This doesn't seem to be true. As I already pointed out in the related chromium issue, Firefox does not taint same-origin SVG images at all. I share the test here too: https://cyclic-concise-litter.glitch.me/ (source)

Mmm. I swear I ran your test case a couple of weeks back and Firefox tainted, but maybe I was testing a different SVG file with different properties or the file was local and hence not same origin.

Currently, Chromium and WebKit consider an SVG src given as a blob in Javascript, [...] to be not origin-clean if it contains a foreignObject tag or has not origin-clean images.

How can it have not origin-clean images? From the specs that you mentioned after, and on which all UAs agree, in secure static mode no external request can be made from the image. i.e. all its resources (images, filters, fonts, stylesheets, etc.) have to be included in the file itself (be it as data: URL).

Yes, there should not be cross-origin content in the blob, but the Chromium code, which is almost certainly still the same as the WebKit code, checks the images for anything other than data URIs.

I would like to put up a PR to explicitly discuss the SVG-as-image case in the canvas image sources section on setting the cross-origin status of image content. Any objections?

No objection on my side to add more details in the specs, but I wonder why Chrome and WebKit had this restriction to begin with. IIUC the original reason was that this could expose privacy content like :visited, user's themes, etc. Some of these can already be exposed even without a foreignObject, so it's not clear if the specs should call it out explicitly or just be evasive and add a note like "in case the UA thinks drawing an image could expose sensitive information, it is allowed to mark it as not origin-clean". Similarly, it's unclear if SVG will always be the only image format that can expose such privacy/security related risks. Currently the section about marking image sources as tainting the canvas only handles the source objects, so in this case <img> or <image> elements. It doesn't really consider what these point to. So I wonder if this shouldn't be added to the "updating the image data" side rather than in the "canvas image sources section".

I considered adding to the "updating the image data" section, but given that it is very generic as to source, I thought the canvas location was more obvious. Though a note referencing the SVG spec about secure static context would probably be a good idea.

I did test Chromium behavior around themes and links etc in foreign object and the only leakage of any privacy concern is OS level color contrast settings and the like, plus OS level scrollbar settings.

Secondly, do we want to specify the origin-clean behavior as implemented by browsers for SVG with foreignObject content? That is, the dependency on how the image src is provided.

I believe Firefox's behavior is the most sensible and corresponds to the current specs: if the resource is served as same-origin, it's origin-clean. blob: is, I believe, always same-origin, and data: URLs, while opaque, are treated as origin-clean in this context (same as for any other image format). This dependency on how the image src is provided seems like browser bugs to me. It's been a long time, but from my recollection WebKit actually used to taint even data: URLs that contained <foreignObject>. Not sure when it was relaxed, but I have trouble seeing what would make the data: case safer than the blob: case or even than the same-origin http: one.

You are right about WebKit once tainting even data URIs, but that was changed a few years ago to match Chromium. I'll have to check how Chromium tracks the origin of the SVG content itself when used in an image before I have a strong opinion on what security and privacy looks like.

cc @whatwg/canvas

annevk commented 3 days ago

It seems like a security bug if <foreignObject> with HTML content doesn't end up tainting. Painting HTML and getting pixel data is an API we have expressly not provided due to security concerns (e.g., revealing theming as you mention) so it would be bad if SVG could be used to nonetheless do it.

Without <foreignObject> I think purely same-origin SVG should be fine to not taint.

Now blob: URLs are not always same-origin. Imagine a cross-origin nested document where you mint a blob: URL and then transmit it to its parent. I would expect that to result in a tainted canvas when drawn.
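To make the scenario concrete, here is a small sketch of comparing the origin a blob: URL serializes with against the origin of the document that wants to draw it. The helper names, the example hosts, and the UUID are all hypothetical. The sketch only inspects the URL string through the standard `URL` API; it says nothing about whether a fetch of that URL would succeed or whether any browser taints in this case.

```javascript
// Hypothetical helper: the origin a blob: URL serializes with, i.e. the
// origin of the URL embedded after the "blob:" prefix.
function blobUrlOrigin(blobUrl) {
  const url = new URL(blobUrl);
  if (url.protocol !== "blob:") throw new TypeError("not a blob: URL");
  return url.origin;
}

// Hypothetical helper: true when the blob: URL was minted by a document
// whose origin matches documentOrigin.
function mintedBySameOrigin(blobUrl, documentOrigin) {
  return blobUrlOrigin(blobUrl) === documentOrigin;
}

// Made-up URL, as if minted inside a cross-origin iframe and posted to its parent.
const fromChild = "blob:https://child.example/9115d58c-bcda-4f47-86e5-083e9a215304";
const parentSees = mintedBySameOrigin(fromChild, "https://parent.example");
```

Here `parentSees` is the case described above: the parent holds a blob: URL whose embedded origin is not its own.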

schenney-chromium commented 3 days ago

The OS theme information that is revealed is largely revealed in other ways already. Note I am not talking about browser theme information. High-contrast color settings or anything else that changes system colors already impacts things like the background color of a canvas, which can be read back. Scrollbar settings impact layout dimensions that can be queried. It's a fingerprinting privacy risk that has already been accepted by the community.

You raise an interesting point about cross-origin iframes (etc.) creating and transmitting blob URLs. I will look into that, because it would also apply to blob bitmap image sources.

annevk commented 3 days ago

I don't think we have accepted the risk of reading back arbitrary drawn HTML as a community. Otherwise we'd have had APIs such as drawImage(nodeTree) or some kind of screen capture thing as has often been proposed.

Kaiido commented 3 days ago

Otherwise we'd have had APIs such as drawImage(nodeTree)

The added security layer here is that the HTML is "sandboxed" in the SVG document, itself sandboxed in the <img> as "secure static mode" (so no external resource, no origin, no history, no scripts, etc.). I'm not contesting that it'd be good to have a discussion on this, but it has been there for years in every browser, at least from data: URLs, and some libraries rely entirely or partially on this feature. Removing it now might have consequences for a number of websites using these.
Also, it might be good to note that the revival of the chromium issue is caused by the placeElement() proposal which will have to answer at least the same set of questions.
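For context, a minimal sketch of the kind of self-contained image source under discussion: an XHTML fragment wrapped in `<foreignObject>` and encoded as a data: URL, with no external resources. The helper names and the sample markup are hypothetical; this is not taken from any particular library.

```javascript
// Hypothetical helper: wraps an XHTML fragment in an SVG <foreignObject>.
// Everything the image needs must be inline, since secure static mode
// allows no external requests.
function svgWithForeignObject(xhtml, width, height) {
  return (
    `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">` +
    `<foreignObject width="100%" height="100%">` +
    `<div xmlns="http://www.w3.org/1999/xhtml">${xhtml}</div>` +
    `</foreignObject></svg>`
  );
}

// Hypothetical helper: percent-encodes the markup into a data: URL.
const DATA_PREFIX = "data:image/svg+xml;charset=utf-8,";
function toSvgDataUrl(svg) {
  return DATA_PREFIX + encodeURIComponent(svg);
}

const exampleUrl = toSvgDataUrl(svgWithForeignObject("<p>Hello</p>", 200, 100));
```

In a page, `exampleUrl` would be assigned to `img.src`, and the image drawn with `drawImage()` once it has loaded.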

Imagine a cross-origin nested document where you mint a blob: URL and then transmit it to its parent. I would expect that to result in a tainted canvas when drawn.

You'll know better than me, but in that case wouldn't the <img> itself fail to fetch the image anyway and thus for the cross-origin document the blob: URL is no more than a string that starts with "blob:"? (It seems to be what happens in all browsers). Also this brings to mind another close scenario: Should an ImageBitmap sent from such a cross-origin document become "origin-dirty"? It currently doesn't, but the "owner" has to postMessage it and it could have sent the original Blob anyway, so I'm not sure how bad it is but I'm no security expert. Maybe the difference is that a blob: URL could be "guessed" somehow?

annevk commented 2 days ago

From that first library you point to:

Safari is not supported, as it uses a stricter security model on <foreignObject> tag.

I'd love to know where other browsers decided that it was a good idea to allow readback of form control pixels, etc. That sounds very broken to me and does not match the security conversations we've had over the years. cc @whatwg/security


There might indeed be a same-origin restriction on fetching blob: URLs, though this is not well-tested nor specified (I think). In which case there is no issue there. When you message an object there's no need for tainting. Objects (typically) don't have an associated authority as we somewhat try to follow an object-capability model for message passing.

schenney-chromium commented 2 days ago

Safari does not taint the canvas when a data URL is given as an image source, regardless of its contents. So even there a path exists to get forms, scrollbars, etc. The change was made several years ago.

Firefox has supported the blob URL case "forever" even with blog posts explaining how to use it.

Chrome supports the data URL case and I was planning to enable the blob case but that now depends on resolving some issues, plus getting approval to ship.

Removing the data URL behavior would break the web at this point. The reports flew in when Chrome accidentally broke it once.

annevk commented 2 days ago

I don't think data: URL support matters. In fact, I would expect that to work, it works for PNG and other formats too. The issue I see is with <foreignObject>.

schenney-chromium commented 2 days ago

I don't think data: URL support matters. In fact, I would expect that to work, it works for PNG and other formats too. The issue I see is with <foreignObject>.

Sorry for not being clear. When the "WouldTaintCanvas" checks look at an <img> tag and see a data URL (in Chromium, ProtocolIsData()), no checks are made at all on the content. So an SVG with a foreign object encoded in a data URL will not taint in that case. The WebKit bug with the patch that matched the behavior is https://bugs.webkit.org/show_bug.cgi?id=180301

There's a WPT for it: html/canvas/element/manual/drawing-images-to-the-canvas/drawimage_svg_image_with_foreign_object_does_not_taint.html
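The behaviors reported so far in this thread can be summarized as a rough sketch. This is a model of the claims made above, not the actual Chromium, WebKit, or Firefox code, and it has not been verified against any browser release. The function name and parameter names are hypothetical, and the blob: check for nested non-data images mentioned earlier is left out.

```javascript
// Rough model of the tainting behaviors *as reported in this thread*.
// All names are hypothetical; no engine code is reproduced here.
function reportedTaint(engine, { scheme, hasForeignObject, corsCrossOrigin = false }) {
  // CORS-cross-origin image data is not origin-clean per the canvas spec text.
  if (corsCrossOrigin) return true;
  // Reported for Firefox: same-origin SVG images do not taint at all.
  if (engine === "firefox") return false;
  // Reported for Chromium and WebKit: a data: URL skips all content checks.
  if (scheme === "data") return false;
  // Reported for Chromium and WebKit: other sources taint with <foreignObject>.
  return hasForeignObject;
}

const dataCase = reportedTaint("chromium", { scheme: "data", hasForeignObject: true });
const blobCase = reportedTaint("chromium", { scheme: "blob", hasForeignObject: true });
const firefoxBlobCase = reportedTaint("firefox", { scheme: "blob", hasForeignObject: true });
```

The difference between `dataCase` and `blobCase` for the same SVG content is the dependency on how the image src is provided that the issue asks about.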