whatwg / html

HTML Standard
https://html.spec.whatwg.org/multipage/

Make the HTML parser put audio/video/source/iframe/canvas-in-svg into the HTML namespace #919

Open tabatkins opened 8 years ago

tabatkins commented 8 years ago

The SVG2 spec defines that the HTML audio/video/iframe/canvas elements should work in SVG. That's a little bit inconvenient in SVG's XML serialization, as you have to include another namespace, but it's actually impossible currently in embedded SVG, as you don't use namespaces manually at all.
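To illustrate the XML case, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It parses a standalone SVG document in which one `video` element carries an explicitly declared HTML-namespace prefix and another does not. The `h` prefix and the `movie.webm` file name are made up for the example.

```python
import xml.etree.ElementTree as ET

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Standalone SVG (XML serialization). The "h" prefix and the file name are
# illustrative assumptions, not taken from any spec example.
DOC = f"""<svg xmlns="{SVG_NS}" xmlns:h="{HTML_NS}">
  <h:video src="movie.webm"/>
  <video src="movie.webm"/>
</svg>"""

root = ET.fromstring(DOC)

# ElementTree reports qualified names as "{namespace}localname".
tags = [child.tag for child in root]
for tag in tags:
    print(tag)
# {http://www.w3.org/1999/xhtml}video
# {http://www.w3.org/2000/svg}video
```

Only the prefixed element ends up in the HTML namespace; the unprefixed one inherits the SVG default namespace. Declaring that extra namespace is the step that has no equivalent for SVG embedded in an HTML document.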

The parser, when it switches into parsing SVG, should have a list of elements that are put into the HTML namespace automatically. At minimum, this should be audio/video/iframe/canvas, + picture. Child elements should also automatically be in the HTML namespace (unless you hit an <svg> subtree, obviously).
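The proposed rule can be sketched as a walk down one branch of the tree. This is a simplified Python model, not the HTML Standard's tree-construction algorithm: it ignores existing integration points such as `foreignObject`, MathML, and tags that break out of foreign content. The function name `assign_namespaces` and the constant `HTML_IN_SVG` are hypothetical.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Minimum list named in the proposal (hypothetical constant name).
HTML_IN_SVG = frozenset({"audio", "video", "iframe", "canvas", "picture"})


def assign_namespaces(path):
    """Assign a namespace to each tag on one root-to-leaf path.

    `path` lists tag names from outermost to innermost, starting in
    ordinary HTML content. Returns a list of (tag, namespace) pairs.
    """
    result = []
    current = HTML_NS  # namespace applied to the element being visited
    for tag in path:
        if tag == "svg":
            # Entering an <svg> subtree switches to the SVG namespace.
            current = SVG_NS
        elif current == SVG_NS and tag in HTML_IN_SVG:
            # A listed element inside SVG switches back to HTML,
            # and its descendants stay in HTML.
            current = HTML_NS
        result.append((tag, current))
    return result


for tag, ns in assign_namespaces(["div", "svg", "g", "video", "source"]):
    print(tag, ns)
```

Under this model, `video` and its `source` child land in the HTML namespace, while `svg` and `g` stay in the SVG namespace. A nested `svg` inside `video` would switch back to SVG.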

annevk commented 8 years ago

@tabatkins will you try to make a patch or is this for us to figure out?

Also, is there a public discussion somewhere where the SVG community says they want to change the HTML parser to this effect?

tabatkins commented 8 years ago

I'm willing to write a patch if you want.

> Also, is there a public discussion somewhere where the SVG community says they want to change the HTML parser to this effect?

I'm not sure what you mean. The SVGWG is only marginally concerned with embedded SVG - the parser for that is HTML's responsibility. (That said, there has been discussion about defining a way to do standalone SVG by just jumping straight into HTML's foreign-content parser. But that's a different issue.)

annevk commented 8 years ago

I'm curious whether this is your idea, or whether this is something that multiple implementers and developers want.

tabatkins commented 8 years ago

The first part (making some HTML elements work in SVG) is an SVGWG-approved thing, with implementor approval.

Altering embedded-SVG parsing hasn't been explicitly discussed much, but it's an obvious outgrowth, given that it's impossible to do html-namespace elements in embedded SVG currently.

(In particular, wanting <picture>/srcset in SVG is something I've heard from authors, for obvious reasons.)

annevk commented 8 years ago

I guess we should also do this for all custom elements?

tabatkins commented 8 years ago

Are custom elements automatically in the HTML namespace when you register them? If so, then definitely yes. (Same with <template> too then, I guess?)

domenic commented 8 years ago

> Are custom elements automatically in the HTML namespace when you register them?

Yep

AmeliaBR commented 8 years ago

For cross-reference, the issue on the SVG WG spec tracking this: https://github.com/w3c/svgwg/issues/240

I would definitely agree with @tabatkins that SVG's adoption of iframe, canvas, video, and audio is of limited use without HTML parser recognition.

I would be more cautious about picture (which SVG doesn't yet recognize) and custom elements (since SVG would like to eventually be able to define our own custom elements, which can be created as extensions of SVGElement, not HTMLElement).

birtles commented 8 years ago

For reference, we've discussed the possibilities for better integrating HTML and SVG many times in the past, including a few members of the SVG WG meeting up with Hixie over lunch a few years back to discuss this (notes). I think this has had pretty significant discussion, and the proposal here is fairly conservative but still very useful. (Not sure about <picture> or custom elements, but we need to do something about <template> too.)

An external contributor is interested in implementing this in Gecko so I'd like to resolve this issue.

annevk commented 8 years ago

@birtles changing the HTML parser will require support from at least two implementers (ideally all), tests, and a PR for the HTML Standard. Implementing before those are in place would be premature.

birtles commented 8 years ago

I'm pretty sure tests would be included in the PR Tab is offering to write 😉

@hsivonen, @rniwa, @zcorpan, @pmeenan, @sideshowbarker, @jacobrossi any concerns about making the HTML parser put a limited whitelist of HTML elements (iframe/video/audio/canvas) in an SVG subtree in the HTML namespace? (None of the elements in the list trigger HTML's "escape foreign content mode" behavior.)

zcorpan commented 8 years ago

I think this would change the behavior of tags that do escape foreign lands inside these elements.

The spec says for such tags:

Pop an element from the stack of open elements, and then keep popping more elements from the stack of open elements until the current node is a MathML text integration point, an HTML integration point, or an element in the HTML namespace.

https://html.spec.whatwg.org/multipage/syntax.html#parsing-main-inforeign

So in <div><svg><video><xyz><b>, currently b is inserted into the div. If we just change the namespace of video, the b would be inserted into the video element, and possibly the parser would be in a confusing state (parsing in "HTML" mode while still being in an svg subtree).
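The popping rule quoted above can be modelled in a few lines of Python to compare the two outcomes. This is only a sketch under stated assumptions: it covers SVG's HTML integration points but leaves out the MathML cases, it assumes `xyz` stays in the SVG namespace in both scenarios, and the names `Element` and `parent_after_breakout` are hypothetical.

```python
from dataclasses import dataclass

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# SVG elements that act as HTML integration points.
SVG_HTML_INTEGRATION_POINTS = frozenset({"foreignObject", "desc", "title"})


@dataclass(frozen=True)
class Element:
    name: str
    namespace: str


def stops_popping(el):
    """True if popping ends at `el` (MathML cases omitted in this sketch)."""
    if el.namespace == HTML_NS:
        return True
    return el.namespace == SVG_NS and el.name in SVG_HTML_INTEGRATION_POINTS


def parent_after_breakout(stack):
    """Return the element that receives a breakout tag such as <b>.

    `stack` is the stack of open elements, bottom first, and is assumed
    to contain at least one HTML element below the foreign content.
    """
    stack = list(stack)
    stack.pop()  # pop one element unconditionally, as in the quoted rule
    while not stops_popping(stack[-1]):
        stack.pop()
    return stack[-1]


# <div><svg><video><xyz> followed by <b>
current = [Element("div", HTML_NS), Element("svg", SVG_NS),
           Element("video", SVG_NS), Element("xyz", SVG_NS)]
changed = [Element("div", HTML_NS), Element("svg", SVG_NS),
           Element("video", HTML_NS), Element("xyz", SVG_NS)]

print(parent_after_breakout(current).name)  # div
print(parent_after_breakout(changed).name)  # video
```

With `video` in the SVG namespace, popping continues down to `div`. With only the namespace of `video` changed, popping stops at `video`, so `b` lands inside the SVG subtree.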

AmeliaBR commented 8 years ago

@zcorpan As described in the SVG spec, arbitrary HTML elements would be allowed inside these elements, even when inside an SVG tree. This is similar to being inside an SVG <foreignObject> subtree. The elements would provide fallback content (or accessible semantic content, for <canvas>) as they normally do. When the fallback is used, the parent media element would be rendered and laid out within the SVG coordinate system as if it was a <foreignObject>.

This does mean that some elements which currently auto-close an <svg> may no longer do that if there is one of these embeddable elements in between. But the markup would have to be pretty messy for this to make a difference.

AmeliaBR commented 8 years ago

By the way, if you're going ahead with adding a whitelist, it might be nice to also include the HTML metadata elements mentioned in https://svgwg.org/svg2-draft/struct.html#HTMLMetadataElements, specifically <link> and <meta>, and possibly <base> (if <base> is currently allowed outside of the <head> in HTML documents).

These elements were mainly added to the content model so that they can be used in image/svg+xml documents, but it would make it easier for authors if they were also valid within inline SVG.

<script> and <style> should continue to be parsed as the SVG versions. The HTML versions were added to the SVG content model to avoid confusing inconsistencies with script-created DOM nodes.

te-fukuda commented 8 years ago

@birtles @annevk @AmeliaBR is there any simple way to modify the HTML parser, at least enough to move forward with implementing iframe on the SVG side?

AmeliaBR commented 8 years ago

@te-fukuda

I think iframe is no easier or more difficult than any of the others: it can also have fallback content. As far as I understand the spec, although the "allowed content model" is only text in HTML and nothing in XML, the HTML parser still parses it as a full fragment:

When used in HTML documents, the allowed content model of iframe elements is text, except that invoking the HTML fragment parsing algorithm with the iframe element as the context element and the text contents as the input must result in a list of nodes that are all phrasing content, with no parse errors having occurred, with no script elements being anywhere in the list or as descendants of elements in the list, and with all the elements in the list (including their descendants) being themselves conforming.

I'd rather get all the necessary changes into the parser spec at once. But if others think it would be easy to make an exception for empty iframe, that would still be something.

annevk commented 8 years ago

If you want to make a change I suggest you do it all at once and come with a complete plan for the changes, tests, and implementer commitments. Making piecemeal changes to the HTML parser is not desirable.

hsivonen commented 7 years ago

I think we shouldn't do this.

HTML parsing is security-sensitive. HTML producers need to be able to rely on how a given piece of HTML parses in order to be able to implement XSS-safe HTML generation.

From this perspective, there's a lot of value in multiple browser engines having converged on a single parsing algorithm. We shouldn't change the algorithm just to add convenience for SVG or to make new HTML features seem more logical (introduce the same tag omission features as for analogous old elements, etc.).

Even if all browsers were on a six-week release cycle, server-side libraries don't have auto-updaters even if the upstream repos of the server-side libraries were on a six-week cycle.

Introducing differences between the server-side understanding of how HTML works and the browser-side understanding poses enough security risk that we should avoid it.

CC @cure53 @wycats

tabatkins commented 7 years ago

This argument means that we can never do anything with SVG-in-HTML, tho. No custom elements, no template, it's all ruled out due to the bizarro parsing that SVG-in-HTML had to invent. This renders the entire feature a second-class citizen in HTML.

This might be what we need to do! I'm willing to trust your judgement here, even if I disagree with the results. But it suggests that we have a fairly serious problem, in that there's a whole, useful feature, which could benefit quite a bit from integrating with other parts of HTML/DOM, but is prevented from doing so for legacy reasons.

Perhaps, then, we should look into a fix for this. As an example: add a new <vector> element. This opens up an embedded SVG, but without the special parsing rules of <svg> and its contents; instead we just parse as normal for HTML, with exactly the same tree we'd get for unknown elements. The SVG elements can then get put in the SVG namespace as appropriate, while HTML elements remain in HTML. Without weird parsing, newer HTML things (custom elements, template, etc) just work automatically.

This does mean we'd lose the self-closing thing that SVG-in-HTML can currently do, but eh. Worth it if that's the sacrifice necessary to keep vector graphics usable in HTML.

strarsis commented 7 years ago

Is srcset in elements possible in an SVG file? Is there any information on browser support?