To reply to https://github.com/whatwg/html/pull/5465#issuecomment-1397423778. I'm not sure I understand. By that logic, we'd need to extend `XMLHttpRequest`, `innerHTML`, and various other APIs that would end up prohibiting declarative shadow trees as well. (And yes, if `<foobar>` had the same issues as declarative shadow trees we should obviously prohibit it by default and make enabling it possible in newish APIs only.)

`setHTML()` (and proposed friends) is going to be the new HTML parser entry point, replacing `innerHTML` (and friends). It's intentionally named in a generic fashion so it can be that. It's safe-by-default, but if there are cases where declarative shadow trees need some kind of unsafe entry point we ought to provide for that as well. Either as `unsafeSetHTML()` or through an argument. How is it not beneficial to provide web developers with a unified API for HTML parsing?
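For illustration only, a rough sketch of the two shapes being floated; `unsafeSetHTML()` is just a suggested name, the option value is borrowed from later in this thread, and `element`/`untrustedMarkup` are placeholders — nothing here is specified yet:

```js
// Hypothetical shapes only.

// Shape A: a separate, easily greppable method for the unsafe path.
element.unsafeSetHTML(untrustedMarkup);

// Shape B: one setHTML() entry point, with an explicit argument that opts
// out of sanitization.
element.setHTML(untrustedMarkup, { sanitizer: "unsafe-none" });
```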
> To reply to #5465 (comment). I'm not sure I understand. By that logic, we'd need to extend `XMLHttpRequest`, `innerHTML`, and various other APIs that would end up prohibiting declarative shadow trees as well. (And yes, if `<foobar>` had the same issues as declarative shadow trees we should obviously prohibit it by default and make enabling it possible in newish APIs only.)
Right. We would. See https://github.com/whatwg/html/issues/6185 for context, but conceptually, the `on*` event handlers for any new HTML element present an XSS risk for (very) poorly configured sanitizers today, and so by the same logic as for DSD, must be prohibited from use with all parser endpoints without an opt-in. And I'm therefore wondering if you're supportive of only supporting parsing of any new HTML element via `setHTML()` and friends.
> `setHTML()` (and proposed friends) is going to be the new HTML parser entry point, replacing `innerHTML` (and friends). It's intentionally named in a generic fashion so it can be that. It's safe-by-default, but if there are cases where declarative shadow trees need some kind of unsafe entry point we ought to provide for that as well. Either as `unsafeSetHTML()` or through an argument. How is it not beneficial to provide web developers with a unified API for HTML parsing?
I guess I just don't understand the logic of that yet. Is there additional conversation that I can refer to about this decision? For the non-sanitized parsing use case (which is by far the majority of synchronous parser usage on the open Web), I don't see the point of trying to move all parsing to yet another parser entrypoint. I do see the point of creating a new "always safe" sanitized parser API that is guaranteed to produce safe DOM from unsafe HTML. Making that API a Swiss army knife that also produces unsafe DOM from unsafe HTML - that part I don't quite understand yet.
I'm not sure I follow the thought experiment which makes it hard to comment. In general it doesn't seem like a good idea for new HTML elements to require opt-in all over the place.
> I don't see the point of trying to move all parsing to yet another parser entry point.
What other entry point that's also ergonomic did you have in mind?
> `setHTML()` (and proposed friends) is going to be the new HTML parser entry point, replacing `innerHTML` (and friends). It's intentionally named in a generic fashion so it can be that.
One limiting thing about this is that `setHTML` presumes the content is specifically an HTML fragment rather than a full document. I.e., a full document with doctype, root element, etc. does not make sense for `setHTML`.
> How is it not beneficial to provide web developers with a unified API for HTML parsing?
Encouraging general parsing to go through `Sanitizer` seems highly reasonable, so it would make sense if `Sanitizer` provided a `parseDocument` method to replace the capability of `DOMParser`.
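A rough sketch of what that could look like; `parseDocument` is only a suggestion in this thread, not a specified API, so the name and return shape are assumptions:

```js
// Hypothetical: a document-level parsing entry point on Sanitizer,
// covering DOMParser's "string in, Document out" use case but sanitized.
const sanitizer = new Sanitizer();                          // constructor per the WICG proposal
const doc = sanitizer.parseDocument(untrustedHtmlString);   // assumed to return a sanitized Document
console.log(doc.title, doc.body.childElementCount);
```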
I think for parsing a whole document we need a streaming API as we have oft-discussed, but thus far haven't made much progress on. The other thing that's needed for that is use cases as those are quite a bit less fleshed out.
Agenda+ to discuss on Thursday. A preview of what I'd like to discuss:
- `setHTML()` functionality that a) parses DSD, and b) allows "unsafe-dont-sanitize" behavior. This is primarily being discussed in #185 (and a bit in #184).

Overall, if all of the above questions were resolved, I would feel much better about not adding a `DOMParser` entrypoint for DSD. So I think we should discuss those first and try to come to some agreements.
@otherdaniel @jyasskin @mozfreddyb
I think based on your input (in particular with regards to 2 below) the current idea is as follows:
- `innerHTML` ends up throwing.

cc @rniwa @smaug----
Are these methods intended to work in SVG and MathML as well? If so, `setMarkup` etc. would be better semantically, if a bit more verbose. `svgElement.innerHTML` always irks me...
It's good that you bring that up. These methods would be available where their non-HTML counterparts are available. Now one thing that the `innerHTML` setter does is branch on the document's type ("xml" or "html") and use a different fragment parser based on that.

I think for these methods it would be preferable if we always used the HTML fragment parser. The XML parser is contextless so doesn't benefit as much from having these kinds of positional methods. And a document-based mode switch is also not a great API. Based on that I think the names are still good.
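To make that mode switch concrete, a small illustration (behavior as I understand the current specs; worth double-checking the details):

```js
// The same innerHTML assignment is routed to different fragment parsers
// depending on the owner document's type.
const htmlEl = document.createElement("div");
htmlEl.innerHTML = "<p>hi";   // HTML fragment parser: error-tolerant, yields <p>hi</p>

const xmlDoc = new DOMParser().parseFromString("<root/>", "application/xml");
xmlDoc.documentElement.innerHTML = "<p>hi";   // XML fragment parser: throws, input is not well-formed XML
```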
Few people know about these parser subtleties...
MathML and SVG are not HTML though, even when embedded. They are distinct languages and their DOM interfaces differ from HTML's, even when they've been parsed by the HTML parser...
Equating HTML with "any markup parsed as HTML" seems like a loss of precision that could be avoided.
That being said, the URI/URL/URN distinction was also meaningful but ignored in the wild (except for data: URIs, which had some traction)... and thus URIs and URNs were retired.
A broader survey may be a good thing before settling on these names.
From a purely-DSD point of view, a single `setHTML()` method would meet the developer need, provided that it:
The above aligns with your vision as you've described it. However, it sounds like part (2) is not acceptable to some of the Sanitizer folks (see comments on https://github.com/WICG/sanitizer-api/issues/185), since it eliminates the benefits of having a "guaranteed safe" output. I don't want to speak for that group, but I do see their points.
If, due to that discussion, `setHTML()` needs to be broken into two methods like `setHTMLAlwaysSafe()` and `setHTMLUnsafe()`, then I do not think we should build `setHTMLUnsafe()`. It would be almost exactly like `innerHTML`, but subtly different, and this would be actively bad for developers because it's confusing and unclear which one to use. And adding CSP to any situation does not improve confusion levels.
Could you explain their points? They are not clear to me.
Anyway, assuming there is some compelling reason to duplicate the set of methods rather than have the `sanitize` member accept "`unsafe-none`", `setUnsafeHTML()` would essentially be `innerHTML`, but with declarative shadow root support, right? Still seems like we'd have a reasonably clear migration story for web developers.
I'm not sure what your jab at CSP is supposed to mean or how it relates to this.
Having slept on it, I think I see your point re. naming.
`.setHTML`/`.setUnsafeHTML` has the advantage of making code review and static code analysis easier.
Regardless of the details of unsafe parsing, please don't delay the specification and implementation of the safe subset which provides clear benefits in terms of functionality and safety.
> Anyway, assuming there is some compelling reason to duplicate the set of methods rather than have the `sanitize` member accept "`unsafe-none`", [...]
I've explained this aspect in more detail here, where this was raised separately.
I've also commented in the issue that @otherdaniel mentioned about the safety expectations from `setHTML()`. (Oh no. Forked threads!)
Something that I didn't note was that if we want to tell people to "Use `setHTML()` instead of `innerHTML=` and you fixed XSS" then defaulting the new method to allow shadow roots will be quite a surprise. In terms of authoring expectations, I'd lean towards making that opt-in rather than opt-out.
@annevk
> It's good that you bring that up. These methods would be available where their non-HTML counterparts are available. Now one thing that the `innerHTML` setter does is branch on the document's type ("xml" or "html") and use a different fragment parser based on that.
>
> I think for these methods it would be preferable if we always used the HTML fragment parser. The XML parser is contextless so doesn't benefit as much from having these kinds of positional methods. And a document-based mode switch is also not a great API. Based on that I think the names are still good.
I think mode switches based on the document's type are not great. If we want to have an XML equivalent, it could be a separate method, e.g. `setXML()`, which always parses as XML.
For serialization, we don't have an HTML-only way to serialize, as far as I know. The `innerHTML` getter depends on the document. `XMLSerializer` always serializes as XML. Maybe this is a separate issue, though.
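A quick illustration of the current serialization options (again, as I understand today's behavior; verify before relying on the exact output):

```js
const el = document.createElement("div");
el.appendChild(document.createElement("br"));

// The innerHTML getter picks HTML or XML serialization based on the document;
// in an HTML document it yields HTML syntax:
el.innerHTML;                                   // "<br>"

// XMLSerializer always produces XML syntax, regardless of document type:
new XMLSerializer().serializeToString(el);      // e.g. '<div xmlns="http://www.w3.org/1999/xhtml"><br /></div>'
```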
> Something that I didn't note was that if we want to tell people to "Use `setHTML()` instead of `innerHTML=` and you fixed XSS" then defaulting the new method to allow shadow roots will be quite a surprise. In terms of authoring expectations, I'd lean towards making that opt-in rather than opt-out.
Is this based on some security requirement for the Sanitizer, or merely on a "least surprise" principle when trying to substitute `setHTML` for `innerHTML`? I don't think a new `setHTML` API should be constrained by a legacy API.
I think sanitizing shadow roots is easily resolved; and I thought the Sanitizer requirements were based on the no-XSS threat model.
> > Something that I didn't note was that if we want to tell people to "Use `setHTML()` instead of `innerHTML=` and you fixed XSS" then defaulting the new method to allow shadow roots will be quite a surprise. In terms of authoring expectations, I'd lean towards making that opt-in rather than opt-out.
>
> Is this based on some security requirement for the Sanitizer, or merely on a "least surprise" principle when trying to substitute `setHTML` for `innerHTML`? I don't think a new `setHTML` API should be constrained by a legacy API.
>
> I think sanitizing shadow roots is easily resolved; and I thought the Sanitizer requirements were based on the no-XSS threat model.
I'd agree that this new API shouldn't be constrained by legacy. And in that case, the most surprising thing would be if it didn't parse DSD. Fundamentally, DSD doesn't introduce any new XSS vectors, right? It's just another container that needs to be examined for XSS-containing content.
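As a sketch of that point (the markup and payload are made up, and the `setHTML()` call is the proposed API, not a final one): the shadow content is just another subtree the sanitizer has to walk.

```js
// The would-be XSS payload sits inside a declarative shadow root, but it's
// still ordinary markup in a subtree that a sanitizer can traverse and strip.
element.setHTML(`
  <div>
    <template shadowrootmode="open">
      <img src="x" onerror="alert(1)">  <!-- the same old vector, merely nested in a shadow root -->
    </template>
  </div>
`);
```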
I'm late to this thread, but I'd like to push back on the idea that the new `setHTML()` method and friends should be the only way to parse DSD. In particular, those methods only parse HTML fragments, and do not provide any way to parse entire HTML documents like `DOMParser` does.
For better or for worse, `DOMParser` is the web platform's current method of going string -> `Document` object. I think it should be extended to support DSD, regardless of any work going on for new methods.
(I don't have an opinion on whether that support should be opt-in or opt-out.)
I agree that there is a need to get a whole document parsed. And `DOMParser`, as odd an API as it is, is still quite a reasonable API to do it, especially if the behavior is opt-in. It is a bit worrisome that there isn't a good story for XML though, but I guess declarative shadow DOM doesn't have that in general.

(`setHTML()` itself may still have some issues. I'm a tiny bit worried about it being slow by default, but I haven't thought through well how to implement that algorithm in a fast way. Browsers do heavily optimize `innerHTML`, though. But this isn't about DSD nor DOMParser.)
> (`setHTML()` itself may still have some issues. I'm a tiny bit worried about it being slow by default, but I haven't thought through well how to implement that algorithm in a fast way. Browsers do heavily optimize `innerHTML`, though. But this isn't about DSD nor DOMParser.)
I had benchmarked our early setHTML implementation and found that execution time was dominated by parsing. The tree walk & node removal were comparatively cheap. There are some caveats: It seems sanitizers are mostly used for small HTML snippets, so that's what I measured. If setHTML becomes prominent it might see quite different usage patterns.
I don't think `setHTML()` and friends should be the only entry points for declarative shadow trees, but I'm not convinced that we want to further enshrine synchronous parsing of documents.
> I don't think `setHTML()` and friends should be the only entry points for declarative shadow trees, but I'm not convinced that we want to further enshrine synchronous parsing of documents.
I guess two questions:
- Adding a Note to the spec PR that says "don't use this API, use... {something} instead", if that would help to avoid enshrining it? Though I'm not sure what you'd put for {something} since there isn't anything else.
- Making `innerHTML` asynchronous? Typically I've seen most async/sync complaints from developers go the other way: asynchronous APIs are more confusing and difficult to handle, and they'd prefer something synchronous. It's always possible I missed the use case here though.
> - Maintenance is needed for everything we decide to keep in the web platform, although there are a few cases where we have done very little of that in the hope that the feature might disappear, e.g., mutation events. Thus far without much success. I don't think the proposal in question here counts as maintenance though.
Interesting choice of example, mutation events. They didn't disappear. And I guess that's the point - "enshrinedness" isn't strengthened by small changes, and isn't weakened by not touching the feature. So it's "ok" to make changes to APIs we don't like, particularly if no alternatives exist yet. Migrating usage to another API really depends on the quality of that other API and the effort spent trying to move folks over to it. Nobody checks whether an API has been modified lately to see if they should use it. Likely on the contrary - an API that has been the same for a long time feels pretty safe to use.
> - A way to stream content into an element #2142. (I was primarily thinking of this for documents, but the request indeed exists for elements as well.)
Thanks for the link. It's a reasonable request. I still think the (vast?) majority of use cases want a fast, synchronous parser. Are you suggesting an API that can do either sync or async parsing?
@mfreed7 I think adding features to mutation events would be a mistake. So I don't think we're on the same page here. (Maintenance is very different from adding features.)
> I still think the (vast?) majority of use cases want a fast, synchronous parser.
For an HTML document? Pretty much all use cases I'm familiar with want contextual parsing.
Parsing a full HTML document is pretty common for client-side rendering. E.g. this hacked-together demo with easy to read source code, or every app that uses Rails Turbolinks (including, I believe, GitHub.com).
Regarding adding features to old APIs: I don't think we should think of DSD support as a "new feature", necessarily. When you parse an HTML document, you parse `<template shadowrootmode="...">` a certain way, per the HTML spec. That's the "normal" path. We've added some "exceptional" path for some reason I don't fully understand, but I think we should ensure the "normal" path is supported as well in `DOMParser`.

In other words, I think it's bad if there's no way for web developers to parse HTML documents in JavaScript the same way the browser does during a navigation. If you don't give that ability from `DOMParser`, that feels like removing a feature.
> In other words, I think it's bad if there's no way for web developers to parse HTML documents in JavaScript the same way the browser does during a navigation. If you don't give that ability from `DOMParser`, that feels like removing a feature.
+1. I like how you phrased that. I agree.
That would argue for changing `XMLHttpRequest` as well. I still think it's reasonable to wait with this for v0 until we see more of a concrete demand.
> That would argue for changing `XMLHttpRequest` as well.
My original proposal added an opt-in to XHR also. This was discussed, but since there's not a great place to put the opt-in option, it was vetoed.
> I still think it's reasonable to wait with this for v0 until we see more of a concrete demand.
Ok. Can you comment on what form of demand would convince you?
From the March 9 spec triage meeting, here are the notes:
Action Items
Per the above notes, we had fairly general agreement in the room that it was "ok" and likely the right path forward to extend `DOMParser` with an opt-in for DSD, especially given that there is no alternative. Based on that, I'm wondering if we are ok resolving this issue and landing the spec PR?
There was also some good discussion in the "new parser entrypoints" category, which mostly relates to https://github.com/WICG/sanitizer-api/issues/184. My take was that there was rough consensus on these:
- Simon to open an issue about streaming sanitized parsing, with pointers to help Daniel understand how we could build that in the future.
> Can you comment on what form of demand would convince you?
E.g., userland workarounds/packages/libraries, highly-upvoted Stack Overflow questions.
Question: are we OK adding a sanitizing method where sanitizing cannot be turned off?
Could someone give more context about this? Previously I've understood this to be about static analysis where you also have to trust the web developer. But if you have to trust the web developer, can't the static analysis require particular things about the syntax being used? I have a hard time wrapping my head around this requirement.
To add some more context this would be the first time that I know of where static analysis (plus trusting the web developer to not try to fool the static analysis) forms the basis of web platform API design. I think that needs a fair bit of justification.
> Question: are we OK adding a sanitizing method where sanitizing cannot be turned off?
>
> Could someone give more context about this? Previously I've understood this to be about static analysis where you also have to trust the web developer. But if you have to trust the web developer, can't the static analysis require particular things about the syntax being used? I have a hard time wrapping my head around this requirement.
>
> To add some more context this would be the first time that I know of where static analysis (plus trusting the web developer to not try to fool the static analysis) forms the basis of web platform API design. I think that needs a fair bit of justification.
From the Sanitizer & security perspective, having a method that actually sanitizes is a requirement in its own right. A security function should not have a built-in footgun with which it disables itself. The Sanitizer API group has, for more than two years, held to the security principle that the Sanitizer API should defend a baseline security guarantee, regardless of how it is configured.
From a more general perspective, we find it questionable to add new methods that add to the XSS surface, and that seem to add little over the legacy methods they replace. XSS is a problem which has plagued developers for decades and we shouldn't lightly add new XSS surfaces to the platform. The legacy use cases seem to be adequately handled by the legacy APIs.
The extensible web manifesto calls for new low-level primitives that can explain higher-level capabilities, which also suggests that we should expose sanitization as a primitive, and not only as part of a larger function.
On static analysis: I'm not sure what "static analysis where you also have to trust the web developer" means.
We (and many others) use static analysis as part of our code presubmit checks. An analysis that checks whether certain methods are called is much easier to do than one which has to establish the possible runtime values of a variable. It's also fairly easy to discover all constructor invocations and to either allow-list them based on some criteria, or to manually inspect them and have the static analysis confirm that there are only a set of known instances. A sanitizer entry point that optionally doesn't sanitize, based on an options bag, makes those analyses a lot harder, since establishing the set of runtime values of an options bag isn't easy to determine statically.
What you can demand of the developer depends a lot on the project state. If you're starting a project from scratch, or the project is small enough that a rewrite is feasible, then you can quite easily make demands on how to use APIs, and then use static analysis to enforce this usage. (E.g., only pass constants to a sanitizer options bag.) That makes things easy. If the project has a large existing codebase, which e.g. occurs when we acquire a company, or use a 3rd-party library, or otherwise try to upgrade an existing codebase, then you'll have to deal with whatever is there, and try to iteratively move the codebase to the desired goal. If in these cases the sanitizer calls are wrapped in code that dynamically constructs the options, or passes around closures or function references that sanitize, then the analysis has to pessimize any such call, which usually makes the analysis useless. I think this case of a pre-existing code base is the more common case, especially outside of large companies like ours.
IMHO, by far the easiest way to resolve all of this would be to disentangle these requirements: Give the DOMParser a 'DSD' option; give the Sanitizer the sanitization method back; and then discuss separately which merits `setHTML` brings to the developer community, and which API shape would follow from that. That matches established design principles, like the extensible web manifesto, and established user demand, like every HTML sanitizer I know of. It seems the current discussion presumes `setHTML` as the sole solution to all of these problems, for reasons that elude me.
That's disappointing to hear. The reason we settled on `setHTML()` (and friends) is due to the HTML parser requiring context, `innerHTML` being inadequate for declarative shadow trees, and easily abused with `+=`, etc.
If you look at this from the extensible web perspective you'd end up with two new operations:
Only the latter under the control of the sanitizer. I proposed going in that direction, but you were against that too.
I'd like to remind everyone of the "No feigning surprise" guideline from the Recurse Center. We're all trying to get to a reasonable design here, and I think we're having trouble at least partly because the design constraints and rejected designs aren't written down in an explainer anywhere. Instead, they're scattered across several issues on several repositories, so we all wind up forgetting that the question we're about to ask or objection we're about to raise has already been answered. And we feel like we've answered a question when the answer we're thinking of from some other thread doesn't actually quite answer it.
Re. static analysis, why is it harder to reject a non-static option bag than non-static method invocation?
Twice bad:

```js
const bypass = "potentially" + "dangerous"
const options = {unsafe: true}
element[bypass](options)
```

Twice safe:

```js
element.potentiallyUnsafe({...whatever, unsafe: false})
```
The analysis is marginally more complex, but this is a one time cost when writing the validation logic.
The option bag must be designed with that scenario in mind of course.
> Re. static analysis, why is it harder to reject a non-static option bag than non-static method invocation?
It's only marginally harder to write such rejection logic, yes. But the point is that any rejection logic catches both vulnerabilities and benign code, and we think it's best if it can precisely target vulnerabilities. There is usually nothing wrong with passing dynamic values to functions unless one of the function variants has different capabilities, in which case the security analysis (automatic or manual) needs to cover all forms that can potentially trigger this capability.
If the code that is vulnerable looks similar (or even identical) to one that is not:
All of these are ongoing costs for your own code, but the effect explodes when adding dependencies (think: reviewing 100 libraries that liberally use `innerHTML` because it was the idiomatic way to parse & insert strings into the DOM).
There will always be odd cases that still allow you to "bypass" static analysis (e.g. eval, the dynamic examples you pointed out, or prototype pollution); the point, however, is that API choices affect how a vulnerability appears in the code. I think it's better if it stands out, for the aforementioned reasons.
Well it does stand out. I guess the question and point of contention is whether it's our job to ensure that it still stands out when someone creates an abstraction or if that is the responsibility of the person creating the abstraction.
I agree. Passing dynamic arguments to API functions hardly counts as abstraction though. I think it's beneficial if different security capabilities are not coupled within one API function (e.g. it's clear what `.textContent` does and what `.innerHTML` does, and why both exist).
I think what I don't understand is that if you have to do static analysis anyway and web developers are expected to cooperate, why the web developers wishing to perform static analysis cannot have a safe wrapper and forbid the unsafe variant. Instead this seems to impose certain static analysis requirements on all web platform API designers.
E.g., is `window.open(url, "_blank", "noopener")` acceptable given these requirements? Presumably it would have had to be `window.noopenerOpen(url)` so you can forbid the unsafe one more easily.
It's not about static analysis alone, rather about modelling an API that makes it easy to reason about (for the author, for the manual reviewer, for trainers, for integrators, for dynamic scanners, linters etc.). Here, for example, I don't think relying on a safe wrapper scales well, given that web applications use 3p dependencies.
Could `window.open` be better? I guess. `Element.setAttribute` is also causing a lot of headaches for our security team (`script.src` can cause XSS, but most other usages do not). But we have to stay practical, hence the opinion that for a new function that is modelled to be endemic, replaces a very well-known XSS sink, and can trigger the worst kind of web application vulnerability, the bar is a bit higher.
If you use a dependency outside of your control you also cannot rely on static analysis. In that case you would need some kind of runtime enforcement of your policies, especially as this is not the only API that can do this kind of thing.
Forever doubling the number of parser methods over this seems undesirable. Especially as the proposed API shape seems quite clear and in most use cases should not be hard to evaluate. When a second argument is passed, more scrutiny is needed. (And when it becomes harder, static analysis and other tools can help, just as they can with existing APIs.)
I'm worried about an API shape where:

- `setHTML(string)` is safe,
- `setHTML(string, {sanitizer: "unsafe-none"})` is unsafe,
- `setHTML(string, {sanitizer: computedSanitizer})` might or might not be safe: I find @koto's argument convincing that this is hard for static analysis tools to check, and that even in third-party codebases, it's easier to check for uses of `setUnsanitizedHTML(string)` and less breaking to ban uses of `element[computedField](string)`.
- `setHTML(string, {newOption: itsValue})` is safe but might be banned by lint checks that took the shortcut of insisting on the single-argument form for safety: I'm really worried about this one, because it sacrifices the future-compatibility we normally get from using an options bag.

I'm not so worried about doubling the number of parsing methods over this. We might not even need the `beforeUnsanitizedHTML()`, etc. variants, since we don't want XSS vulnerabilities to be common. Those could wait until framework authors actually ask for them. But even if they're necessary, as @domenic argued in https://github.com/w3ctag/design-principles/issues/426#issuecomment-1472928048, new function names aren't obviously worse than namespaced groups of functions, and I think that's equally true when the namespacing is done with an options bag.
Apologies if I've misunderstood the API shape @annevk has in mind; I couldn't find anywhere it was written down.
If the concern is around safety aspects of `setHTML`, can the feature set be determined by CSP headers? `require-trusted-types-for` could be extended to ensure that `setHTML` is passed a sanitizer that correctly returns a trusted type, no?
@annevk suggested the CSP option in https://github.com/whatwg/html/issues/8759#issuecomment-1422344159, and I'd have sworn that @koto answered the suggestion somewhere, but I can't find that answer. My understanding, which the security folks might correct, is that CSP is great for enforcing that a policy is followed (so it'd be good to also have that check), but it's not great for either adopting that policy incrementally or for preventing regressions without user-visible breakage. If presubmit checks can run static analysis to identify regressions before they're committed, we can migrate a single component at a time and avoid failures in production after the migration's finished.
@jyasskin isn't that what report-only is for?
@annevk, say you're migrating a large app with a bunch of components. You turn on `report-only` and get a bunch of reports, and start fixing them. When you finish a component, you want to make sure it doesn't regress. You can't turn on CSP enforcement for any particular page, though, because they all include other components that aren't finished migrating yet. So CSP by itself doesn't let you ratchet forward compliance. Static analysis does, since you can turn it on a directory at a time.
I'm splitting this out from the discussion that starts on this comment https://github.com/whatwg/html/pull/5465#discussion_r1042020381 and then jumps to overall issue comments starting here https://github.com/whatwg/html/pull/5465#issuecomment-1383927052. This was also previously discussed at length starting at this comment https://github.com/whatwg/dom/issues/912#issuecomment-725900521. That issue has much more context on why there needs to be an opt-in for DSD at all.
The current spec PR has `DOMParser.parseFromString(html, "text/html", {declarativeShadowRoots: true})`, which is the only API that lets you imperatively parse HTML that contains declarative shadow roots.

There is a question about whether we need to add this, and whether instead we should make Sanitizer's `setHTML()` the only DSD-aware parser API. That requires some changes to the Sanitizer (https://github.com/whatwg/html/issues/8627), in particular giving `setHTML()` a `sanitizer: "unsafe-none"` argument that bypasses all sanitization.
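For reference, a minimal usage sketch of that opt-in as written in the PR (the option name may still change, and `html` is a placeholder string):

```js
// Parse a full document and attach declarative shadow roots while parsing.
const doc = new DOMParser().parseFromString(
  html,                                // assumed to contain <template shadowrootmode="..."> markup
  "text/html",
  { declarativeShadowRoots: true }     // opt-in spelled as in the current spec PR
);
// Without the opt-in, the templates would remain inert <template> elements
// instead of becoming shadow roots on their host elements.
```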