whatwg / html

HTML Standard
https://html.spec.whatwg.org/multipage/

Upstreaming from the Sanitizer API #7197

Open mozfreddyb opened 2 years ago

mozfreddyb commented 2 years ago

We (@otherdaniel and I) want to help ensure well-specified interactions between the Sanitizer API and HTML. Specifically, we were thinking of this split:

Sanitizer API

HTML

Please let me know if there are any questions and we're obviously happy to take feedback.

annevk commented 2 years ago

"algorithms for sanitizing a" seems like an incomplete sentence?

Presumably HTML would define the setHTML() method?

mozfreddyb commented 2 years ago

Thanks, I fixed the typo. Re setHTML(): I was guessing that it would live in the DOM spec, as this is where the Element interface is defined?

annevk commented 2 years ago

I think the idea still is to upstream most of https://w3c.github.io/DOM-Parsing/ to HTML as that defines most of the infrastructure. The DOM is rather agnostic to parsing/serialization.

(There are some longer standing issues without a clear decision about whether extensions should go on Element vs HTML/SVG/MathElement, and whether partial is a good idea, but let's not block on that. 😊)

domenic commented 2 years ago

This seems reasonable for now, although it's also worth considering the eventual destination of the spec after it graduates from WICG and is looking to become a real standard. I think upstreaming the whole thing into HTML would make a lot of sense in that case.

mozfreddyb commented 2 years ago

What an awful omission on my side. To clarify: the HTML bits should be in HTML. The Sanitizer interface is likely going to be adopted into W3C WebAppSec.

I don't have a very strong preference here but I do feel a need to reconcile the points that were made in another thread, which has unfortunately derailed quite a bit (https://github.com/w3c/webappsec/issues/596).

I hope the summary is correct and satisfactory for both sides. I'm happy to work with y'all, if not.

domenic commented 2 years ago

"The Sanitizer interface is likely going to be adopted into W3C WebAppSec."

I think that would be suboptimal, as opposed to having its graduation location be HTML. I think solid maintenance is more important than having a small standalone spec; small and standalone is reasonable for documentation, but for a spec it's best to have it co-located with other fundamental HTML infrastructure.

mozfreddyb commented 2 years ago

After a bit of a delay: we've discussed this and will follow your suggestion to move things into the HTML spec directly.

For starters, I considered creating an element category that indicates whether an element should be "allowed by the Sanitizer by default", e.g., by following these steps:

@domenic and I'd really like to hear your feedback here :)

domenic commented 2 years ago

A new category makes sense, although since this is not for the content model (i.e., telling what HTML is valid according to conformance checkers) you'd probably want to follow the example of form-related categories, as the spec says:

Other categories are also used for specific purposes, e.g. form controls are specified using a number of categories to define common requirements. Some elements have unique requirements and do not fit into any particular category.

See https://html.spec.whatwg.org/multipage/forms.html#categories for how that manifests. Basically you'd define the category in the sanitizer section, instead of in the main content-model categories section.

Since it sounds like the plan is two-stage, i.e. first define the category and safe section and then when the sanitizer spec is ready to graduate merge the rest into HTML, I'd suggest thinking ahead a bit and introducing a sanitization section which can host both the category and the discussion of safety, and then in the future be expanded to contain the rest of the spec text.

Potential locations include under https://html.spec.whatwg.org/multipage/#toc-syntax, under https://html.spec.whatwg.org/multipage/#toc-webappapis, or maybe a few other places.
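
To make the category idea discussed above a little more concrete, here is a minimal TypeScript sketch of what "belongs to a category defined for a specific purpose" could look like as a lookup. Everything in it is an assumption for illustration only: the `ElementCategory` type, the `isInCategory` and `partitionByCategory` helpers, and the placeholder element list are hypothetical and are not taken from the HTML Standard or the Sanitizer API draft.

```typescript
// Hypothetical sketch: none of these names or element lists come from any spec.

// A category is modeled as a named set of element local names, similar in
// spirit to how the form-related categories group elements for one purpose.
type ElementCategory = {
  readonly name: string;
  readonly members: ReadonlySet<string>;
};

// Placeholder membership, chosen only so the example has something to check.
const allowedByDefault: ElementCategory = {
  name: "allowed by the Sanitizer by default",
  members: new Set(["p", "em", "strong", "a", "ul", "li"]),
};

// Returns true when the given local name is a member of the category.
// Lowercasing the name is a simplification made for this sketch.
function isInCategory(category: ElementCategory, localName: string): boolean {
  return category.members.has(localName.toLowerCase());
}

// Splits a list of element names into those in the category and those not.
function partitionByCategory(
  category: ElementCategory,
  names: string[],
): { kept: string[]; dropped: string[] } {
  const kept: string[] = [];
  const dropped: string[] = [];
  for (const name of names) {
    (isInCategory(category, name) ? kept : dropped).push(name);
  }
  return { kept, dropped };
}

const result = partitionByCategory(allowedByDefault, ["p", "script", "em"]);
console.log(result.kept, result.dropped);
```

The point of the sketch is only that a purpose-specific category can be a plain membership test kept next to the feature that uses it, rather than part of the content-model categories that conformance checkers rely on.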

otherdaniel commented 2 months ago

We'd like to advance this proposal to stage 2 in the WHATWG Stages process.

(I don't seem to have permission to add the agenda+ label myself. I'll ask for help with that.)