Closed evilpie closed 7 months ago
CC @annevk
Seems good, but consider using the canonical property names localName
and namespaceURI
instead of inventing new ones.
I think this is okay given that for MutationRecord
we use attributeName
and attributeNamespace
. I'd even go so far as saying I'd favor sticking to "name" and "namespace" for new APIs, making it clearer that's what element and attribute identity is.
I don't see ElementName
defined.
What's the reason for keeping attributes and elements separate?
I don't see ElementName defined.
ElementName
was supposed to be SanitizerElement
. Frankly all the names are just something I quickly came up with, we should definitely spend some more time thinking about them before putting them in the spec.
What's the reason for keeping attributes and elements separate?
A couple of reasons. We think it's making a bit more ergonomic for a couple of use cases:
If elements+attributes were combined in a joint structure, attributes would always have to be copied or specified twice.
I like the new proposal. I'll try to spec it out next week, and maybe start with a trial implementation so I can try it out.
dictionary ElementWithNamespace { required DOMString name; required DOMString namespace;
I think this would need to be _namespace
(namespace
is a terminal symbol), but I also think following Node
/Element
and using localName
and namespaceURI
would make more sense.
I would also make the namespace URI nullable. It'd be nice to make it optional, with a default that's the same as what we would use if you're using the DOMString
member of SanitizerElement
, although I guess that's hard because it would depend on the type of the document?
required DOMString name; // We aren't sure how to specify |any| attribute yet. // maybe: optional bool defaultAttribute;
I think using Star here would make sense. I find having both name
and defaultAttribute
(allAttributes
?) a bit weird, you'd have to add prose about the interaction between them.
required DOMString name; // We aren't sure how to specify |any| attribute yet. // maybe: optional bool defaultAttribute;
I think using Star here would make sense. I find having both
name
anddefaultAttribute
(allAttributes
?) a bit weird, you'd have to add prose about the interaction between them.
This is indeed a thing where we were a bit unsure. If I remember correctly, @otherdaniel also preferred a variant of name: '*'
and we're not totally against that.
I had a personal preference for a named switch, in order to make it more explicit and not raise questions like Does *
fall back to elsewhere-allowed stuff? Does it directly fall back to the specified default attributes?
That being said, we're open to feedback from folks with more experience in designing & shipping new WebAPIs. We do not intend to go against the grain if there are existing patterns and are keen to learn more.
@petervanderbeken what's your take on explicit/implicit here? @annevk any opinions about wildcards / explicit flags?
I toyed around a bit with this and think this is a workable proposal. One item I stumbled on is that error handling becomes a bit weird. E.g. if the name in the name/namespace dictionary is required you get weird error handling: Putting some rubbish into the list of elements usually passes (by ignoring said rubbish); but if you put in a dictionary without "name" key in it, then WebIDL can't make sense of it and the Sanitizer construction fails with an exception. That struck me as rather unexpected.
If possible, I'd like some feedback on "web-by" error handling, too. So far, we just ignore un-parseable config items, meaning that in the worst case our safe-by-default defaults will apply. The idea is to not block off future extensions to the config. Should we instead throw on any unknown config items? Or does it depend on the type of item?
I would also make the namespace URI nullable. It'd be nice to make it optional, with a default that's the same as what we would use if you're using the DOMString member of SanitizerElement, although I guess that's hard because it would depend on the type of the document?
I'm in favour. It wouldn't depend on the document type IMHO, but on whether it's an element or attribute. HTML namespace should be the default for elements; the empty namespace for attributes.
This is indeed a thing where we were a bit unsure. If I remember correctly, @otherdaniel also preferred a variant of
name: '*'
and we're not totally against that.
Yes, I think name: '*'
(or to just consider an absent name
key to match any name) would be a bit more author friendly. But arguably it's not a big difference.
I like Peter's idea of using a DOMString or dictionary with name+namespace for the attribute name. However the duplication of name does look a bit strange to me: {allowAttributes: [{name: {name: "id", namespace: null}, elements: "*"}]}
One addition to consider: To support the use case of "grab bag of elements and attributes", it would be nice to specify attributes as a simple list, just like the elements. E.g.:
allowElements: ["div", "span"],
allowAttributes: ["class", "style"]
This largely aligns with the existing proposal, but requires defining SanitizerAttribute
as (DOMString or ...)
, just like SanitizerElement
already is. (I wouldn't mind making elements
entirely optional so that this would also hold for namespaced attributes. But given how few of those exist that doesn't make much of a difference.)
{allowAttributes: [{name: {name: "id", namespace: null}, elements: "*"}]}
Why not flatten this out as {allowAttributes: [{name: "id", namespace: null, elements: "*"}]}
?
Why not flatten this out as
{allowAttributes: [{name: "id", namespace: null, elements: "*"}]}
?
Good question. That was actually in my original proposal. I thought that Peter was suggesting we remove the flattening and also use either a DOMString or an object with name + namespace like for elements, but maybe I am just imaging that now. I am open to both solutions.
Aside: My personal opinion is that as long as we can use just a string for what I suspect is the most common use case, the explicit syntax with namespaces doesn't matter much to me. At that point I would even accept name: { localName: "id", namespaceURI: null }
.
We just had a sanitizer meeting and I think we all comfortable with using syntax proposed in the initial https://github.com/WICG/sanitizer-api/issues/181#issue-1451286702. That means:
name
and namespace
.elements
key, so we just put name
and namespace
at the same level. (aka the "flattened syntax") namespace
is optional and defaults to null
, meaning the null namespace.(If elements
defaults to "*
" you could allow allowAttributes: ["id"]
to globally allow the id
attribute, but maybe that's best after some analysis of what people actually do in the wild.)
We discussed this again in the call and agree on this syntax.
As a side note, @annevk mentioned we can future-proof the syntax for future improvements by using element-name validation that already exists in createElement
@annevk wrote:
If
elements
defaults to "*
", you could allowallowAttributes: ["id"]
to globally allow theid
attribute
+1
Updated WebIDL based on what was discussed in #199. This doesn't quite match #196, because that seems to contain the star "*" syntax again, which AFAIK didn't really go into.
dictionary SanitizerElementNamespace {
required DOMString name;
DOMString? _namespace = "http://www.w3.org/1999/xhtml";
};
// Used by "elements"
dictionary SanitizerElementNamespaceWithAttributes : SanitizerElementNamespace {
sequence<SanitizerAttribute> attributes;
sequence<SanitizerAttribute> removeAttributes;
};
typedef (DOMString or SanitizerElementNamespace) SanitizerElement;
typedef (DOMString or SanitizerElementNamespaceWithAttributes) SanitizerElementWithAttributes;
dictionary SanitizerAttributeNamespace {
required DOMString name;
DOMString? _namespace = null;
};
typedef (DOMString or SanitizerAttributeNamespace) SanitizerAttribute;
dictionary SanitizerConfig {
sequence<SanitizerElementWithAttributes> elements;
sequence<SanitizerElement> removeElements;
sequence<SanitizerElement> replaceWithChildrenElements;
sequence<SanitizerAttribute> attributes;
sequence<SanitizerAttribute> removeAttributes;
boolean customElements;
boolean unknownMarkup; // Name TBD!
boolean comments;
};
Edit: 16 October. Removed Sanitizer SanitizerAttributeNamespaceWithElements
. Removed allow
prefixes. DOMString?
element namespace
.
15 November: #200 resolved, s/flattenElements/replaceWithChildrenElements/.
Thank you @evilpie for putting that together!
namespace
should probably always be DOMString?
even though that is silly, for compatibility with namespaceURI
.SanitizerAttributeNamespaceWithElements
wasn't needed for the MVP as it was just syntactic sugar.allow*
members should drop the allow
prefix.unknownNodes
or unknownElementsAndAttributes
might be cleaner?
- namespace should probably always be DOMString? even though that is silly, for compatibility with namespaceURI.
So what would happen for null
? Is it just treated as the HTML namespace, so the same as undefined
? AFAIK elements always have a namespaceURI.
- I thought we had decided in the meeting that SanitizerAttributeNamespaceWithElements wasn't needed for the MVP as it was just syntactic sugar.
I forget about that... I just looked at our notes and we did decide that! I've updated the IDL.
- I think the remaining allow* members should drop the allow prefix.
Makes sense to me. I just kept that.
- Not sure we have precedent for "markup" in APIs.
unknownNodes
orunknownElementsAndAttributes
might be cleaner?
unknownElementsAndAttributes
is quite the mouthful. Maybe we could quickly discuss that during our next Sanitizer meeting.
Anne pointed out that it's apparently possible to create elements with a null namespace in XML or explicitly via the API document.createElementNS
. We should probably allow it for future compatibility.
unknown*
also depends on #188.
We have been working on a new syntax that doesn't involve any micro syntax for namespaces, but instead either supports just a string (with an implicit default namespace) or an object/dictionary with an explicit name and namespace.
1. Use case: Elements with no namespace (HTML)
2. Use case: Mixing elements from different namespaces
3. Use case: Attribute + elements
4. Use case: Allow some attribute on any element (e.g. global attributes)
5. Use case: Allowing all attributes on a specific element
For this we aren't sure about the syntax.
6. Use case: Non-null attribute namespace