New principle: Prefer reusing existing syntax even with validity constraints to introducing new syntax

This most recently came up in https://github.com/w3ctag/design-reviews/issues/955 but it’s something that comes up a lot, especially in the CSS WG.

There is a widespread belief that when introducing a new feature, if you cannot support the full syntax for an existing concept, it’s better to introduce new syntax that makes the restrictions clear than to reuse existing syntax, some forms of which would simply be invalid.

Examples:

- The most recent example that prompted this post: introducing a whole new calc-size() function to allow intrinsic size keywords (auto, fit-content, etc.) to be used in calculations. The reasoning for not doing it in calc() is that implementations cannot support more than one distinct keyword, as they follow different code paths depending on the keyword used.
- Before we thought we could do :has(), we realized we could do limited forms for specific pseudo-classes like :focus or :target. Instead of supporting :has(:focus) and :has(:target), the WG opted for completely separate syntax, :focus-within and :target-within. We were able to drop the latter since it had no implementations by the time we realized :has() was feasible, but we will need to support :focus-within forever. Some argue that there is still value in it, as it can be a faster code path, but if that’s the case, it’s exposing implementation issues as UI warts. Implementations can simply short-circuit :has(:focus) rather than needing a whole separate pseudo-class for this.
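
To make these two examples concrete, here is a rough sketch of the syntax that exists next to the reuse-based spelling argued for here. The class and element names are placeholders, the calc-size() form reflects my understanding of the current draft, and the calc() form with a keyword is hypothetical (it is not valid CSS today).

```css
/* What was introduced: a separate function for intrinsic size keywords.
   `size` refers to the basis given as the first argument. */
.panel {
  height: calc-size(auto, size + 2em);
}

/* Hypothetical reuse of calc() with a validity constraint
   (e.g. at most one distinct keyword). NOT valid CSS today. */
.panel {
  height: calc(auto + 2em);
}

/* What exists: a dedicated pseudo-class. */
form:focus-within {
  outline: 2px solid;
}

/* The reuse-based spelling, expressible now that :has() exists.
   Note: :focus-within also matches when the element itself has focus,
   which :has(:focus) on its own does not cover. */
form:has(:focus) {
  outline: 2px solid;
}
```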

Shadow DOM CSS is especially guilty of this:

- :host() being a functional pseudo-class instead of allowing authors to simply concatenate selectors with :host like they can do in every other selector scenario. E.g. they have to write :host([size]) instead of simply :host[size] to target a host element that has a size attribute.
- :host-context(): A whole new functional pseudo-class to query the host element and its ancestors. E.g. authors need to write :host-context(.foo) instead of the far more idiomatic .foo :host.
- ::slotted() being introduced as a pseudo-element rather than a combinator because we could not support the entire selector syntax if it were a combinator. As a result, it suffers from several ergonomics issues that the web components community has been repeatedly vocal about.
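
A side-by-side sketch of the three cases, written as they would appear in a shadow tree's stylesheet. The selectors in comments are the spellings proposed above and do not work today. The class name .dark and the declarations are placeholders.

```css
/* Current: the condition goes inside a functional pseudo-class. */
:host([size]) {
  font-size: 1.25em;
}
/* Proposed spelling (does not match the host today): */
/* :host[size] { font-size: 1.25em; } */

/* Current: a dedicated pseudo-class for conditions on the host's context. */
:host-context(.dark) {
  color: white;
}
/* Proposed spelling (does not work today): */
/* .dark :host { color: white; } */

/* Current: a pseudo-element that accepts only a compound selector. */
::slotted(p) {
  margin: 0;
}
/* Because it is a pseudo-element, nothing can be selected past it;
   for example, `::slotted(p) em` is invalid. */
```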

I’m sure there are a lot more.

There are two reasons I think this is an antipattern.

Usability

A language’s usability correlates strongly with having few primitives that can be combined in many different ways, rather than introducing new primitives for any new combination. Once authors learn about the primitives, they can guess the combinations, but new primitives need to be learned separately.

Furthermore, they will try the combinations anyway, and be surprised when they don’t work. This could be an argument for using new syntax (since reusing existing syntax with validity constraints means some forms of the syntax won’t work), but the invalid forms are one step further removed, so authors are less likely to hit them. E.g. in the calc-size() example above, it’s much more likely that an author will try to combine auto with calc() than to combine multiple keywords with calc().

Evolution

Often, limitations that appear intractable at the time will be loosened or removed later on. If we’ve introduced new syntax to communicate the limitations, we’re stuck with it, and have to support it forever. If we’ve reused existing syntax that simply disallows some combinations, it’s trivial to expand the range of things allowed, and to do so gradually, rather than facing the all-or-nothing choice of new syntax that is designed around the current limitations.

Existing established primitives also work better with new features. To use one of the examples above, shadow DOM CSS is a mess when it comes to CSS Nesting, because CSS Nesting is designed around regular selectors, not selectors whose context is in a parenthetical pseudo-class.
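
A minimal sketch of the Nesting friction, assuming a component that styles itself by a size attribute. The .card class and the declarations are placeholders.

```css
/* Regular selectors: the extra condition is nested by appending to &. */
.card {
  display: block;

  &[size] {          /* matches like .card[size] */
    font-size: 1.25em;
  }
}

/* Shadow host: the condition lives inside the parentheses of :host(),
   and & cannot be extended "into" the parent selector
   (`&([size])` is not valid selector syntax).
   The variant therefore has to be written out as a separate rule. */
:host {
  display: block;
}

:host([size]) {
  font-size: 1.25em;
}
```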

Are there any examples of this beyond CSS?

A few comments on the specifics above:

- The constraints around calc() or calc-size() aren't specific to implementations; they're about the logic required by existing specifications (whether written like flex and grid and multicol or not-really-written like tables) having branches on value types, and web content depending on that logic.
- :focus-within and :target-within are substantively different from :has(:focus) and :has(:target) in that they have different (and probably more desirable) behavior when crossing shadow DOM boundaries. (I'm also still not convinced :has() was a good idea; I don't think the performance concerns about it were "solved", many of them were instead ignored.)

Also a few more general thoughts:

I think teachability of performance characteristics matters. While engines can and sometimes do optimize things, features of the platform do have real performance characteristics -- they require work that takes time. I think designing a platform on the assumption that authors never need to understand or be aware of its performance characteristics is a bad idea. Such a design is likely to lead to slower pages for end users.

I also think exposing things in a way that reflects how the underlying platform works is (in general, certainly not in every specific case) a good idea. I don't think we should treat browser engines as deep magic that only a small group of people are able to understand. I think it helps with understanding how things are going to work when they're put together, understanding whether things can be put together, and understanding what new features can be made to fit into the existing system.

And even more generally, I think the issues discussed here are to some degree specific to things that are not programming languages (for, I suppose, a very specific definition of a programming language that I'm confident is not universally agreed on). In a programming language, there is generally a concept of code execution where a piece of code is executed at a clearly defined time or times. Composition of pieces of code (such as variables or function calls) is managed by the author of the code, subject to well-defined constraints such as those of a type system. Using the result of one function call as the input for another is allowed, and the author of the code is responsible for the performance characteristics of any such use when iteration or recursion is involved. That's not how CSS works; the values specified in CSS are used in very complex ways that are defined across many specifications and that already need to handle many property and value interactions. A CSS declaration is not a piece of code that is executed at a specific time.

I think there are arguments on both sides of the tradeoff of exposing how things work underneath (and helping users of CSS understand it) versus exposing things in a simplified way that covers up what happens underneath but is consistent with other features. How we should make that decision varies case by case, depending on things like how permanent the underlying characteristics are and how useful it is to expose them. I don't think it makes sense to document a principle to always fall on one side of this tradeoff. (I think perhaps you could even construe the extensible web manifesto as calling for always falling on the other side. I don't think that's the right answer either, though I think it makes good arguments that we should bias in favor of exposing primitives when it's not problematic to do so.)

I strongly disagree with this as a principle. I think every single example you've given here, while sometimes showing slightly awkward trade-offs, was ultimately decided correctly, and trying to instead reuse existing syntax in the ways you're suggesting would have been a very bad move. In general I agree with @dbaron here, in that this is something that needs to be decided on a case-by-case basis, and actually should usually lean in the opposite direction by default.

- `calc-size()` I'll leave to the TAG review for it.
- `:has()` works differently with shadows, as @dbaron said. More generally, an important thing to consider for teachability is what the "validity boundary" of some syntax is. If it *looks* like you should be able to write a particular syntax, because that syntax is valid elsewhere, but it's actually invalid in this context, that's a usability footgun. Not always avoidable, but when it's necessary to introduce, the boundary between "valid" and "invalid" syntax in the new context should be *as simple to define/remember as possible*, and ideally should make sense from larger principles rather than just being due to some arcane restrictions of the underlying concepts. In this case, allowing `:has()` to accept a *very limited subset of selectors* would violate that. There's no broadly-arguable reason those selectors, in particular, were chosen to be valid while all the others weren't; it's just that we, the WG, decided the use-cases for those particular selectors were well-justified enough to be worth the impl complexity and perf cost. Especially for things where we'll slowly adjust the boundary over time, for similarly unique/one-off reasons, this is just a very difficult thing to teach. It's also a forward-compat hazard. People *will* try to use other stuff, and when it doesn't work, leave it in their CSS and write more stuff that does work. If previously-invalid stuff suddenly becomes valid, that can break pages, and we've experienced this sort of thing in the past. All of these issues are avoided by (a) minting special-purpose pseudo-classes that clearly do one thing and one thing only, and (b) minting a general-purpose `:has()` with minimal restrictions on its syntax. Which, luckily, are the things we did.
- `:host()` is the way it is because the host element *shouldn't match selectors normally*. Doing so would be surprising, since the shadow author doesn't have control over the host element's markup, the component user does. And this is a necessary rule - without it, the shadow author either has to code defensively all the time (which doesn't happen), or document every selector assumption they make (annoying and also doesn't happen), or just code as if the host didn't match and sometimes get a broken component thru no fault of their own. It is a general rule in CSS that simple selectors are always *filters* - all they can do is reduce the set of currently-matched elements, and can't change it in any other way. In other words, adding a simple selector to a compound will always leave the set of matched elements the same or remove some of them; similarly, removing a simple selector will always leave the set the same or add to it. If `:host[size]` were allowed, and we still had the restriction that `[size]` wasn't allowed to match the host element, then these assumptions would be violated. The presence of the `:host` changes the matching context of the *entire compound selector*: an `[a]` selector might match nothing, but adding `:host[a]` suddenly makes it match something. This sort of context-changing is both unexpected, and *incredibly confusing* when combined with other features. For example, does `[a]:not(:host[b])` match a host element? The "host-ness" is only in a *negation* - does it negate the host-ness of the argument, or carry it thru? In `[a]:is(:host[b], [c])`, would a host with "a" and "c" attributes match? The host-ness is only expressed in the `:host[b]` selector, but does it infect the entire `:is()`? Instead, the current design puts the funky twist in matching logic onto the host element itself (it can't match anything except `:host`), and then matching works exactly as normal from that point on. When you do need to violate the "doesn't match anything" assumption, it's clear precisely what is being used for that matching, and it doesn't cause any of our assumptions to be violated.
- `:host-context()` has all the same issues as the above, but more so. We *cannot* match things above the host in general, for perf and architectural reasons, and for general usability reasons (again, the shadow author doesn't control the markup outside the shadow, and might not be thinking about what the page author might be doing), so the only way to even remotely approach this would be to give the compound selector with `:host` in it special meaning, allowing us to split the selector into "inside the shadow" and "above the shadow" fragments. Then, again, what happens when you use `:host` twice? This might be a perfectly reasonable selector if the two instances are both in branches of `:is()` pseudos, for instance. The fact that, for perf reasons, we limit the "above the shadow" selector to a single compound selector just makes it worse, because now there's a rule that `:host` is allowed to appear in the *first or second* compound of a selector, but nowhere else. The current design, as with `:host()`, lets you violate the "can't match that stuff" restriction with a very clearly delineated chunk of selector, which doesn't violate our normal assumptions. It also, happily, solves what is otherwise a usability annoyance, which is that the "context" selector can be *on* the host *or above it*, which would have required the shadow author to always write `:is(.foo:host, .foo :host)` otherwise.
- `::slotted()` is the only one that's marginal in this list. The general reasons I've already given about selector assumptions and validity boundaries still apply, but it *does* end up with unfortunate usability problems as a result. There just wasn't a great solution here.
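
To illustrate the "filter" argument, here is a small sketch of a stylesheet inside a shadow tree. It assumes a hypothetical host element written as `<my-widget a>` in the page. The rules in comments are the combinator-style spellings under discussion and are not how matching works today.

```css
/* Ordinary selectors do not match the host from inside its shadow tree. */
[a] {
  color: red;        /* does not apply to <my-widget a> */
}

/* The exception is explicit and self-contained. */
:host([a]) {
  color: red;        /* applies to the host if it has attribute "a" */
}

/* Hypothetical: if this were allowed, ADDING the simple selector :host
   to the compound [a] would make it match MORE elements than [a] alone,
   breaking the rule that simple selectors only ever filter. */
/* :host[a] { color: red; } */

/* :host-context() covers a match on the host itself or on an ancestor. */
:host-context(.foo) {
  color: blue;
}

/* With the combinator-style spelling, covering both cases would need: */
/* :is(.foo:host, .foo :host) { color: blue; } */
```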

So, in general, I strongly disagree with this as a principle.

w3ctag / design-principles

New principle: Prefer reusing existing syntax even with validity constraints to introducing new syntax #497
