Open LeaVerou opened 2 years ago
cc @tabatkins since you're the grammar person
I actually think the centralization (and attendant maintenance burden) is a good thing, because it means you can find all the instances of a production in one place, rather than having to know all the places it's defined in. This is the same issue with partial in WebIDL, which has both good and bad parts to it.
For instance:
"and makes it more likely for these definitions to get out of sync as tokens are added or removed (case in point: I just added oklab() and oklch() to the <color> grammar because they had not been added there too)."
This proposal wouldn't actually help with this; if you can remember to write the <color> |= ... line, that's exactly equivalent to remembering to update the <color> = line instead. If you were going to forget one, you're going to forget the other.
We've seriously considered removing partial from WebIDL and instead requiring specs to just update the core definition, but it's always been rejected because there are just so many specs altering certain core interfaces like Document, so it would end up kinda unreadable. But that's not the case with CSS definitions; our stuff is pretty well centralized and we don't add new productions to things that often.
In the rare cases we have that are analogous, like defining all the things that are <length>s or whatever, we have a solution already - just say in prose that something is a <length>. It being a little clumsy is a bonus, because it encourages us to mostly just use the centralized grammar-based approach instead.
In theory, tooling can help with these issues. But there's no reason to rely on tooling when just authoring it directly is similarly easy.
Your argument may work when the changes needed are within the same spec. However, this centralization can be a big problem when multiple specs with varying levels of maturity extend the same tokens. Consider the following:
Spec A, very mature:
<a> = a | b | c
Spec B, early draft, extends with d token:
<a> = a | b | c | d
Spec C, early draft, extends with e token:
<a> = a | b | c | e
None of these definitions is now correct. Should C take B into account? Should B take C into account? Should A take B and C into account? Depending on their maturity levels, that may or may not make sense. But the maturity level can change at any point, which imposes an undue burden on editors of all three specs because the definition in spec C may need to change even without any changes being made to spec C, just because spec B became more mature.
While this is a simplified example, I cannot count the times when definitions got out of sync because one thing was updated and another thing somewhere else (even within the same spec) was not. Specs are essentially coding in natural language, and the same reasons one modularizes code apply here too. When humans keep making the same mistake, it's not a good policy to argue that they should "just be more careful" when the problem can be solved with tooling.
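To make the A/B/C scenario concrete, here is a minimal Python sketch of how distributed declarations could be resolved. It is an illustration only: the Declaration record, the spec labels and the resolve() helper are hypothetical names, and it assumes every extension is joined to the base definition with |.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Declaration:
    spec: str        # hypothetical spec label ("A", "B", "C")
    production: str  # the production being defined, e.g. "<a>"
    operator: str    # "=" for the base definition, "|=" for an extension
    value: str       # the alternatives contributed by this declaration


# Each spec states only what it contributes; B and C never mention each other.
DECLARATIONS = [
    Declaration("A", "<a>", "=", "a | b | c"),
    Declaration("B", "<a>", "|=", "d"),
    Declaration("C", "<a>", "|=", "e"),
]


def resolve(production, specs, declarations=DECLARATIONS):
    """Combine the base definition with the extensions from the chosen specs."""
    base = None
    extensions = []
    for decl in declarations:
        if decl.production != production or decl.spec not in specs:
            continue
        if decl.operator == "=":
            base = decl.value
        else:
            extensions.append(decl.value)
    if base is None:
        raise ValueError(f"no base definition for {production}")
    return " | ".join([base, *extensions])


print(resolve("<a>", {"A"}))
print(resolve("<a>", {"A", "B"}))
print(resolve("<a>", {"A", "B", "C"}))
```

Which specs count as mature enough becomes an input to resolve() rather than something baked into each spec's text, so a change in the maturity of spec B requires no edit to spec C.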
I did not check if this issue has been discussed at your last meeting, but I think modular/distributed grammar (value) definitions already exist with the New values field in some property definition tables in some delta specs, e.g. the contain property:
none | strict | content | [ size || layout || style || paint ]
The New values field in CSS Contain 3 is layout || style || paint || [ size | inline-size ]
Obviously, I picked this example on purpose because it seems ambiguous (at least to me). New values should replace what is inside [ size || layout || style || paint ], shouldn't it? To be fair, it may be an isolated case of a wrong (but unspecified, to my knowledge) usage of this New values field: all other New values may just work fine when joined using | to the corresponding "main" grammar definition.
But what if this "main" grammar definition is a || b and a delta spec wants to define c as its New values? It would be a hint that a "modular" grammar definition may also require ||=, noting that there would be nothing wrong with a || b | c as the result of extending a, as the "main" grammar definition, with b | c as its New values. But how do you extend a "main" grammar definition containing whitespace (or &&) separated tokens/productions?
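One possible answer to the precedence question, sketched in Python: when gluing two definitions together, bracket any operand whose top-level combinator binds more weakly than the glue. This relies on the precedence order of the value definition syntax (juxtaposition, then &&, then ||, then |). The extend() helper and its tokenization are assumptions made for illustration: tokens must be whitespace-separated and only [ ] groups are recognized.

```python
# Combinators of the CSS value definition syntax, weakest binding first.
# Juxtaposition is written as plain whitespace, represented here by " ".
PRECEDENCE = ["|", "||", "&&", " "]


def top_level_combinators(definition):
    """Collect the combinators used outside of any [ ... ] group."""
    found = set()
    depth = 0
    previous_was_term = False
    for token in definition.split():
        if token.startswith("]"):  # "]", "]?", "]*" all close a group
            depth -= 1
            continue
        if depth == 0:
            if token in PRECEDENCE:
                found.add(token)
                previous_was_term = False
            else:
                if previous_was_term:
                    found.add(" ")  # two adjacent terms mean juxtaposition
                previous_was_term = True
        if token == "[":
            depth += 1
    return found


def bracket_if_needed(definition, glue):
    """Wrap the definition in [ ] if it would otherwise be split by the glue."""
    weaker = [c for c in top_level_combinators(definition)
              if PRECEDENCE.index(c) < PRECEDENCE.index(glue)]
    return f"[ {definition} ]" if weaker else definition


def extend(main, new_values, glue="|"):
    """Join a main definition and its new values without changing either's meaning."""
    left = bracket_if_needed(main, glue)
    right = bracket_if_needed(new_values, glue)
    separator = " " if glue == " " else f" {glue} "
    return f"{left}{separator}{right}"


print(extend("a || b", "c"))        # glue | is the weakest, no brackets needed
print(extend("a | b", "c", "||"))   # main must be bracketed
print(extend("a b", "c", "&&"))     # juxtaposition already binds tighter
```

Because | has the lowest precedence, joining with | never needs brackets. That is why a plain | join is mechanically safe even when, as with contain, it may not express what the delta spec intended.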
The following may be another isolated issue. CSS Sizing 4 extends inline-size and width (among other properties) with stretch | fit-content | contain as their New values, and CSS Logical 1 defines inline-size with <'width'>. When processing the grammar definitions extracted by @webref/css from the specifications, I concatenate New values to the corresponding "main" value definition using | as the "glue" (which may be a wrong assumption, as noted above). Therefore it can result in parsing an input for inline-size twice against stretch | fit-content | contain. The value definition of inline-size (and some other logical properties) should not be extended with stretch | fit-content | contain, imo.
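The duplication described above can be reproduced with a small Python sketch. The data shapes below are hypothetical and simplified; they are not the actual @webref/css output format, and the main definition given for width is only a placeholder.

```python
import re

# Hypothetical, simplified data; not the real @webref/css structures.
MAIN = {
    "width": "auto | <length-percentage>",  # placeholder main definition
    "inline-size": "<'width'>",             # defined via a property reference
}
NEW_VALUES = {
    "width": ["stretch | fit-content | contain"],
    "inline-size": ["stretch | fit-content | contain"],
}

PROPERTY_REFERENCE = re.compile(r"<'([\w-]+)'>")


def extended(prop):
    """Concatenate the main definition and its New values with | as the glue."""
    return " | ".join([MAIN[prop], *NEW_VALUES.get(prop, [])])


def expand(prop):
    """Inline every <'property'> reference with that property's extended definition."""
    return PROPERTY_REFERENCE.sub(
        lambda match: f"[ {expand(match.group(1))} ]",
        extended(prop),
    )


print(expand("inline-size"))
```

The printed definition contains stretch | fit-content | contain twice: once inherited through <'width'> and once from the New values attached to inline-size itself.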
Finally, I think a delta spec may need to "rewrite" a grammar definition instead of extending it (while still preserving back-compatibility) for other reasons, making the usage of a modular grammar definition inconsistent.
The CSS Working Group just discussed modular/distributed grammar definitions.
There are a lot of cases where a certain value type / token consists of a disjunction of potential tokens, each defined in a separate section of the spec. However, there are still sections with "main" grammar definitions, that need to be updated every time new tokens are added or removed.
E.g.: Images:
Color 4:
Color 5:
This is a maintenance overhead for editors, and makes it more likely for these definitions to get out of sync as tokens are added or removed (case in point: I just added oklab() and oklch() to the <color> grammar because they had not been added there too).
It should be possible to define a subtype entirely by adding a section, without also having to update centralized grammars. This also makes it possible to extend types in separate specs or levels.
Perhaps in addition to =, we could define a |= operator for grammars that means "whatever this token can be from the rest of the spec(s), plus these tokens". E.g., right now, adding OKLab and OKLCH needed edits not just to add their section but also to extend <color> and in Color 5, to extend <color-space>.
If |= existed, the entirety of the grammar changes that these introduce could be self-contained in their section:
One could make the point that these summary grammars are useful (though not sure how useful they can be if out of sync). However, Bikeshed could generate them, similarly to how we generated indices, and that way they are guaranteed to be in sync.
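As a rough idea of what generating a summary grammar could involve, here is a minimal Python sketch. It is not Bikeshed's API: summary_grammar() is a hypothetical helper, and the grammar lines in the usage example are abridged and illustrative rather than the actual spec text. It collects = and |= lines, rejects an extension that has no base definition, and emits one combined definition per production.

```python
import re
from collections import defaultdict

# Matches lines such as "<color> = ..." or "<color> |= ...".
DECLARATION = re.compile(r"^\s*(<[^>]+>)\s*(\|=|=)\s*(.+?)\s*$")


def summary_grammar(lines):
    """Build combined definitions from distributed = and |= declarations."""
    bases = {}
    additions = defaultdict(list)
    for line in lines:
        match = DECLARATION.match(line)
        if match is None:
            continue  # prose or any other non-grammar line
        name, operator, value = match.groups()
        if operator == "=":
            if name in bases:
                raise ValueError(f"{name} has more than one base definition")
            bases[name] = value
        else:
            additions[name].append(value)
    orphans = sorted(set(additions) - set(bases))
    if orphans:
        raise ValueError(f"extended but never defined: {', '.join(orphans)}")
    return {
        name: " | ".join([base, *additions.get(name, [])])
        for name, base in bases.items()
    }


# Abridged, illustrative input; not the actual spec grammar.
spec_lines = [
    "<color> = <hex-color> | <named-color>",
    "Some prose between sections.",
    "<color> |= <oklab()> | <oklch()>",
]
print(summary_grammar(spec_lines))
```

Since the summary is derived from the per-section declarations, it cannot drift out of sync with them, and a |= that targets a production nobody defines is reported instead of silently dropped.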