w3c / csswg-drafts

CSS Working Group Editor Drafts
https://drafts.csswg.org/
Other
4.5k stars 666 forks source link

[css-syntax] Trim whitespace around declarations? #774

Closed tabatkins closed 6 years ago

tabatkins commented 7 years ago

Right now, the Syntax spec is clear about what the value of a declaration is - it's all the tokens between the : and the ;. In particular, this includes any initial or final whitespace, which is pretty common in declarations like color: red; or in omitted-semicolon situations like:

.foo {
  color:red
}

When serializing non-custom properties, this whitespace doesn't matter; the property value has been parsed into a specialized data structure, and serialization just works off of that - it doesn't even attempt to preserve source whitespace. This means, for example, that you can be sure the first character of 'color' will be a non-whitespace char.

But the value of a custom property is the token list it parses as, so technically the initial/final whitespace is part of its value and should be serialized with it. Add to this the resolution at TPAC that custom properties should store their source representations so they can reserialize exactly as written (rather than, in some situations, serializing with extra comments inserted). On the other hand, this can result in unexpected whitespace in the serialization, as the initial/final whitespace isn't really considered to be "part of the value". (This is why I'm reporting it - one of the engineers on our team was writing some code with custom properties and got tripped up by this.)

So my question is: do we want to alter the "consume a declaration" algorithm to trim initial/final whitespace tokens? That dodges the resolution, as it won't be part of the value anymore. This will have zero effect on ordinary properties, just custom properties, and will likely bring custom property serialization slightly more in line with author expectations.

FremyCompany commented 7 years ago

+1

tabatkins commented 7 years ago

agenda+ for a yay/nay on this. I don't have a strong opinion either way.

AmeliaBR commented 7 years ago

How would this affect declarations of the form:

:root  { --my-var: ; }
.class { --my-var: }

…which currently is a valid CSS var consisting of a single whitespace token? Is it fine so long as the ; is there, but not in the second case?

The ability to create empty vars is useful when you are adding a variable to a list property:

background-image: var(--element-bg) linear-gradient(black, gray);
transform: var(--base-transform) rotate(45deg);
tabatkins commented 7 years ago

Those two are exactly equivalent - in the second, the declaration extends until it hits the }, so its value is also a single whitespace token in the current spec.

(There shouldn't be any difference between --foo: ; and --foo:;, but I think there technically is today, as <declaration-value> is defined to be a sequence of one or more tokens. I should add a ?.)

I'm not intending to make "empty vars" invalid, just to trim the representation of variables. Under this proposal, both of those would report the property's value as an empty string.

LeaVerou commented 7 years ago

Strong +1 for this. It's very much against author expectations that CSSOM methods return whitespace around a value and it means that --foo: bar and --foo:bar have different values, which could lead to a lot of bugs.

css-meeting-bot commented 7 years ago

The CSS Working Group just discussed [css-syntax] Trim whitespace around declarations?, and agreed to the following resolutions:

The full IRC log of that discussion <astearns> topic: [css-syntax] Trim whitespace around declarations?
<astearns> github topic: [css-syntax] Trim whitespace around declarations?
<astearns> github topic: https://github.com/w3c/csswg-drafts/issues/774
<fantasai> TabAtkins: This has impact on what custom property values are
<fantasai> TabAtkins: Because custom properties capture tokens
<fantasai> TabAtkins: Currently the white space before first value token is preserved
<fantasai> TabAtkins: All of the normal properties serialize out from OM, so they get normalized white space output
<fantasai> TabAtkins: Some people brought up maybe we should trim the white space from beginning and end of a token stream
<fantasai> TabAtkins: This would be a tweak to the Syntax spec
<fantasai> TabAtkins: to throw away this white space
<fantasai> TabAtkins: No consequence for ordinary CSS, they will continue to parse and serialize as usual
<fantasai> TabAtkins: It may or may not have an observable difference on serializing a rule of a style sheet... I think they generally reserialize from internal structures
<gregwhitworth> basically this saves authors from writing trim() when manipulating custom props
<fantasai> TabAtkins: Would have desired difference on serialization of custom property value when ppl write with typical whitespace after colon
<fantasai> TabAtkins: Saves authors from writing trim(), right, and also from forgetting to write trim() because they never have to write that for any other property
<fantasai> leaverou: Is there any use case where you want the white space
<fantasai> TabAtkins: If you're embedding an esoteric language...
<fantasai> TabAtkins: If you're embedding another programming language into CSS you'll have consequences anyway
<fantasai> TabAtkins: Don't think any other issue
<fantasai> leaverou: So benefit to it, and not hurting any use case. So I'm hugely in favor.
<fantasai> leaverou: Every time I use OM for custom properties, have to use trim(), it's really annoying.
<ChrisL> +1 to trimming
<fantasai> astearns: Proposal to trim whitespace on either side of all declarations. In favor / opposed?
<fantasai> RESOLVED: trim white space before / after property value in a declaration.
upsuper commented 7 years ago

If we do this, we should also tweak CSSOM spec to add the whitespaces back during serialization.

tabatkins commented 7 years ago

That's already there, no? https://drafts.csswg.org/cssom/#serialize-a-css-declaration

(This change actually makes the CSSOM round-trip values properly, instead of adding a space to the front.)

heycam commented 6 years ago

(There shouldn't be any difference between --foo: ; and --foo:;, but I think there technically is today, as <declaration-value> is defined to be a sequence of one or more tokens. I should add a ?.)

I'm not intending to make "empty vars" invalid, just to trim the representation of variables. Under this proposal, both of those would report the property's value as an empty string.

@tabatkins the ? never got added to the definition of --* in CSS Variables 1, so --foo: ; and --foo:; are both invalid per spec currently. Did you intend to change that?

heycam commented 6 years ago

The note at https://drafts.csswg.org/css-variables-1/#syntax

Note: While must represent at least one token, that one token may be whitespace. This implies that --foo: ; is valid, and the corresponding var(--foo) call would have a single space as its substitution value, but --foo:; is invalid.

needs updating either way.

prjnt commented 5 years ago

Note that the change to the custom var grammar means that --foo: ; results in --foo having its initial value, thus making background-image: var(--foo) #gradient equivalent to background-image: var(--no-such-var) #gradient per “substitute a var()” rules.

So allowing substituting an empty sequence of tokens requires change to those rules [or perhaps change to the initial value].

Another consideration is that --foo: ; still violates the core grammar, which entails extra work in both spec and implementations.

However, do we absolutely need to allow custom vars to have an empty value? 'background-image' should be doable using a none layer, and 'transform' can of course be done with an identity transform. It's certainly believable that uses exist; I'm just saying that reverting adding the ‘?’ to custom var grammar is an easy option for now, and might be taken if no-one can for now think of a convincing use.

tabatkins commented 5 years ago

Why would it make it have its initial value? "Empty" is not the initial value (but the initial value does, unfortunately, serialize as empty).

tabatkins commented 5 years ago

--foo: ; also doesn't violate the core grammar; it parses as a declaration just fine.

Being able to substitute into nothing is necessary; for example, a variable might control whether a box-shadow is inset or not, and the only way to represent "not inset" is by outputting nothing. ^_^ We've got several instances of this in CSS, where some value is indicated only by the absence of other values. (I don't really like this, and try to design properties to have all states representable by affirmative values, but I don't always win and we've got a lot of legacy properties...)

prjnt commented 5 years ago

Thank you for the inset example. I suppose there's no example where empty token sequence is absolutely necessary, in that no (other) property accepts an empty token sequence, but the inset example at least demonstrates significant inconvenience to disallowing an empty token sequence.

Regarding the initial value, css-variables-1 says

The initial value of a custom property is an empty value; that is, nothing at all.

Combined with the fact that custom property values in effect token sequences, I had taken this to mean an empty sequence of tokens. Possibly this understanding was influenced by the fact that an empty token sequence was not otherwise valid for custom property values at the time that I formed that understanding.

If that is a misunderstanding, then I suggest changing the spec accordingly. (And more generally, changing the initial value is one way of resolving the problem, as mentioned previously.)

As you say, the serialization value remains another problem that would need changing.

The point about core grammar influencing adoption costs was a minor one and I don't want to spend too much space on it here, but css-syntax does not mention the terms core grammar or syntax, which is still defined by CSS2, and I believe that this definition still has a strong influence on parsing code in use today. Its promises that "This section describes a grammar (and forward-compatible parsing rules) common to any level of CSS (including CSS 2). Future updates of CSS will adhere to this core syntax, although they may add additional syntactic constraints." might also be relevant to non-renderer code that handles CSS, such as editing tools. However, as I say, I don't consider this a major obstacle to change.

AmeliaBR commented 5 years ago

@tabatkins your most recent edits to the spec mostly address this. (That is, the new definition of the guaranteed-invalid value, which is now referenced as the initial value of custom properties, with clarifications that it is not the same as a valid-but-empty sequence of parsed tokens.)

However, the sentence @prjnt quotes is still there, in the section on Custom Property Value Syntax. It should be removed/replaced with a link to the new definition.

prjnt commented 5 years ago

Yes, both the guaranteed-invalid change and the change AmeliaBR proposes help.

I suggest adding after the first paragraph in the guaranteed-invalid section a clarification that (despite its name) this guaranteed-invalid value is nevertheless not an "illegal value": i.e. "--foo: initial" (or unset, or inherit) is not an invalid declaration to be discarded. The sentence currently at the end of the second paragraph is relevant to how this change is made (i.e. maybe some things should be reordered; and of course the rest of the second paragraph might need changing depending on the resolution of the serialization problem).

Regarding serialization: a problem similar to the one that Tab mentions for getProperty is that setProperty("--foo", "", ...) is currently defined as being equivalent to removeProperty("--foo"). However, changing the second argument to a space might be enough to avoid that. If so, then I wonder whether getProperty on a custom property should similarly return a space if the value is an empty sequence of tokens, so as to reduce bugs where a subsequent setProperty fails to restore the value current at the time of getProperty.

emilio commented 3 years ago

Also, there is this in the css-variables spec:

The allowed syntax for custom properties is extremely permissive. The <declaration-value> production matches any sequence of one or more tokens, so long as the sequence does not contain <bad-string-token>, <bad-url-token>, unmatched <)-token>, <]-token>, or <}-token>, or top-level <semicolon-token> tokens or <delim-token> tokens with a value of "!".

If we trim whitespace, we should allow zero tokens to be a valid value.