Distinguish between simple and complex initial values in data/css/properties.json

jwhitlock commented 7 years ago

In data/css/properties.json, the initial value can be:

A simple value, like 0, 0% 0%, none, or auto. This used to be wrapped like <code>0</code>
A key into l10n/css.json, like noneButOverriddenInUserAgentCSS or noPracticalInitialValue.
A list of strings, which are each one of the above. In practice, all current lists appear to be lists of literals.

I've proposed https://github.com/mozilla/kumascript/pull/162 to distinguish between cases 1 and 2, but it would be better to have a clearer signal in the data. Some straw man suggestions:

Different Keys for literal vs. prose initial values. Advantage - unambiguous. Disadvantage - users must update.

"align-self": {
    "initialLiteral": "auto"
},
"all": {
    "initialProse": "noPracticalInitialValue"
},
"animation": {
   "initialLiteral": [
      "animation-name",
      "animation-duration",
      "animation-timing-function",
      "animation-delay",
      "animation-iteration-count",
      "animation-direction",
      "animation-fill-mode",
      "animation-play-state"
    ]
}

Prefix for prose value, none for literals. Advantage - same key, quick check for literal vs prose. Disadvantage - pick a universal prefix, users must process to get prose key.

"align-self": {
    "initial": "auto"
},
"all": {
    "initial": "_prose_:noPracticalInitialValue"
},
"animation": {
   "initial": [
      "animation-name",
      "animation-duration",
      "animation-timing-function",
      "animation-delay",
      "animation-iteration-count",
      "animation-direction",
      "animation-fill-mode",
      "animation-play-state"
    ]
}

Initial objects instead of strings. Advantage - unambiguous, allows for future expansion, mixing literals and prose. Disadvantage - detect object vs list, overly wordy.

"align-self": {
    "initial": {
       "value": "auto",
       "type": "literal"
   }
},
"all": {
    "initial": {
       "value": "noPracticalInitialValue",
       "type": "prose"
   }
},
"animation": {
   "initial": [
          {"value": "animation-name", "type": "literal"},,
          {"value": "animation-duration", "type": "literal"},,
          {"value": "animation-timing-function", "type": "literal"},,
          {"value": "animation-delay", "type": "literal"},,
          {"value": "animation-iteration-count", "type": "literal"},,
          {"value": "animation-direction", "type": "literal"},,
          {"value": "animation-fill-mode", "type": "literal"},,
          {"value": "animation-play-state", "type": "literal"},
    ]
}

ben-eb commented 7 years ago

Agreed, I was relying on the previous behaviour of wrapping the initial literal with <code> also.

https://github.com/ben-eb/postcss-reduce-initial/blob/master/src/acquire.js#L20-L21

Elchi3 commented 7 years ago

Thanks for filing this John! We need a design decision here. I'm leaning towards option two "Prefix for prose value", but I have no strong feeling. Any other opinions?

chrisdavidmills commented 7 years ago

I think I prefer option 2 as well, as it seems like the least amount of effort to maintain and update, and cognitive overhead to understand. There is little in it between 2 and 3, but the syntax of 2 is shorter and easier. I think 1 risks being a bit more confusing than necessary - needing to get the right key as well as the right value.

jwhitlock commented 7 years ago

Two votes for using a prefix, so let's go down that path.

The goal is:

Make it clear to the first-time JSON reader that this is not a literal value, but instead is a key in (for example) l10n/css.json, or eventually a gettext string.
Never collide with an actual literal value

The implementer would write something like:

if (property === "initial") {
   if (value.indexOf('_prose_:') === 0 {
      key = value.split('_prose_:')[1];
      return l10n_strings_replace(locale, key);
   } else {
      return '<code>' + value + '</code>';
   }
}

I'm not sure _prose_: should be the marker that ships. Here's some options that spring to mind:

"initial": "_prose_:noPracticalInitialValue"
"initial": "_prose_key_:noPracticalInitialValue"
"initial": "_text_:noPracticalInitialValue"
"initial": "_text_key_:noPracticalInitialValue"
"initial": "_text_description_:noPracticalInitialValue"
"initial": "_l10n_key_:noPracticalInitialValue"

Any favorites? Other ideas?

Elchi3 commented 7 years ago

__l10nkey:noPracticalInitialValue ?

(I'm thinking of __compat which we use over in https://github.com/mdn/browser-compat-data )

lahmatiy commented 7 years ago

Different keys makes definition complicated and it breaks principle of single source of truth (both fields may to be filled somehow).
There is no needs for implementing any named prefix. Adding a char that can't be used in values (like the colon) should be enough (i.e. "initial": ":noPracticalInitialValue" or "initial": "!noPracticalInitialValue"). Anyway, any prefix looks like a hack, not a solution.
Initial objects looks a good solution. But literal type object should be an optional and a regular string should be an alternative (i.e. "initial": "value" and "initial": { "type": "literal", "value": "value" } should be the same).

When choosing a solution, it's worth considering that the solution must be scalable. The problem described also applies to other fields. For example, media field for all property contains "noPracticalMedia" value (I suppose "none" and "visualInContinuousMediaNoEffectInOverflowColumns" values are special cases too). Therefore solution based on additional keys in definition doesn't work since we'll get too many extra fields.

So my choice is using an object definition for special cases and left others as is. It's better over solution with prefixes for some reasons. One of them, we can define a JSON schema to validate it (I think it would be great to have a schema anyway).

PS While writing the comment, I found some media values have the same problem. I think it should be fixed and I've created a separate issue for that #63

jwhitlock commented 7 years ago

I think we all agree that additional keys is not the solution, no need for further beating of that straw man.

I prefer __l10nkey: as a prefix because:

It is very unlikely to ever be a valid value for a CSS property, HTML feature, etc.
It suggests that it is a key for localization. We have a folder called l10n, so maybe start looking there. It makes it a little self-documenting.
I can think of a few exceptions for your single-character prefixes:
- ! is a boolean operator in JavaScript
- The !important rule in CSS
- The CSS pseudo-classes :active, :link, :hover, and :visited. ::first-letter is a current value in border-width and others.

I don't like optional objects because I don't like having two ways to say the same thing. It makes parsing more complex and can lead to ambiguities. However, it would be an acceptable compromise for readability, since we're hoping for manual contributions. I agree that the short version should imply literal values.

JSON Schema is good for structural validation, but I don't think it is sufficient for validation of these keys into the string structure. For example, I think JSON Schema would be OK with "initial": {"type": "l10nkey", "value": "itsATrap"}, even if the key "itsATrap" doesn't appear in l10n/css.json. An additional validator is required, and that validator works just as well with prefixes as it does with expanded objects.

That's a long way of saying my preference is still a prefix of "__l10nkey:", and writing a validator in addition to the JSON Schema.

jwhitlock commented 7 years ago

See #67 for what the css/properties.json would look like with the prefixes applied everywhere. Some notable things:

It appears a lot of the values are lookup keys.
Values like yes, no, and all are lookup keys, which may be unexpected.

-webkit-border-before and possibly others have a mixed list:

"initial": [
  "border-width",
  "border-style",
  "__l10nkey:color"
],

lahmatiy commented 7 years ago

My thoughts about changes in #67:

First of all it's a dictionaries with constant (enum) values rather than a source for translations. Most "users" will never use l10n.json so prefix to be confusing.
Some values aren't translated so we get a mixed value style with and w/o prefix, it's confusing again.
Adding a prefix to every keyword looks odd.
Values in lists, like a media, stays untouched.
Context has meaning. When array is using (for initial, computed etc) it's a list of references to properties rather than a l10n key, so "__l10nkey:color" in your example is incorrect.

jwhitlock commented 7 years ago

Thanks for your feedback. I think some of it would be clearer on #67, where you can add your comments to specific lines of the changed JSON, and I can refine the PR to be more correct. There's some issues that you have identified that need fixed.

jwhitlock commented 7 years ago

After thinking about it for a while, I'm 👎 on using the prefix, and I like the idea of a different key for prose descriptions more, such as:

"align-self": {
    "initial": "auto"
},
"all": {
    "initialText": "There is no practical initial value for it."
},
"animation": {
   "initial": [
      "animation-name",
      "animation-duration",
      "animation-timing-function",
      "animation-delay",
      "animation-iteration-count",
      "animation-direction",
      "animation-fill-mode",
      "animation-play-state"
    ]
}

So the default of initial means "This is a literal value. Display in <code> in HTML" . The new initialText would mean "This is an English string describing the initial value. Localize it if you wish." This is a bigger change, but I think it will be more compatible with Pontoon-based translations.

SebastianZ commented 7 years ago

@jwhitlock How does the connection to our localization then look like?

Sebastian

jwhitlock commented 7 years ago

I was thinking something like this:

mdn/data just has the English strings.
The string extractor looks for keys with the *Text name, or we have an extra config file that says what keys are English strings (css/properties.json:*.initialText). Strings are stored in https://github.com/mozilla-l10n/mdn-l10n, under data.po files
The KumaScript macros that use the data apply the translated string at render time.
Other users can just use the English strings, or use the KumaScript code as a model for localizing their own UI. If there is demand, we can get the translation files into the npm package, as .po files or as JSON

My plan may change as I implement localization in the KumaScript macros. I plan to get this working for .ejs files first, but we have .json files there as well (overview in GroupData.json looks like it should be localized in some contexts). I'd want to tackle strings in this repo after the main Kumascript repo.

mdn / data

Distinguish between simple and complex initial values in data/css/properties.json #61