TypedOM returning an array of strings isn't a helpful behavior

FremyCompany commented 8 years ago

@tabatkins, @shans and I discussed in another thread the currently specced behavior of TypedOM when the specified value of a property contains a variable reference, or when the property being analyzed doesn't have a type that can be expressed in TypedOM.

Problem

Here are two examples of such situations:

height: calc(42px + var(--foo, 15em) + var(--bar, var(--far) + 15px))

and

--grid-gap-increment: 5px;

Suppose you want, as an author, to use TypedOM to animate those values efficiently. You will find that the spec fails to deliver what it promises in its abstract:

Converting CSSOM value strings into meaningfully typed JavaScript representations and back can incur a significant performance overhead. This specification exposes CSS values as typed JavaScript objects to facilitate their performant manipulation.

Indeed, calling CSSValue.parse or using getStyleMap on those cases will produce a CSSTokenStreamValue (currently being renamed, my proposal being CSSUnparsedValue or CSSRawValue, see #193).

Currently, parsing the previous "height" declaration returns:

[ 
    "calc(42px + ",
    { 
        variableName: "foo", 
        fallback: [ " 15em" ], 
        __proto__: CSSVariableReferenceValue.prototype 
    }, 
    " + ",
    { 
        variableName: "bar", 
        fallback: [ " ", { variableName: "far" ... }, " + 15px" ], 
        __proto__: CSSVariableReferenceValue.prototype 
    },
    ")"
]

The problem is that there is not much JavaScript code can do with this. Unless all you want is to update the variable references, it is easier to parse the string again with a custom-made parser than to try to interpret the sliced strings.

Basically, if you want to modify this value in any way in JavaScript, you are now forced to parse this string on your own. That makes TypedOM impossible to use as it was initially intended in this case, which I do not think is an edge case.
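To make the cost concrete, here is a minimal sketch of what changing the 42px term looks like against the first few slices of the array above. Plain objects stand in for CSSVariableReferenceValue, and the helper name bumpFirstPxLength is hypothetical. The point is only that the number has to be dug out of a string with a hand-written pattern.

```javascript
// Plain-object stand-in for the start of the slice array shown above
// (not real TypedOM objects).
const slices = [
  "calc(42px + ",
  { variableName: "foo", fallback: [" 15em"] },
  ")",
];

// Add `deltaPx` to the first px length found in any string slice.
// Variable references are passed through untouched.
function bumpFirstPxLength(input, deltaPx) {
  let done = false;
  return input.map((slice) => {
    if (done || typeof slice !== "string") return slice;
    return slice.replace(/(-?\d*\.?\d+)px/, (match, number) => {
      done = true;
      return `${parseFloat(number) + deltaPx}px`;
    });
  });
}

const updated = bumpFirstPxLength(slices, 8);
// updated[0] is now "calc(50px + "
```

The regular expression is exactly the kind of ad-hoc parsing the spec abstract says authors should not need: it knows nothing about CSS tokens and would happily match a px suffix inside an unrelated identifier.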

Now, because of the variable references, it would of course be impossible to parse this entirely normally. The proposal here would be to expose just enough syntax to help authors change numeric values, or replace one keyword with another, easily.

Proposal

Under my proposal, here is what the returned value for the previous height declaration would look like:

CSSUnparsedValue { cssText: "calc(42px + var(--foo, 15em) + var(--bar, var(--far) + 15px))", content: [
    CSSUnparsedFunctionValue { cssText: "calc(42px + var(--foo, 15em)...", functionName: "calc", content: [
        CSSLengthValue "42px",
        CSSUnparsedValue { cssText: "+" },
        CSSVariableReferenceValue { variableName: "foo", fallback: [ CSSLengthValue "15em" ] },
        CSSUnparsedValue { cssText: "+" },
        CSSVariableReferenceValue { variableName: "bar", fallback: [ ... ] }
    ] }
] }

The full proposal would be as follows:

CSSUnparsedValue would not be an iterable of strings; the current iterable would instead live on a property of CSSUnparsedValue called cssTextSlices. That would be an L1 change.
In addition, in either L1 or L2, CSSUnparsedValue would get a content property, whose value would be an array of CSSUnparsedValue (or null if the CSSUnparsedValue cannot be divided into smaller parts).
Inside content, values are at least broken apart by AST roots: 5px url(...) center would become three values, each with its own cssText: 5px, url(...) and center. We can do this without loss of generality based on whitespace and block processing in css-syntax.
Values like numbers and numbers+units get parsed properly based on their unit if it is known, or as CSSUnparsedMeasure { value:number, type:string } if not.
Blocks and functions can be introspected (blocks as CSSUnparsedBlock { openingChar:string, content:CSSUnparsedValue, closingChar:string } and functions as CSSUnparsedFunction { functionName:string, arguments:CSSUnparsedValue }).

Everything else ("red", "#foo", ...) remains a dummy atomic CSSUnparsedValue value that can only be analyzed through cssText. That would be sufficient to solve the efficiency problem in cases where you need to update the numeric value of something and currently have to resort to string manipulation to do so.
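As a rough illustration of the content array described above, the following sketch splits a value into top-level components, keeps functions and blocks intact, and recognises numbers with units. It is a simplified stand-in for the css-syntax algorithm, not an implementation of it: it ignores strings, comments and escapes. The function names and the kind field are hypothetical; the value/type and functionName/arguments fields follow the shapes named in the proposal.

```javascript
const CLOSERS = { "(": ")", "[": "]", "{": "}" };

// Split a CSS value on top-level whitespace. Anything inside
// parentheses, brackets or braces stays within its component.
function splitComponents(cssText) {
  const parts = [];
  const stack = [];
  let current = "";
  for (const ch of cssText) {
    if (CLOSERS[ch]) stack.push(CLOSERS[ch]);
    else if (stack.length && ch === stack[stack.length - 1]) stack.pop();
    if (/\s/.test(ch) && stack.length === 0) {
      if (current) parts.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  if (current) parts.push(current);
  return parts.map(classify);
}

// Give each component the loosest useful shape: a measure, a function,
// or an atomic value that is only readable through cssText.
function classify(cssText) {
  const measure = /^([+-]?\d*\.?\d+)([a-z%]*)$/i.exec(cssText);
  if (measure) {
    return { kind: "measure", cssText, value: Number(measure[1]), type: measure[2] };
  }
  const fn = /^([a-z-]+)\((.*)\)$/is.exec(cssText);
  if (fn) {
    return { kind: "function", cssText, functionName: fn[1], arguments: fn[2] };
  }
  return { kind: "unparsed", cssText };
}

const components = splitComponents("5px url(a b.png) center");
// components has three entries: a measure (value 5, type "px"),
// a function (functionName "url"), and an unparsed "center".
```

With this shape, changing the numeric part is a property write on the measure entry rather than a string search, which is the efficiency argument made above.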

Calls to action:

What do you think about that?
Is there anyone else who cares about the issue?

Counter-proposal

It has been suggested that, rather than replacing string arrays with free-form parsing, a solution would be to use a var(...) for each variation point inside the value. TypedOM would then only be used on simple values that can be defined using CSS Typed Custom Properties, and would provide a full TypedOM experience instead of a degraded one.

To make it possible to change the expression that contains the var(...), the spec would be changed so that the expression ends up being parsed; for colors that would be something like CSSColorValue { red: 200, green: 150, blue: CSSVariableReference{variableName:...} } (details tbd).

That doesn't solve issues where you need to extract values out of a property that is not representable as simple values, but this solves enough of the problem to be an appropriate resolution of this issue.
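In terms of APIs that browsers later shipped, the counter-proposal looks roughly like the sketch below: the stylesheet keeps the calc() expression and exposes one variation point as a custom property, which script registers as a length and then updates through the style map. CSS.registerProperty, CSS.px and attributeStyleMap postdate this thread, and their names differ from the class names used above. The property name --offset, the .panel rule and the helper names are hypothetical. The CSS object is passed in as a parameter only so the sketch can be exercised outside a browser.

```javascript
// Assumed stylesheet (hypothetical):
//   .panel { height: calc(42px + var(--offset)); }

// Register --offset as a <length> so it is exposed as a typed value
// instead of an unparsed token list.
function registerOffset(css) {
  css.registerProperty({
    name: "--offset",
    syntax: "<length>",
    inherits: false,
    initialValue: "0px",
  });
}

// Update only the variation point. The calc() expression in the
// stylesheet is never read, parsed or rewritten by script.
function setOffset(css, element, pixels) {
  element.attributeStyleMap.set("--offset", css.px(pixels));
}

// In a browser this would be called as:
//   registerOffset(CSS);
//   setOffset(CSS, document.querySelector(".panel"), 15);
```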

[EDIT] COUNTER-PROPOSAL APPROVED

Tab's counter-proposal was accepted as a valid fix for this problem. The remaining purpose of this issue is to track proper resolution in TypedOM (most probably L2) when we get around to writing that second level.

shans commented 8 years ago

Quick note, your second example isn't one we should care about because authors who want a typed representation would simply register the property as a length-typed property using the properties and values API.

FremyCompany commented 8 years ago

In the end game, I can envision it. But as you mentioned, many value types cannot be represented at this point, and I am not sure we will ever create enough parser entry points to represent any possible css property syntax.

Say I'm interested in building a css-grid polyfill (or a new css-grid-like layout engine), it would be pretty hard to create a parser you can declaratively teach to parse a --polyfilled-grid-template-rows, for instance.

With this proposal, I could still get something usable which would not require me to build a css parser.

tabatkins commented 8 years ago

It seems like this could be addressed just as well by us explicitly exposing the "parse as X property" function. Then you can just fill in the variables as you find them, concat everything to a single string, and pass to that function, getting back the result you want.
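A sketch of that flow, assuming the slice-array shape shown in the first post (modelled here with plain objects) and a hypothetical lookup table of variable values. The inner --far reference is elided in the original listing, so its null fallback below is an assumption. The final step, handing the string to a "parse as X property" function, is left as a comment because that entry point only exists in a browser.

```javascript
// Resolve every variable reference and concatenate the slices into one
// string. `variables` maps names (without the leading --) to their text;
// names that are missing fall back to the reference's fallback slices.
function flatten(slices, variables) {
  return slices
    .map((slice) => {
      if (typeof slice === "string") return slice;
      if (slice.variableName in variables) return variables[slice.variableName];
      if (slice.fallback) return flatten(slice.fallback, variables);
      throw new Error(`--${slice.variableName} has no value and no fallback`);
    })
    .join("");
}

const heightSlices = [
  "calc(42px + ",
  { variableName: "foo", fallback: [" 15em"] },
  " + ",
  {
    variableName: "bar",
    fallback: [" ", { variableName: "far", fallback: null }, " + 15px"],
  },
  ")",
];

// --foo and --far have values; --bar does not, so its fallback is used.
const text = flatten(heightSlices, { foo: "10px", far: "2em" });
// text is "calc(42px + 10px +  2em + 15px)" (whitespace comes from the slices)

// In a browser, `text` would then go to the property parser,
// e.g. the CSSValue.parse entry point mentioned in this thread.
```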

Regarding your actual proposal, I'm not seeing how it helps much. It gives you the gross structure of the value, but nothing else. What are you planning to do with the value based on that?

It also does precisely what I said I didn't want - exposes a partial parse, where some values are recognized and given real Typed OM values, while others are left as basically strings, based solely on whether CSS grammars generally consider them unambiguous or not. This sort of complexity isn't future-friendly, and I don't think is particularly author-friendly, either.

(Your proposal is definitely possible - we do tokenize the value, and do a generic parse (per Syntax - assemble blocks/functions, but no grammar-checking), and could expose things like that. I just don't see what it really gets you. Beware attractive complexity without underlying use-cases!)

FremyCompany commented 8 years ago

I hear you. I do have use cases, though. My two use cases are:

Use case 1: When there is a calc expression involving a variable, I might want to animate some part of it and right now I can't -- I'm forced to parse the calc(...) string and reconcat everything to be set as a string (because there is no structure that exists which would help me set this as a typed value).

The intermediate structure I propose -- while not perfect -- would allow me to manipulate the structure to change just the part I need and set the property to this modified value, avoiding the use of strings altogether. Most of the time, what you need to change in a loop is a numeric expression.

Use case 2: When I polyfill things like css-grid, there is no native property with a syntax close enough to help me in this case.

This basic syntax parsing would give me everything I need to build a light-weight transpiler to data structures that are useful to me, instead of forcing me to roll an entire css parser just for the purpose of understanding the basics which the browser already understands.

I believe the first use case is definitely a primary target for the spec. The second one is a requirement for me personally to use the API in most of my projects, but I can understand it might look like a secondary target only for the spec.

tabatkins commented 8 years ago

When there is a calc expression involving a variable, I might want to animate some part of it and right now I can't

Can you elaborate on this? What do you mean by it? You're doing pure-JS animation-by-hand? Why not just use normal animations, or more variables?

When I polyfill things like css-grid, there is no native property with a syntax close enough to help me in this case.

Cool, that's an argument more for giving you access to a generic parser, which we agree we want to do. I don't think it's appropriate to try and solve this need in the OM structure for var-containing values, tho - this is a broader request that you'll also have for plain strings.

FremyCompany commented 8 years ago

You're doing pure-JS animation-by-hand? Why not just use normal animations?

For instance, if I am reacting to the user touching the screen to progress the animation (swipe to dismiss, page curl animation, etc). I think this is one of the general use cases that the Typed OM spec targets.

Why not just use [...] more variables?

The proposal of using another variable is interesting and probably viable :-) That being said, it looks more like a work-around than a solution. TypedOM is supposed to give me access to object representation of values, not force me to decompose my css into a bunch of variables to make editing possible at all.

While possible, it might also be impractical in some cases; if what you are animating is the values of a transform matrix, you would end up with 16 custom properties representing the 16 coefficients of the matrix.

[Post Scriptum on use-case 2] I think a "parse as X property" is already available in the spec (see CSSValue.parse); the only problem is that you have to split the string yourself into chunks other properties can understand.

This is something the minimalist structure would do for you. If I were to create a property whose content could be matrix arithmetic (e.g. scale(2) * matrix(...) + matrix(...)), I could use my proposal to identify the matrix/scale/translate/etc. CSSUnparsedFunctionValue(s) and parse them as transform using CSSValue.parse to get the matrix object I need. If I just had strings, I would have to parse them myself to find out what I can delegate to "transform" in the first place.

tabatkins commented 8 years ago

For instance, if I am reacting to the user touching the screen to progress the animation (swipe to dismiss, page curl animation, etc). I think this is one of the general use cases that the Typed OM spec targets.

Okay, so you want to be targeting the property at a level where variables aren't an issue. You don't want to have to go digging around inside an unparsed or partially-parsed value-with-holes-for-variables data structure just to do this. Either read from the computed value, or add an additional variable, give it typing information, and manipulate that in your JS. Then you get good, fully-parsed Typed OM objects.

That being said, it looks more like a work-around than a solution. TypedOM is supposed to give me access to object representation of values, not force me to decompose my css into a bunch of variables to make editing possible at all.

What we keep repeating to you is that prior to computed value, we can't do any significant Typed OM stuff because we don't know what the types are. You're trying to bring up examples where you just need to poke at a dimension or something, but in many cases it is literally impossible (without full information about how the page uses the variable) to tell what a given token represents in a pre-substitution value.

For example, box-shadow: red 2px 2px var(--foo) 0px; looks like the --foo var is just setting the blur radius, and so the last length is the spread distance. But it could be --foo: , blue 3px;, so that last length is actually the vertical offset on a second shadow. Is this silly? Of course! But it's not something we can ignore, and you shouldn't either.
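The ambiguity can be reproduced with plain string substitution. The sketch below is not how a browser substitutes variables (real substitution works on tokens); it textually replaces var(--foo) and splits on commas, which is enough to show the trailing 0px changing roles. The helper name shadowsAfterSubstitution is hypothetical.

```javascript
// Substitute --foo textually, then split into shadows (on commas) and
// into components (on whitespace).
function shadowsAfterSubstitution(value, fooValue) {
  return value
    .replace("var(--foo)", fooValue)
    .split(",")
    .map((shadow) => shadow.trim().split(/\s+/));
}

const declared = "red 2px 2px var(--foo) 0px";

const asBlur = shadowsAfterSubstitution(declared, "4px");
// One shadow: [["red", "2px", "2px", "4px", "0px"]]
// Here the trailing 0px is the spread distance.

const asSecondShadow = shadowsAfterSubstitution(declared, ", blue 3px");
// Two shadows: [["red", "2px", "2px"], ["blue", "3px", "0px"]]
// Now the same 0px is the vertical offset of the second shadow.
```

Nothing in the pre-substitution value distinguishes the two cases, so the role of the trailing token cannot be decided before the variable's value is known.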

So no, using a separate var is not a workaround, it's exactly how this is supposed to be done.

(If we have typing information for all the variables in a property, we should be able to go ahead and do a full grammar match/decomposition, and then present you with the full normal Typed OM value you'd get without variables; some of the leaf values would just be vars rather than the expected type, is all. That's something I'm happy to expand this to in level 2, as it's fairly complex for implementations.)

This is something the minimalist structure would do for you.

Again, I explicitly said that having a "generic parse" exposed is a Good Idea - it's part of the hand-wavey Parsing API we want to do. As the example you provide points out, there's nothing about this request that is var-specific - you're not using a single var in scale(2) * matrix(...) + matrix(...), but you still want it broken down for you into reasonable chunks.

So this use-case has nothing to do with vars, and doesn't belong in this thread. Move it to a collection of Parser API use-cases. ^_^

shans commented 8 years ago

If we have typing information for all the variables in a property, we should be able to go ahead and do a full grammar match/decomposition, and then present you with the full normal Typed OM value you'd get without variables; some of the leaf values would just be vars rather than the expected type, is all.

Even that is hard - for example calc(var(--foo) + var(--bar)) can't simply represent the variables as leaf nodes even if --foo and --bar are guaranteed to be lengths, because the units are unknown. It's something we can look at though.

tabatkins commented 8 years ago

Yeah, we can only do full leaf decomposition when the OM decomposition is based on position + type. calc() is a special-case and we'd have to represent it as a CSSUnknownFunctionValue or something, which just said that it was a "calc" function and had the string+var list. (Same as if we exposed the generic parser I'm talking about and we encounter any function at all.)

FremyCompany commented 8 years ago

For example, box-shadow: red 2px 2px var(--foo) 0px; looks like the --foo var is just setting the blur radius, and so the last length is the spread distance. But it could be --foo: , blue 3px;, so that last length is actually the vertical offset on a second shadow.

Of course, but why do you always bring this up? I fail to understand how this affects my proposal... My proposal would literally have returned an unparsed {cssText,cssTextSlices,content} value whose content is an array of 5 values (a bare one for "red", a {cssText,value,type} one for each of "2px"/"0px", and a {cssText,variableName,fallback} one for "var(...)").

If I, as the author, know that I will set the variable to a length, I can go ahead and update the next numeric component by setting its value to something else, then reassign the whole thing to the style. I will have to formally make that assumption, no "pre-parsing as a list of box-shadow" will be made for me, but I am still free to write code based on that assumption and modify the raw data.

Now, I hear you are not a huge fan of this; I'm just telling you this is a behavior that users of your API are likely to find useful because it enables them to not ship their own parser. I cannot take the computed value where variables have been replaced and set it as the specified value, because I would lose the variable references in the process.

Again, this is probably fine as a v2 addition, but I believe this is something you will find limiting when trying to use the api in the wild.

If we have typing information for all the variables in a property, we should be able to go ahead and do a full grammar match/decomposition, and then present you with the full normal Typed OM value you'd get without variables; some of the leaf values would just be vars rather than the expected type, is all.

This would provide a solution to my first use case, but it looks more difficult to me than you seem to think it is. I might be wrong, though...

tabatkins commented 8 years ago

If I, as the author, know that I will set the variable to a length, I can go ahead and update the next numeric component by setting its value to something else, then reassign the whole thing to the style. I will have to formally make that assumption, no "pre-parsing as a list of box-shadow" will be made for me, but I am still free to write code based on that assumption and modify the raw data.

You, writing the entire page and never making mistakes in your code, are not the only audience we're optimizing towards. We should worry about multi-person teams, or you interacting with older code you wrote and no longer fully remember, etc. And in these situations, making assumptions about how an anonymous token will be used is dangerous and error-prone. And, as I keep pointing out, at absolute best, we can still only offer a useful representation for a handful of values (mostly just numbers and dimensions). For a lot of other values we can't do anything more useful than a string, and that's dumb - dealing with colors in arbitrary formats is just as annoying as having to parse CSS directly. Optimizing for this situation is throwing good money after bad.

On the other hand, just adding another var for the spread value, giving it a type, and then manipulating that is (a) self-documenting, (b) way easier (property access, rather than diving deep into a data structure), and (c) actually gives you a proper Typed OM value, regardless of type - we can tell for certain whether something is a length/color/etc and give you the right objects.

This would provide a solution to my first use case, but it looks more difficult to me than you seem to think it is. I might be wrong, though...

Most grammars can decompose based purely on type information, without having to introspect on the actual values. (Check out your browser's parsing code sometime!) It's not trivial to implement, but it's not hard either; mostly it'll be slogging thru duplication, or else making our property parsers smarter in general so we don't have to duplicate things.

FremyCompany commented 8 years ago

I don't like making an api less useful than it could be just because someone hypothetically might get it wrong, but the "external variable" for value-customization points + "accept to parse values with variable references" counter-proposal you made would solve my main problem with the current situation. So if you feel more willing to pursue this approach than mine, that is an acceptable resolution to this issue.

My general parsing problem could technically be solved by a general-purpose parsing api like css-parse, but I am sad I won't be able to feed the result of this parsing directly into the typed om and will need to convert those things back to strings to apply them on the dom. That means my polyfills will not be able to take advantage of the efficiency of TypedOM when I have to support shorthand properties, but okay, that's not the end of the world.

tabatkins commented 8 years ago

I'm not arguing to make something less useful, I'm arguing that it's not worth adding complexity for a very questionable use-case we don't want to encourage when there's a much better alternative. This is a standard trade-off we make all over the platform; we could add complexity to a lot of things for all sorts of reasons, but we have to carefully budget it out.

FremyCompany commented 8 years ago

[EDIT] Updated header post to include your proposal, to save people some time if they read this now; feel free to edit that part if you want to clarify something.

FremyCompany commented 8 years ago

I think we should leave this issue open until TypedOM 2 has some text for your proposal, just to track progress. No need for any input of mine, I'm ok with the intended future resolution; but as long as the change hasn't been made, the issue is still open.

w3c / css-houdini-drafts