[Meta] Do functions/transforms happen before tokens.json (i.e. to generate it)? Or within tokens.json?

In reviewing a lot of discussion around the DTCG format, one of the biggest continuing debates that seems to pop up in about every suggestion or discussion is the fundamental nature of the JSON file. Is the DTCG spec…

Static token values (that have been generated from some previous GUI/design tool)?
Or is it A set of formulae that must be computed to get the final token values?

Many discussions have their authors taking the position of “static values” or “formulae” (terms I’m inventing just for the sake of discussion). The different approaches by contributors tend to result in different solutions / different proposals / different terminology that has led to some confusion and delay in consolidating on approaches. To give some concrete examples:

The Token Operations proposal (#224) seem to assume DTCG should be formulae
The Color modifiers proposal #88 also seem to suggest formulae
The Modes proposal (#210) operates off the assumption of static values (assuming the values are generated from Figma), and the large discussion it sparked resulted in both formulae supporters and static supporters) both offering responses/proposals to the original concept.

Both the static and formulaic approach are 2 proposals to the same solution: generating a large set of design tokens quickly. But I wanted to write down some thoughts to distinguish the two approaches in the hopes that we could figure out which of the two the DTCG spec wants to follow.

DTCG as Static values

The static values approach takes the philosophy that what’s in the JSON are the final values. Aliases are still allowed of course, but no additional work is needed to figure out the values (i.e. no operations or modifiers are allowed).

Pros

Computationally Efficient: no work is needed to determine token values
Statically-analyzable: Makes token linting possible such as a11y contrast checks (which require static values to work)
Portable/universal: Interop is simple because it’s read-only
Follows prior art: Borrows best practices from existing schema formats like OpenAPI and JSONSchema
Flexible: because all values are static, values can perfectly replicate any system (even imperfect ones)

Cons

Dependent on external tooling: this format can’t be generated directly and must be an output of another tool (which may already be a shared assumption, but still worth calling out)
(Possibly) Inconsistent: some values may have typos or errors (which is a reality of other schema formats like OpenAPI, but still worth calling out)
Difficult to maintain: changing one part of the central design system rules/goals could cascade to all values needing to be regenerated

DTCG as Formulae

The formulaic values approach allows for a few manual configurations to generate an entire design system with ease. It enforces consistency but comes at the cost of a DSL of sorts to work.

Pros

Powerful: an entire design system can be built out of just a few key values (e.g. @ilikescience’s “Crafting colors with functions” idea)
Low maintenance: Only small tweaks are needed to change the entire system, and with little noise
Consistent: values always follow a logical pattern, and are internally consistent/coherent with one another

Cons

Computationally (more) expensive: Since each token can’t be statically-read, work has to be done to compute the values (“run” the functions)
Mentally complex: As some proposals have outlined, calculating the final values of tokens may be too complex and requires flowcharts and large systems to even follow mentally what’s happening
Inflexible: If all values are the result of a function, overrides/one-off values become more difficult/impossible to produce
Manually-written: Can’t be generated by an external tool because itself is a DSL meant to generate values.

Evaluating the two

If that provided enough of a distinction between the two approaches, along with how to classify proposals as one or the other, we should probably ask “which direction should the DTCG spec go in?” And my 2¢ is the major missing piece is there is no current design tool that can generate the DTCG spec in its entirety (probably closest is Figma Variables + TokensBrücke export) and a lot of bets are placed on “will that tool ever exist?”

If your answer is “no, DTCG should be written manually,” the natural inclination is to take a programmer’s approach and try and express formulae for generating values to save time. If your answer is “yes, DTCG should generally be the output of another tool”—either out of optimism, or you have just built your own custom tools to supplement your design systems—then a DTCG JSON with all-static values is all you need.

To show my cards, I do fall in the “DTCG should be static values” camp as someone that works with OpenAPI/JSONSchema a lot and sees the JSON format’s strengths in describing static data (and am just used to JSON being the output of another system, generally).

And lastly, the dichotomy is not “are functions/formula needed for a design system” because I think the answer is a universal, resounding YES from everyone. It’s more a matter of whether they fall outside the boundary of the DTCG spec or not (i.e. is DTCG JSON acting as the generator? or the generatee of something else?).

Would love additional thoughts on whether a) outlining this principle as part of the DTCG goals is worthwhile, and b) any additional criteria/considerations/input on navigating the two approaches!

Might have some bias as the person behind the modern Token Operations approach.

I'm not sure if you've carefully crafted the term to describe non-static as "formulae" but I think it might potentially leave out a feature of the spec; aliasing.

In a purely static ecosystem, a system which reads tokens can always traverse to a token and knows once it gets there that there is no further work to do. The value there is precisely sent upstream with no further work.

In an ecosystem where the thing at the end of the traversal says "there's more work to do" causes the system to have some level of dynamism; even if trivial. I'd argue aliasing a token is the most simple form of computing a token.

So in my mind, if we want to allow for aliasing, then we are opening the door for other types of dynamic value generation. Meanwhile, if we are interested in static representations, that would mean that once the traversal is done to a particular token there is no more "work" to be done by token distribution tools; WYSIWYG.

I've spoken to @kaelig about Token Operations before, and he is admittedly not convinced that formulae is appropriate in the spec. It is truly a matter of where the responsibility lies. The question is, are we aiming to define how people are meant to curate a token file OR are we aiming to define how systems are meant to read the file? If this is system based and we don't expect people to be writing these files, then I believe a static file is more appropriate. However, if we believe people will be authoring these files, formulaic niceties like aliasing and others will improve the DX of authoring.

I came to Token Operations with the assumption that people would be in these files in lieu of waiting for tools to support the curation process. The concept of Token Operations (or other formulae) can still exist outside of the spec in order to make the process of curating thousands of token values more manageable. I find value not just in standardizing the format of the file, but also standardizing how dynamic values could be resolved. I do recognize that this value might be out of scope for the specification.

It is truly a matter of where the responsibility lies. The question is, are we aiming to define how people are meant to curate a token file OR are we aiming to define how systems are meant to read the file? If this is system based and we don't expect people to be writing these files, then I believe a static file is more appropriate. However, if we believe people will be authoring these files, formulaic niceties like aliasing and others will improve the DX of authoring.

100% agreed! And yes I think we’re thinking about it the same way—where in the process does DTCG fit? If earlier, non-static, if later, static.

I'd argue aliasing a token is the most simple form of computing a token.

I see what you mean. But I’ve still included aliasing as a given for static (or non-static) because it does require “work” as you said, but the work is trivial enough to be negligible. Also borrowing from prior art of JSONSchema, OpenAPI, YAML—all have the same concept of aliasing and are largely static by design.

But I think that’s the cutoff—beyond that, operations jump into more complex territory. “Static analysis” is a bit of a slippery term when you really peer into it, taking TypeScript as an example. You actually can start to do simple “operations,” and TypeScript will follow up to a point. But very quickly it will get in too deep (e.g. any loop) and won’t be able to statically-analyze on its own anymore. Where that line is between “static” and “non-static” can blur, but I’d say as a general rule, anything more than a single pass/step is probably non-static (and maybe by that definition, some simpler operations are “static” and that’d be OK! But layers upon layers would not be).

So by that logic, if JSON’s strengths are in describing static data, and we want something more programmatic, I’d probably advocate for outlining a DSL that can describe operations more fluidly, with less indirection. Since that’s basically what’s happening anyway, it would be less-indirect, and ideally more user-friendly. JSON is always good for describing a language’s AST, but not for authoring by hand. And I worry a hybrid approach of a “looks like JSON, operates like a programming language” isn’t a very user-friendly design.

I’d say as a general rule, anything more than a single pass/step is probably non-static (and maybe by that definition, some simpler operations are “static” and that’d be OK! But layers upon layers would not be).

I think it's a tough sell for me personally. But I'm also known to be more polarizing in my decisions. 😈

So by that logic, if JSON’s strengths are in describing static data, and we want something more programmatic, I’d probably advocate for outlining a DSL that can describe operations more fluidly, with less indirection.

Probably more appropriate for the Token Operations discussions specifically but yes, I'm not trying to tout myself as knowing the first thing about language authoring. But I do want to raise that if we do compute tokens that we shouldn't try to define specific computations like what alpha is but instead have a method to describe how to alpha so we aren't waiting for tools to implement alpha. The community is already waiting for the simplest parts of the spec to be finalized and I can only imagine how long it would take to agree on what alpha is meant to do.

If transformations are specified within tokens.json, there will always need to be another specified format that is the result of applying those transformations; it's really important to define the end-of-the-line format (ie, the static, compiled result) first, and then work backwards. What we have so far in the spec does a good job of that, and the sticky questions we're working through (eg theming) are really around what kinds of functional/dynamic behavior we want to introduce and how.

There's no salient place to draw a line. We already dip our toes in the water with aliasing, as you've pointed out. Aliases can be resolved to produce "compiled" tokens file, which according to the spec would be interpreted exactly the same as a the file with aliases. That's the kind of behavior that I'd expect out of any functional additions; it's essentially compression, making the file easier to work with and transfer.

No matter what functional syntax we allow in the file, a functional tokens file should always be compilable to a valid static tokens file, in a tightly specified way (ie there is a "correct" reference implementation). That way, a tokens file can be "unpacked" into a usable static file at runtime, making read operations significantly more efficient than if each one required an interpretation step.

design-tokens / community-group