design-tokens / community-group

This is the official DTCG repository for the design tokens specification.
https://tr.designtokens.org

Suggestion: colorList type for DataViz #228

Open drwpow opened 1 year ago

drwpow commented 1 year ago

Proposal

I’d love to see a new composite type that essentially groups an array of colors that can be useful for categorical data visualization, e.g.:

{
  "color": {
    "category10": {
      "$type": "colorList",
      "$value": [
        { "$value": "#1f77b4" },
        { "$value": "#ff7f0e" },
        { "$value": "#2ca02c" },
        { "$value": "#d62728" },
        { "$value": "#9467bd" },
        { "$value": "#8c564b" },
        { "$value": "#e377c2" },
        { "$value": "#7f7f7f" },
        { "$value": "#bcbd22" },
        { "$value": "#17becf" }
      ]
    }
  }
}

Example: all of D3’s categorical schemes are colorLists


Definition

The defining factors for a $type: "colorList" are:

  1. I need an ordered list of colors where the ordering matters (e.g. one color is “first”, one is “second”, and the ordering should be preserved)
  2. I also will not ever pull one and only one color from the palette (e.g. I won’t ever grab the third color; I will always grab the entire set and loop through them in order)
  3. The collection does need to be a token itself as the palette as a whole does not work without all the colors
  4. However, the values DO NOT necessarily relate to one another, and colors could be added/removed/reordered without being a “breaking change” (i.e. they are not a color ramp)
  5. Further, these colors DO NOT form a gradient as you can’t interpolate between them (sidenote: data visualization does have the idea of “gradient” palettes like Turbo, but that’s a different use case entirely, and often those visualizations need to calculate intermediary values, which suggests they are really just $type: "gradient" tokens)
  6. Lastly, using a traditional token group (an object with keys) isn’t sufficient because there’s no standard way to preserve ordering (colors need to be pulled in a specific, deterministic order) and length (I need to know how many colors are in this palette, which isn’t possible in a group as there may be subgroups)

The main use case is categorical data visualization, such as colors for bar graphs, line charts, pie charts, sankey diagrams, and the dozens of other chart types that rely on categorical coloring.
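
As a rough illustration of points 1 and 2 above (the whole list is consumed in order, never one color picked by name), here is a minimal TypeScript sketch. The helper name `assignCategoryColors` is hypothetical, and wrapping around when there are more categories than colors is an assumption of this sketch, not something the proposal specifies.

```typescript
// Hypothetical helper: maps each category to a palette color, in list order.
function assignCategoryColors(
  categories: string[],
  palette: string[],
): Map<string, string> {
  if (palette.length === 0) {
    throw new Error("palette must contain at least one color");
  }
  const assignments = new Map<string, string>();
  categories.forEach((category, index) => {
    // Assumption: wrap around when there are more categories than colors.
    assignments.set(category, palette[index % palette.length]);
  });
  return assignments;
}

// First three colors of the category10 example above.
const category10 = ["#1f77b4", "#ff7f0e", "#2ca02c"];
const assigned = assignCategoryColors(
  ["apples", "pears", "plums", "figs"],
  category10,
);
console.log(assigned.get("pears")); // "#ff7f0e"
console.log(assigned.get("figs")); // "#1f77b4" (wraps back to the first color)
```

The consumer only needs the list's order and its length, which is exactly what point 6 says a plain group does not guarantee.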

I believe data visualization palettes should be part of a design system and therefore be tokenized. The New York Times, FiveThirtyEight, the Washington Post, and others all have well-established principles of dataviz that are central to their design system. It makes sense that a token specification should be able to preserve this.

Naming

I do not care about the name colorList. I was only trying to avoid using the term “group” as that has other meanings in this spec. “Palette” is also too generic, and could refer to any arbitrary grouping. “List,” however, is a generic computing term for an ordered collection of items. “Array” would work, but I feel that’s less natural to say, and often doesn’t mean anything to non-programmers.

Open to suggestions / alternate names here.

Distinction

This is probably not a good fit for color ramps as those colors are usually pulled individually, and follow specific naming patterns unique to that design system (e.g. you’d want to refer to color.blue.900 and not “the seventh color in this array list”).

This should also not be used for gradient-like color lists. Whereas gradients allow implicit interpolation between the stops, a colorList allows NO interpolation between adjacent colors. Each color is specifically designed to be unique, and does not bear a clear relationship with its sibling colors (other than being visually distinct).

Alternative Approaches

A “hack” that is workable today is having a recommended way to set groups as arrays, e.g.:

{
  "color": {
    "category10": {
      "0": { "$type": "color", "$value": "#1f77b4" },
      "1": { "$type": "color", "$value": "#ff7f0e" },
      …
    }
  }
}

However, I would not be a fan of this approach because:

Obligatory Caveat

Apologies if this discussion has happened in another thread, or if someone has suggested something different / conflicting. Open to feedback / combining with another ongoing discussion I may have missed.

TravisSpomer commented 1 year ago

> ❌ At least in JavaScript-land, objects are not meant to preserve ordering (0 appearing before 1 is never guaranteed, and it would have to be on the tool itself to manually sort each time, which isn’t clear from the spec it should do).

I don't have any feedback on the dataviz scenarios themselves, but JavaScript does explicitly define iteration order now and it's supported everywhere. Strings that contain integers in numerical order, then all other strings in insertion order, then Symbols.
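
The ordering rule described here can be checked directly. This small TypeScript sketch uses a made-up object whose keys are deliberately written out of order; the variable names are illustrative only.

```typescript
// Keys are deliberately written out of order.
const group: Record<string, string> = {
  "10": "#17becf",
  "b": "first non-integer key",
  "1": "#ff7f0e",
  "a": "second non-integer key",
  "0": "#1f77b4",
};

// Integer-like keys come first in ascending numeric order,
// then the remaining string keys in insertion order.
const keys = Object.keys(group);
console.log(keys); // ["0", "1", "10", "b", "a"]

// The same rule applies to objects produced by JSON.parse.
const parsedKeys = Object.keys(JSON.parse('{"2":"c","0":"a","1":"b"}'));
console.log(parsedKeys); // ["0", "1", "2"]
```

Note that this covers JavaScript only. Zero-padded keys such as "01" are not treated as integer keys and fall back to insertion order, and tools written in other languages may order object keys differently.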

ddamato commented 1 year ago

I'm partial to the simplicity of the current specification, however, this made me consider something beyond what you are asking. Why not just allow the $value of any type to allow a "list" (an Array)?

{
  "dataviz": {
    "$type": "color",
    "$value": [ ... ]
  },
  "typography-scale": {
    "$type": "dimension",
    "$value": [ ... ]
  }
}

A difficulty here is that there's often a separation between token author and token user. While the token author knows that this kind of token is expected as a list (accessible by index), the token user might not know, as they usually only apply tokens by name. Having an additional accessor past the dot-notation for some tokens may be easily missed in practice for these "special" tokens, so that no value is applied at all. I know the purpose of these tokens isn't for someone to apply individually, but programmatically. However, I don't know if that's enough of a case for them to be different from the rest of the spec.

Also, needing to know how many items exist is a challenge, since some lists could have different lengths. This is especially true in the case of the D3 example. In a CSS rendered world, where tokens are often finally written as var, the dataviz component would have no idea how many of them are available (is it 4, or 20?) between themes unless you have some socially accepted guard (read on).

I think the reasons above make it challenging for any list type to exist.[^1] In the system we use today, we've explicitly set out for 30 colors across each theme which must all be defined. If they need to loop (meaning one theme only has 6 colors, they are applied 5 times), then so be it. Not listed as an array but as separate names (like all other tokens) such as dataviz.order7. In the dataviz component, we know 30 is the max and cyclically render by name for each chart part.
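
A minimal TypeScript sketch of this fixed-slot approach, assuming the slots are named order1 through order30 (only dataviz.order7 appears above, so the 1-based naming is a guess) and that a shorter theme simply repeats its colors. The helper name `fillSlots` is hypothetical.

```typescript
// Hypothetical helper: expands a theme's colors into a fixed number of
// named slots (order1, order2, ...), repeating the colors cyclically.
function fillSlots(
  themeColors: string[],
  slotCount: number,
): Record<string, string> {
  if (themeColors.length === 0) {
    throw new Error("a theme needs at least one color");
  }
  const slots: Record<string, string> = {};
  for (let i = 0; i < slotCount; i++) {
    // Assumption: slot names are 1-based, e.g. order7 is the seventh slot.
    slots[`order${i + 1}`] = themeColors[i % themeColors.length];
  }
  return slots;
}

// A theme with only 6 colors still defines all 30 names (each color 5 times).
const sixColorTheme = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd", "#8c564b"];
const dataviz = fillSlots(sixColorTheme, 30);
console.log(Object.keys(dataviz).length); // 30
console.log(dataviz["order7"]); // "#1f77b4" (the seventh slot repeats the first color)
```

Because every theme defines the same 30 names, a component can always reference a slot by name without knowing how many distinct colors a given theme really has.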

Fundamentally, each token is meant to represent the eventuality of a single value. It's why the "$type": "shadow" token is a composite: all of the parts are meant to combine into a single final value. The idea of a list is the opposite; a single token meant to provide multiple values. Full disclosure, I also don't enjoy composite tokens as a concept due to a lack of simplicity.

I'll mention that this family of colors is not only helpful for charting, but any place where color is meant as a differentiator between elements of the same group (like avatar defaults). I've called them "figure" colors in the past, meant to color shapes. This is the most challenging set to name because it's not possible to give them semantic meaning past their group. These are effectively base colors that expect to be applied directly in the UI. I've experimented with going so far as coloring illustrations this way (ie., color-by-number) which takes an extra amount of focus and coordination between themes to get right.

[^1]: I know fontFamily can be a list but it's strategically not like any other value in the way it requires additional resources to be useful. The token user placing the token would just render the resulting string, while systems must read as individual resources to load.

c1rrus commented 1 year ago

Interesting discussion. I can certainly see the value in being able to describe a list of colors. Furthermore, if we were to go down that route I do like the idea of generalizing it so that you could have lists of any kind of token.

However, a problem with @ddamato's suggestion of just allowing any $value to be an array is that for some types a singular token value might already be a JSON array. For example, fontFamily tokens may have array values (which are meant to be interpreted as a font stack from most to least preferred font family name). Similarly, gradient token values are arrays of objects representing the color stops.

So, how would a tool reliably differentiate between an ordered set of individual font family names, versus a single font stack? Perhaps not a very real-life example, but hopefully you get the idea.

At first, I was thinking an alternative might be to introduce a new $list (or perhaps $valueList, or just $values?) property which tokens can have instead of a $value. Its value would always be a JSON array of valid token values for that type. For example:

{
  "color": {
    "category10": {
      "$type": "color",
      "$list": [
        "#1f77b4",
        "#ff7f0e",
        "#2ca02c",
        "#d62728",
        "#9467bd",
        "#8c564b",
        "#e377c2",
        "{reference.to.some.other.color.token.why.not}",
        "#bcbd22",
        "#17becf"
      ]
    }
  }
}

But, an immediate issue with that is that it quickly breaks down when you consider what happens if this list token is (accidentally?) referenced from another token which is expecting a singular value.

{
  "color": {
    "category10": {
      "$type": "color",
      "$list": [
        "#1f77b4",
        "#ff7f0e",
        "#2ca02c",
        "#d62728",
        "#9467bd",
        "#8c564b",
        "#e377c2",
        "{reference.to.some.other.color.token.why.not}",
        "#bcbd22",
        "#17becf"
      ]
    }
  },
  "gradient": {
    "rainbow": {
      "$type": "gradient",
      "$value": [
        {
          "position": 0,
          "color": "#000000"
        },
        {
          "position": 1,
          "color": "{color.category10}" // It's a color token, Jim, but not as we know it
        }
      ]
    }
  }
}

Is that an error? Is there some convention like tools should just pluck out the first value from the array? Neither of those feels particularly intuitive.
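
To make the first option concrete, here is a TypeScript sketch of a resolver that treats such a reference as an error. Everything in it is an assumption for illustration: the `SketchToken` shape, the flat map keyed by dotted path, and the function name `resolveSingleColor`. It is not how any existing tool is known to behave.

```typescript
// Hypothetical token shape for this sketch: a token holds either a single
// $value or, per the $list idea above, an array of values.
type SketchToken = { $type: string; $value?: string; $list?: string[] };

// Assumption: tokens are stored in a flat map keyed by their dotted path.
function resolveSingleColor(
  tokens: Record<string, SketchToken>,
  alias: string,
): string {
  const path = alias.slice(1, -1); // strip the surrounding { }
  const token = tokens[path];
  if (token === undefined) {
    throw new Error(`Unknown token: ${path}`);
  }
  if (token.$list !== undefined) {
    // The "is that an error?" option: refuse rather than guess an item.
    throw new Error(`"${path}" is a list token; a single value was expected`);
  }
  if (token.$value === undefined) {
    throw new Error(`"${path}" has no $value`);
  }
  return token.$value;
}

const sketchTokens: Record<string, SketchToken> = {
  "color.black": { $type: "color", $value: "#000000" },
  "color.category10": { $type: "color", $list: ["#1f77b4", "#ff7f0e"] },
};

console.log(resolveSingleColor(sketchTokens, "{color.black}")); // "#000000"
try {
  resolveSingleColor(sketchTokens, "{color.category10}");
} catch (error) {
  console.log((error as Error).message);
}
```

The alternative convention mentioned above (silently taking the first item) would replace the second check with `return token.$list[0]`, which avoids the error but hides the mismatch from the token author.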

Maybe I'm overthinking things and we should just add a new colorList type along the lines outlined in the OP. There's a clear use-case for it (the DS where I work also has categorical palettes, so I can see the value in it). If a similar need arises for other things - e.g. someone needs a space list or whatever - then we add new types as needed.

ddamato commented 1 year ago

> However, a problem with @ddamato's suggestion of just allowing any $value to be an array is that for some types a singular token value might already be a JSON array. For example, fontFamily tokens may have array values (which are meant to be interpreted as a font stack from most to least preferred font family name). Similarly, gradient token values are arrays of objects representing the color stops.

For clarity, my post isn't a suggestion but a thought experiment that eventually lands on the reason why I don't think these should be arrays: the accessor to get to these items would be variable, making them difficult to target, especially across themes which might have different lists. In my mind, it's easy to ensure the same token names exist across files, thereby ensuring that a token name can always be used. It'll be much harder when the accessor is potentially varied across files.

// theme1.json
tokens.colors.red = #ff0000
tokens.colors.dataviz[5] = #271f24

// theme2.json
tokens.colors.red = #550000
tokens.colors.dataviz[5] = undefined
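
The name-based check described above ("ensure the same token names exist across files") could look roughly like this in TypeScript. The flattened theme objects and the helper name `findMissingTokens` are assumptions made for the sketch.

```typescript
// Hypothetical helper: lists token names present in one theme but absent
// from another. Assumes each theme is flattened to name -> value.
function findMissingTokens(
  reference: Record<string, string>,
  other: Record<string, string>,
): string[] {
  return Object.keys(reference).filter(
    (name) => !Object.prototype.hasOwnProperty.call(other, name),
  );
}

const theme1: Record<string, string> = {
  "colors.red": "#ff0000",
  "colors.dataviz.5": "#271f24",
};
const theme2: Record<string, string> = {
  "colors.red": "#550000",
};

console.log(findMissingTokens(theme1, theme2)); // ["colors.dataviz.5"]
console.log(findMissingTokens(theme2, theme1)); // []
```

With named tokens, a gap between themes shows up as a missing name. With list values, the same check would also have to compare list lengths per token, which is the extra variability the comment is warning about.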

drwpow commented 12 months ago

After chewing on the question “should any value accept an array,” I originally wanted the answer to be “yes,” but now I’m not convinced it’s possible. In some tokens, order is significant (font family fallbacks, color ramps); in other tokens it wouldn’t be (shadow arrays likely could be combined in any order). Further, some tokens wouldn’t have a clear use (what would an array of cubic béziers even be used for?). Because different token types are used in wildly different ways, those differing concerns would probably warrant new token types altogether, with additional metadata on how each is used.

The design of groups is flexible enough that, most of the time, a group plus a short description of how it’s organized is just fine. Maybe colorList isn’t even a necessary token type, and is just a recommended pattern for how a group could be used (that’s how I’m getting by with it now!). But again, I’m also slightly against the “what if any token accepted an array of values” thought experiment.
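
One way the "group as a recommended pattern" idea could be consumed is sketched below in TypeScript: a tool reads a group whose keys are "0", "1", … and sorts them explicitly, so it does not depend on any runtime's key ordering. The `ColorToken` type and the helper name `groupToOrderedColors` are hypothetical, and ignoring non-integer keys is an assumption of this sketch.

```typescript
// Hypothetical, simplified color token shape for this sketch.
type ColorToken = { $type: "color"; $value: string };

// Reads a group keyed by "0", "1", "2", ... into an ordered array of colors.
function groupToOrderedColors(group: Record<string, ColorToken>): string[] {
  return Object.keys(group)
    // Assumption: only canonical non-negative integer keys are list items.
    .filter((key) => /^(0|[1-9]\d*)$/.test(key))
    // Sort numerically so that "10" comes after "9", not after "1".
    .sort((a, b) => Number(a) - Number(b))
    .map((key) => group[key].$value);
}

const category10Group: Record<string, ColorToken> = {
  "1": { $type: "color", $value: "#ff7f0e" },
  "0": { $type: "color", $value: "#1f77b4" },
  "2": { $type: "color", $value: "#2ca02c" },
};

const ordered = groupToOrderedColors(category10Group);
console.log(ordered); // ["#1f77b4", "#ff7f0e", "#2ca02c"]
console.log(ordered.length); // 3
```

The resulting array gives both the order and the length that the original proposal asks for, at the cost of every tool having to agree on the key convention.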

ilikescience commented 11 months ago

Big +1 to having this use case be solved by token groups, not by a new type. And agreed with the suggestion to avoid arrays where we can; I think in some token types it's going to be useful to allow arrays, but generally key/value pairs are going to be much, much better.