Fix for long axis label text wrap

jhofman commented 2 years ago

Sometimes we end up with long auto-generated y-axis labels, for instance mean(total_bill_length_mm).

When this exceeds the length of the side of a hacked facet we get weird overlaps and repetitions.

Let's think about how to solve this. SVG text wrapping isn't particularly easy, but there are solutions. We may need to rewrite the axis label with some whitespace, though. Otherwise we'll either get no break or a weird break, like mean(total_bill\n_length_mm).

One option would be to write mean of total_bill_length_mm and another would be to replace underscores with spaces or something like that.

willdebras commented 2 years ago

Couple potential options:

We add a character cutoff with overflow handling, i.e. mean(total_bill_length_mm) could end up rendered as mean(total_bill ...
We use foreignObject or just split the text across multiple tspan elements.
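
For reference, the cutoff option could be sketched roughly as below. This is only an illustration in Python, not code from the datamations repo: the function name `truncate_label`, the 15-character cutoff, and the " ..." suffix are all assumptions chosen to reproduce the example above.

```python
def truncate_label(label: str, max_len: int = 15, suffix: str = " ...") -> str:
    """Cut a label at max_len characters and mark the overflow with a suffix.

    max_len and suffix are illustrative defaults, not values from the repo.
    """
    if len(label) <= max_len:
        # Short labels are returned unchanged.
        return label
    # Drop any whitespace left at the cut point before adding the suffix.
    return label[:max_len].rstrip() + suffix


print(truncate_label("mean(total_bill_length_mm)"))  # mean(total_bill ...
```

The downside of this option is that the tail of the variable name is lost, which is why splitting across lines may be preferable.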

Either way we could have some synergistic handling here, where we process these strings on the R side.

I could build it into the R backend to split labels into an array in the specs @giorgi-ghviniashvili receives, then he can just iteratively render them as tspan elements.

I think with some basic string manipulation we can prioritize splitting on words at a certain character count, but if a word exceeds that character count it would get split mid-word. This would basically mimic word breaking in a text editor/processor, but could be preprocessed into an array for the frontend.
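
As a rough sketch of that word-breaking idea (in Python rather than R, purely for illustration), the standard library's `textwrap.wrap` already behaves this way: it breaks on whitespace first and only splits inside a word when the word alone is longer than the width. The helper name `wrap_label` and the width of 15 are assumptions, and the label is assumed to already contain spaces.

```python
import textwrap


def wrap_label(label: str, width: int = 15) -> list[str]:
    """Split a label into lines of at most `width` characters.

    Breaks on whitespace where possible; a single word longer than
    `width` is split mid-word (break_long_words=True).
    """
    return textwrap.wrap(label, width=width, break_long_words=True)


print(wrap_label("mean of total bill length mm"))
# ['mean of total', 'bill length mm']
print(wrap_label("extralongvariablename"))
# ['extralongvariab', 'lename']
```

The resulting list is the kind of array that could be handed to the frontend, one string per rendered line.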

We can discuss next week when we are all on a call.

giorgi-ghviniashvili commented 2 years ago

@willdebras when it comes to wrapping, foreignObject is always a better idea than tspan. We can try foreignObject first. Hopefully Gemini will animate that successfully, but if not, we will need to use tspans. As a next step, let's drop in an example with longer labels, and then I will try to implement foreignObject. We don't need an array of texts for foreignObject, but we will probably need one for tspan.

willdebras commented 2 years ago

Makes sense. Here is an example of specs with labeling cutoffs:

https://github.com/microsoft/datamations/blob/custom_animations/sandbox/labeling/long-label-specs-R.json

giorgi-ghviniashvili commented 2 years ago

@willdebras @jhofman good news: we don't need any special logic to wrap things, and we don't need foreignObject either. Vega-Lite handles this itself. If we need multiple lines, we should pass the title as an array of strings in the spec, and Vega-Lite automatically renders each array element on its own line.

So @willdebras, could you please send the title as an array of strings?

willdebras commented 2 years ago

This is great! I'll build in the handling and send some test specs later today for arrayed titles for your testing tomorrow.

willdebras commented 2 years ago

The handling I am currently building in will split variable names on multiple conditions:

certain delimiters, i.e. underscores which typically are used in variable names (maybe also periods though they are unconventional in R?)
character length, i.e. if a non delimited string is simply too long it will split

Does this seem sensible? For example, something like

extra_long_reallylongsingleword might get delimited to an array that looks like ['extra', 'long', 'reallylongsingl', 'eword']

Each term in the list has a max character count of 15. The function splits on those delimiters first, then checks that no term is still too long and splits any that are. It also has some functionality to ensure the trailing letters left over from a split aren't so short that they look goofy.
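
To make the approach concrete, here is a minimal sketch of the logic in Python (the real handling lives in the R backend, so this is not the actual implementation). The name `split_label`, the delimiter set, and the `min_tail` threshold of 3 characters are assumptions; only the max length of 15 comes from the description above.

```python
import re


def split_label(name: str, max_len: int = 15, min_tail: int = 3,
                delimiters: str = "_.") -> list[str]:
    """Split a variable name on delimiters, then cap each piece at max_len.

    min_tail (an assumed value) keeps the leftover fragment of a
    mid-word split from being too short by cutting a little earlier.
    """
    pattern = "[" + re.escape(delimiters) + "]"
    parts = [p for p in re.split(pattern, name) if p]  # drop empty pieces
    terms = []
    for part in parts:
        while len(part) > max_len:
            cut = max_len
            if len(part) - cut < min_tail:
                # Move the cut left so the tail has at least min_tail chars.
                cut = len(part) - min_tail
            terms.append(part[:cut])
            part = part[cut:]
        terms.append(part)
    return terms


print(split_label("extra_long_reallylongsingleword"))
# ['extra', 'long', 'reallylongsingl', 'eword']
```

With these assumed defaults, a 16-character word is split 13 + 3 rather than 15 + 1, which is one way to avoid a stray single letter on the last line.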

I will send specs a little later today, but wanted to document the basic approach I am taking.

willdebras commented 2 years ago

https://github.com/microsoft/datamations/blob/title-arrays/sandbox/labeling/long-label-specs-split-R.json

Here is a set of specs with the y title passed as an array in that summarize step:

        "y": {
          "field": "datamations_y",
          "type": "quantitative",
          "title": ["median", "of", "long", "extralongvariab", "lename", "that", "is", "long"],
          "scale": {
            "domain": [36.8, 50.95]
          }
        },

I'll implement it in the groupby as well shortly, but should let you test this out.

giorgi-ghviniashvili commented 2 years ago

@willdebras I rendered it, and each array element ends up on its own line.

I think it is better to combine "of", "long", and other small words together.

willdebras commented 2 years ago

Ah, I see. It wraps on every item of the array. Okay, I will adjust the approach today so these shorter terms are combined into single strings.
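
One possible way to do that combining, sketched in Python for illustration only: greedily append each term to the current line while the joined string still fits the character limit. The name `combine_short_terms` is hypothetical, joining with a single space is an assumption, and the limit of 15 is carried over from the earlier comment.

```python
def combine_short_terms(terms: list[str], max_len: int = 15) -> list[str]:
    """Greedily merge adjacent terms into lines of at most max_len characters."""
    lines: list[str] = []
    for term in terms:
        # +1 accounts for the space used to join the term to the line.
        if lines and len(lines[-1]) + 1 + len(term) <= max_len:
            lines[-1] = lines[-1] + " " + term
        else:
            lines.append(term)
    return lines


title = ["median", "of", "long", "extralongvariab", "lename", "that", "is", "long"]
print(combine_short_terms(title))
# ['median of long', 'extralongvariab', 'lename that is', 'long']
```

Note that this simple version joins a mid-word fragment such as "lename" to the next word with a space, so the fragment handling may need more care in the real implementation.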

jhofman commented 2 years ago

@willdebras can you merge this as well?

microsoft / datamations

Fix for long axis label text wrap #139