wilkelab / ggtext

Improved text rendering support for ggplot2
https://wilkelab.org/ggtext/
GNU General Public License v2.0

Markdown elements are broken up by word in SVG output #95

Open jimjam-slam opened 1 year ago

jimjam-slam commented 1 year ago

This might be out of scope since it's specifically an interaction with {svglite} (and ofc because development is frozen on this), but I'll pop it in anyway for documentation purposes!

When element_markdown is applied to a text element, that element is broken up word-by-word when saved with {svglite}, even if there aren't any changes in formatting within the element.

That's not a problem visually, but it does make it more difficult to edit the SVG in programs like Illustrator, since you essentially have to delete and replace the text element.

Here's a reprex with a subtitle that is rendered as (1) plain text and (2) a rich text grob:

library(ggplot2)
library(svglite)
library(ggtext)

p1 <-
  ggplot(mtcars) +
  aes(x = mpg, y = hp) +
  geom_point() +
  theme_minimal() +
  labs(
    x = "X axis", y = "Y axis",
    title = "This is a test!",
    subtitle = paste(
      "Longer subtitles can help people understand the take-home message,",
      "even if they aren't able to understand the rest of the plot.",
      sep = "\n"))

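# Same plot, but the subtitle is now drawn as a rich text grob by ggtext.
# The "\n" no longer produces a line break (markdown would need <br>),
# which is why the test2.svg subtitle below sits on a single line.
p2 <- p1 + theme(plot.subtitle = element_markdown(face = "plain"))

# ggsave() uses the svglite device for .svg files
ggsave("test1.svg", p1)
ggsave("test2.svg", p2)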

The subtitle in test1.svg is:

<text x='38.16' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='340.65px' lengthAdjust='spacingAndGlyphs'>Longer subtitles can help people understand the take-home message,</text>
<text x='38.16' y='43.11' style='font-size: 11.00px; font-family: Arial;' textLength='274.82px' lengthAdjust='spacingAndGlyphs'>even if they aren't able to understand the rest of the plot.</text>

While in test2.svg it's:

<text x='38.16' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='34.26px' lengthAdjust='spacingAndGlyphs'>Longer</text>
<text x='75.48' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='40.35px' lengthAdjust='spacingAndGlyphs'>subtitles</text>
<text x='118.88' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='17.74px' lengthAdjust='spacingAndGlyphs'>can</text>
<text x='139.68' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='20.80px' lengthAdjust='spacingAndGlyphs'>help</text>
<text x='163.53' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='33.04px' lengthAdjust='spacingAndGlyphs'>people</text>
<text x='199.62' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='55.06px' lengthAdjust='spacingAndGlyphs'>understand</text>
<text x='257.73' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='15.29px' lengthAdjust='spacingAndGlyphs'>the</text>
<text x='276.08' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='51.98px' lengthAdjust='spacingAndGlyphs'>take-home</text>
<text x='331.12' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='47.70px' lengthAdjust='spacingAndGlyphs'>message,</text>
<text x='381.87' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='23.86px' lengthAdjust='spacingAndGlyphs'>even</text>
<text x='408.78' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='5.49px' lengthAdjust='spacingAndGlyphs'>if</text>
<text x='417.33' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='20.79px' lengthAdjust='spacingAndGlyphs'>they</text>
<text x='441.18' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='27.18px' lengthAdjust='spacingAndGlyphs'>aren't</text>
<text x='471.41' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='20.80px' lengthAdjust='spacingAndGlyphs'>able</text>
<text x='495.26' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='9.17px' lengthAdjust='spacingAndGlyphs'>to</text>
<text x='507.49' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='55.06px' lengthAdjust='spacingAndGlyphs'>understand</text>
<text x='565.60' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='15.29px' lengthAdjust='spacingAndGlyphs'>the</text>
<text x='583.95' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='18.34px' lengthAdjust='spacingAndGlyphs'>rest</text>
<text x='605.34' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='9.17px' lengthAdjust='spacingAndGlyphs'>of</text>
<text x='617.56' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='15.29px' lengthAdjust='spacingAndGlyphs'>the</text>
<text x='635.91' y='31.23' style='font-size: 11.00px; font-family: Arial;' textLength='20.79px' lengthAdjust='spacingAndGlyphs'>plot.</text>

(Apologies for the long block; <details> doesn't contain this well!)
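
For anyone wanting to check this on their own output, here is a minimal base-R sketch that counts `<text>` elements in SVG source. `count_text_elements` is a hypothetical helper (not part of {svglite} or {ggtext}), and the two sample vectors are abbreviated from the output above rather than read from the files:

```r
# Hypothetical helper: count opening <text> tags in a character vector of
# SVG source lines. The pattern does not match closing </text> tags.
count_text_elements <- function(svg_lines) {
  matches <- gregexpr("<text[ >]", svg_lines)
  # gregexpr() returns -1 for lines without a match, so count positions > 0
  sum(vapply(matches, function(m) sum(m > 0), integer(1)))
}

# Abbreviated samples based on the subtitle output shown above
plain_svg <- c(
  "<text x='38.16' y='31.23'>Longer subtitles can help people understand the take-home message,</text>",
  "<text x='38.16' y='43.11'>even if they aren't able to understand the rest of the plot.</text>"
)
markdown_svg <- c(
  "<text x='38.16' y='31.23'>Longer</text>",
  "<text x='75.48' y='31.23'>subtitles</text>",
  "<text x='118.88' y='31.23'>can</text>"
)

n_plain <- count_text_elements(plain_svg)
n_markdown <- count_text_elements(markdown_svg)
print(c(plain = n_plain, markdown = n_markdown))
```

On the real files, something like `count_text_elements(readLines("test2.svg"))` would also count the axis and title text, so it is more useful to compare the totals for the two files than to expect a particular number.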

mcanouil commented 1 year ago

I've seen this for a long time, and the worst part is that it messes up the text spacing/alignment. Edit: I think the issue is related to https://github.com/wilkelab/gridtext/issues/24, i.e., specifying the font face messing things up.

bwiernik commented 1 year ago

Can someone confirm that this is still an issue in the current CRAN versions of ggtext and gridtext?

mcanouil commented 1 year ago

I just reran some code of mine that sets a bold font face, now with gridtext 0.1.5 and ggtext 0.1.2 (I hadn't even realised there were updates, since there are no tags/releases on the GitHub repository) => no funky things! 👍

clauswilke commented 1 year ago

Yes, what this issue is about, breaking strings of text into words, is still the case and is not related to the recent R 4.2.0 bug. It is due to the algorithm gridtext uses to lay out text. It does it word by word. When I wrote the package this was basically the only way I could get this done.

A better algorithm would assemble strings letter by letter, calculating font metrics along the way. This would require the entire layout code to be rewritten. It also would have to be written in C or C++ to get acceptable speed. This is not on my todo list at this time. At some point I experimented with a layout engine in Rust, and I got this part to work, but again that should be seen as an experiment and not something I'm currently pushing forward. If anybody wants to take this on, I'm happy to provide pointers.
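
To make the word-by-word behaviour concrete, here is a simplified sketch (not gridtext's actual code) of placing each word as its own element. The widths are copied from the `textLength` attributes of the first five `<text>` elements in test2.svg above. `layout_words` is a hypothetical helper, and the space width is an assumption inferred from the gap between the first two elements (75.48 - 38.16 - 34.26 = 3.06), not read from font metrics:

```r
# Words and widths (px) taken from the test2.svg output earlier in this thread
words  <- c("Longer", "subtitles", "can", "help", "people")
widths <- c(34.26, 40.35, 17.74, 20.80, 33.04)

# Assumptions: starting x from the first <text> element, and a space width
# inferred from the gap between the first two elements
x_start <- 38.16
space_width <- 3.06

# Hypothetical helper: each word starts where the previous word ended,
# plus one space. Every word ends up as a separately positioned element.
layout_words <- function(widths, x_start, space_width) {
  advances <- widths + space_width
  x_start + c(0, cumsum(advances[-length(advances)]))
}

x <- layout_words(widths, x_start, space_width)
print(data.frame(word = words, x = round(x, 2), width = widths))
```

The computed positions land within a few hundredths of a pixel of the `x` attributes in test2.svg (the published widths are rounded), which is consistent with one positioned element per word and no merging of adjacent words that share a style.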

jimjam-slam commented 1 year ago

I should clarify that my ideal scenario for an SVG output for my own use would be a single <text> element that uses <tspan> for any markdown formatting. For example (I've omitted the textLength and lengthAdjust attributes in this example, although they're relevant):

Longer subtitles can help people **understand the take-home message,** even if they aren't able to understand the rest of the plot.

<text x='38.16' y='31.23' style='font-size: 11.00px; font-family: Arial;' lengthAdjust='spacingAndGlyphs'>
  Longer subtitles can help people <tspan style="font-weight: bold;">understand the take-home message,</tspan> even if they aren't able to understand the rest of the plot.
</text>

I'm not sure how feasible this is if gridtext is splitting words up into separate text grobs (I imagine you would need {svglite} to interpret a gTree or similar full of text grobs a certain way to handle the recombination). Is that a fair summary?
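
As a rough illustration of the recombination step, here is a base-R sketch that merges consecutive words sharing a style into runs and emits one `<text>` element with `<tspan>` for the bold run. Everything here is hypothetical: `merge_runs`, `escape_xml` and `to_svg_text` are not {svglite} or gridtext functions, the input data frame is made up, and positioning attributes such as `textLength` are ignored:

```r
# Hypothetical input: one row per word, as a word-by-word layout would yield
words <- data.frame(
  text = c("Longer", "subtitles", "understand", "the", "plot."),
  bold = c(FALSE, FALSE, TRUE, TRUE, FALSE)
)

# Merge consecutive words with the same style into a single run
merge_runs <- function(text, bold) {
  runs <- rle(bold)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  run_text <- mapply(
    function(s, e) paste(text[s:e], collapse = " "),
    starts, ends, USE.NAMES = FALSE
  )
  data.frame(text = run_text, bold = runs$values)
}

# Minimal escaping of XML special characters in text content
escape_xml <- function(x) {
  x <- gsub("&", "&amp;", x, fixed = TRUE)
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  gsub(">", "&gt;", x, fixed = TRUE)
}

# Build a single <text> element, wrapping bold runs in <tspan>
to_svg_text <- function(runs, x, y, style) {
  escaped <- escape_xml(runs$text)
  parts <- ifelse(
    runs$bold,
    sprintf('<tspan style="font-weight: bold;">%s</tspan>', escaped),
    escaped
  )
  sprintf("<text x='%.2f' y='%.2f' style='%s'>%s</text>",
          x, y, style, paste(parts, collapse = " "))
}

runs <- merge_runs(words$text, words$bold)
svg <- to_svg_text(runs, 38.16, 31.23, "font-size: 11.00px; font-family: Arial;")
cat(svg, "\n")
```

This only covers a single line of text with one style attribute. A real implementation would also have to handle line breaks, per-run font metrics and the alignment concerns raised earlier in the thread.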

I'm not sure I can contribute substantial work right now, but it's something we might be able to contribute effort to next year depending on how things go (I'm thinking ahead a bit on our own needs in filing this!). But it sounds like other folks might be more concerned about the alignment issues than the word splitting itself.