vercel / satori

Enlightened library to convert HTML and CSS to SVG
https://og-playground.vercel.app
Mozilla Public License 2.0

Update Emojis to include Unicode 15.0+ #621

Open anaclumos opened 3 months ago

anaclumos commented 3 months ago

Bug report

Description / Observed Behavior

What kind of issues did you encounter with Satori?

It doesn't render Unicode 15.0 emojis, such as 🪈

Vizards commented 2 months ago

I investigated this issue and found that it appears to be a bug in linebreak, not only a consequence of the default Emoji Providers lacking support for all Unicode 15 emojis.

I created a simple playground to demonstrate this more clearly:

Playground Preview

I found that the default wordBreak logic in src/utils.ts#L285 causes emoji sequences such as 🫸🏽 (a base emoji followed by a skin-tone modifier) to be segmented incorrectly:

  if (wordBreak === 'break-all') {
    return { words: segment(content, 'grapheme'), requiredBreaks: [] }
  }

  if (wordBreak === 'keep-all') {
    return { words: segment(content, 'word'), requiredBreaks: [] }
  }

  const breaker = new LineBreaker(content)

Intl.Segmenter is called to handle text segmentation only when wordBreak === 'break-all' or wordBreak === 'keep-all' is specified. When wordBreak is not specified, linebreak handles segmentation instead, and linebreak currently supports Unicode version 13. It splits 🫸🏽 into ['🫸', '🏽'], so Satori cannot render the emoji correctly.
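To illustrate the difference, here is a minimal sketch that compares a per-code-point split (which mirrors the incorrect ['🫸', '🏽'] result described above) with grapheme segmentation through Intl.Segmenter. The helper name segmentText and the local type declarations are assumptions made for this example; this is not Satori's own code, and it does not call linebreak itself.

```typescript
// Minimal local typing for Intl.Segmenter so the sketch does not depend on
// the ES2022 Intl lib being enabled in tsconfig.
type Granularity = 'grapheme' | 'word'
interface SegmenterLike {
  segment(input: string): Iterable<{ segment: string }>
}
type SegmenterCtor = new (
  locales?: string,
  options?: { granularity: Granularity }
) => SegmenterLike

const Segmenter = (Intl as unknown as { Segmenter: SegmenterCtor }).Segmenter

// Hypothetical helper (not part of Satori): split text with Intl.Segmenter.
function segmentText(content: string, granularity: Granularity): string[] {
  const segmenter = new Segmenter(undefined, { granularity })
  return Array.from(segmenter.segment(content), (s) => s.segment)
}

// U+1FAF8 (rightwards pushing hand) followed by U+1F3FD (skin-tone modifier).
const emoji = '🫸🏽'

// Iterating a string yields one entry per code point, which reproduces the
// shape of the incorrect split: the base emoji and the modifier are separated.
const byCodePoint = Array.from(emoji)

// Grapheme segmentation keeps the base emoji and its modifier together.
const byGrapheme = segmentText(emoji, 'grapheme')

console.log(byCodePoint.length, byGrapheme.length)
```

The grapheme result stays in one piece because emoji modifiers extend the preceding character under the Unicode grapheme cluster rules, so the outcome does not depend on the segmenter knowing the Unicode 15 base emoji.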

A possible workaround, which I hope helps those experiencing similar issues:

  1. Specify the style wordBreak: 'break-all' or wordBreak: 'keep-all' on the text container that needs to display emoji sequences newer than Unicode 13.
  2. Customize loadAdditionalAsset or graphemeImages, because the Emoji Providers in the Playground do not support 🪈 or 🫸🏽.
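The two steps above could be combined roughly as follows. This is only a sketch under stated assumptions: EMOJI_BASE_URL and its file naming scheme are hypothetical placeholders for an emoji image source that actually ships Unicode 15 glyphs, toCodePoints is a helper invented for this example, and the fonts array is left empty here although Satori needs at least one font in real use. The satori call itself is shown only as a comment so that the snippet stays self-contained.

```typescript
// Hypothetical image source; replace with a provider that has Unicode 15 emojis.
const EMOJI_BASE_URL = 'https://example.com/emoji'

// Hypothetical helper: turn a grapheme into hyphen-joined lowercase hex code points.
function toCodePoints(grapheme: string): string {
  return Array.from(grapheme, (ch) => ch.codePointAt(0)!.toString(16)).join('-')
}

// Step 1: set wordBreak on the text container so Intl.Segmenter is used.
const element = {
  type: 'div',
  props: {
    style: { display: 'flex', fontSize: 64, wordBreak: 'break-all' },
    children: '🫸🏽 🪈',
  },
}

// Step 2: supply emoji images yourself instead of relying on the default providers.
const options = {
  width: 600,
  height: 400,
  fonts: [] as unknown[], // placeholder: load real font data here
  // Static mapping for individual graphemes.
  graphemeImages: {
    '🪈': `${EMOJI_BASE_URL}/${toCodePoints('🪈')}.svg`,
  },
  // Dynamic fallback: called for segments that the loaded fonts cannot render.
  loadAdditionalAsset: async (code: string, segment: string) => {
    if (code === 'emoji') {
      return `${EMOJI_BASE_URL}/${toCodePoints(segment)}.svg`
    }
    return [] // no extra fonts for other scripts in this sketch
  },
}

// In a real project (assumes `import satori from 'satori'` and loaded fonts):
// const svg = await satori(element, options)
console.log(toCodePoints('🫸🏽'), options.graphemeImages['🪈'])
```

Either graphemeImages or loadAdditionalAsset alone may be enough; both are shown only to illustrate the two options named in step 2.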

However, when wordBreak is not specified, Satori cannot correctly segment the emoji (🫸🏽) in the example. Would you consider using Intl.Segmenter for text segmentation in the default wordBreak path as well? I'm willing to help with further investigation if needed. @shuding