d3plus / d3plus-text

A smart SVG text box with line wrapping and automatic font size scaling.
MIT License

textSplit should not filter out all emoji unicode chars #107

Open snowyu opened 5 years ago

snowyu commented 5 years ago

Expected Behavior

The default textSplit function should keep all emoji Unicode characters, such as "🐉️🧚🏻‍♀️🧚🏻‍♂️", as well as the Chinese quotation marks “”.

Current Behavior

All emoji and Chinese quotation mark characters are filtered out.

Maybe it is related to #94 too.

Please use the Unicode Line Breaking Algorithm (UAX #14), which is described as follows:

Line breaking, also known as word wrapping, is the process of breaking a section of text into lines such that it will fit in the available width of a page, window or other display area. The Unicode Line Breaking Algorithm performs part of this process. Given an input text, it produces a set of positions called "break opportunities" that are appropriate points to begin a new line. The selection of actual line break positions from the set of break opportunities is not covered by the Unicode Line Breaking Algorithm, but is in the domain of higher level software with knowledge of the available width and the display size of the text.
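To illustrate the division of labour described in that passage, here is a minimal sketch of the "higher level" step: given break opportunities that some line breaker has already produced, greedily pick the actual line breaks for a maximum width. The function name chooseLines, the breaks array format (ascending string offsets ending at text.length), and the code-point-counting measure function are all assumptions made for illustration. Real code would measure rendered text width instead.

```javascript
// Hypothetical helper, not part of d3plus-text or any line-breaking library.
// text:     the full string
// breaks:   ascending offsets of break opportunities, last one = text.length
// maxWidth: maximum line width in the units returned by `measure`
// measure:  assumed width function; counting code points is only a stand-in
function chooseLines(text, breaks, maxWidth, measure = s => [...s].length) {
  const lines = [];
  let lineStart = 0; // offset where the current line begins
  let lastFit = 0;   // most recent opportunity already accepted on this line
  for (const pos of breaks) {
    // Trailing whitespace does not count towards the visible width.
    const candidate = text.slice(lineStart, pos).trimEnd();
    if (measure(candidate) > maxWidth && lastFit > lineStart) {
      // Extending to `pos` overflows, so end the line at the previous opportunity.
      lines.push(text.slice(lineStart, lastFit));
      lineStart = lastFit;
    }
    lastFit = pos;
  }
  // Flush whatever remains as the final line.
  if (lineStart < text.length) lines.push(text.slice(lineStart));
  return lines;
}
```

A segment that is wider than maxWidth on its own is kept on its own line rather than being cut, since cutting inside a segment would ignore the break opportunities.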

snowyu commented 5 years ago

Workaround:

import LineBreaker from "@craigmorton/linebreak";
import { textWrap } from "d3plus-text";

// textWrap is a factory: call it to get a configurable wrapper.
const wrapper = textWrap()
  .split(splitStr)
  .width(...);

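// Custom split function: cuts the sentence at every break opportunity
// reported by the line breaker, so no character is dropped.
function splitStr(sentence) {
  const breaker = new LineBreaker(sentence);
  const result = [];
  let bk;
  let lastPos = 0; // end offset of the previous piece
  // The loop ends when nextBreak() returns a falsy value (no more breaks).
  // eslint-disable-next-line no-cond-assign
  while (bk = breaker.nextBreak()) {
    // Take everything between the previous break and this one.
    const word = sentence.slice(lastPos, bk.position);
    lastPos = bk.position;
    result.push(word);
  }
  return result;
}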
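
Where adding a dependency is not an option, a rough alternative might look like the sketch below. It swaps in the built-in Intl.Segmenter with word granularity, which is word segmentation and not the Unicode Line Breaking Algorithm, so the pieces can differ from those produced by the workaround above. The name splitWithSegmenter and the default locale are hypothetical, and the sketch assumes a runtime that provides Intl.Segmenter.

```javascript
// Hypothetical alternative split function, not part of d3plus-text.
// Assumption: Intl.Segmenter is available in the runtime.
function splitWithSegmenter(sentence, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  const result = [];
  for (const { segment } of segmenter.segment(sentence)) {
    if (/^\s+$/.test(segment) && result.length > 0) {
      // Attach whitespace to the preceding piece so every character is kept.
      result[result.length - 1] += segment;
    } else {
      // Words, punctuation, and emoji each become their own piece.
      result.push(segment);
    }
  }
  return result;
}
```

Because the segments partition the input, joining the returned pieces always reproduces the original string, emoji and quotation marks included. It could be passed to the wrapper in the same way as splitStr, via .split(splitWithSegmenter).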