harfbuzz / harfbuzzjs

Providing HarfBuzz shaping library for client/server side JavaScript projects
https://harfbuzz.github.io/harfbuzzjs/

optimize setLanguage, setScript, shapeWithTrace #82

Closed chearon closed 1 year ago

chearon commented 1 year ago

I'm seeing createCString as a dominator when profiling a project I'm working on. Node.js text encoding/decoding is slow (see nodejs/node#39879). Browsers don't show much difference with this change.

Here's how I profiled it in harfbuzzjs:

diff --cc examples/hbjs.example.js
index b7718e9,b7718e9..75b4c3a
--- a/examples/hbjs.example.js
+++ b/examples/hbjs.example.js
@@@ -7,8 -7,8 +7,15 @@@ function example(hb, fontBlob, text) 
    font.setScale(1000, 1000); // Optional, if not given will be in font upem

    var buffer = hb.createBuffer();
++  console.time('c strings');
++  for (let i = 0; i < 1000; i++) {
++    buffer.setLanguage('en');
++    buffer.setScript('Latn');
++  }
++  console.timeEnd('c strings');
    buffer.addText(text || 'abc');
--  buffer.guessSegmentProperties();
++  buffer.setDirection('ltr');
++  buffer.setClusterLevel(1);
    // buffer.setDirection('ltr'); // optional as can be set by guessSegmentProperties also
    hb.shape(font, buffer); // features are not supported yet
    var result = buffer.json(font);

chearon commented 1 year ago

By the way, I also profiled harfbuzzjs against canvas's measureText in Chrome and Firefox yesterday. I was surprised at how well it performed. Unsurprisingly, measureText is faster, but doing word wrapping that way necessarily involves measuring words individually. When you compare that to shaping whole paragraphs in harfbuzzjs, harfbuzzjs is often only a few milliseconds slower before the browser's shape cache kicks in (I measured ~10 paragraphs).

I've also experimented with shaping word-by-word like Firefox does, but that's too many shape() calls, and calling into WASM incurs a cost. I think if whole paragraphs were shaped until enough data is collected, and shaping then switched to word-by-word, harfbuzzjs would become a serious choice for laying out text in a canvas.
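One possible reading of that hybrid idea, as a rough sketch only: shape whole paragraphs while a per-word cache is cold, and answer from the cache once every word in a paragraph has been seen. Everything here is hypothetical: `createMeasurer`, the `shapeParagraph` callback, and the fake shaper are stand-ins and not part of harfbuzzjs. The sketch also assumes a word's advance does not depend on its neighbors, which real shaping does not guarantee.

```javascript
// Hypothetical sketch: `shapeParagraph(text)` is an assumed callback that
// returns one advance per space-separated word. It stands in for a real
// shaping call and is NOT a harfbuzzjs API.
function createMeasurer(shapeParagraph) {
  const wordAdvances = new Map(); // word -> cached advance
  let shapeCalls = 0;             // how many times the shaper was invoked

  function splitWords(text) {
    return text.split(' ').filter((w) => w.length > 0);
  }

  function measure(paragraph) {
    const words = splitWords(paragraph);
    // Warm path: every word is cached, so no shaping call is needed.
    if (words.every((w) => wordAdvances.has(w))) {
      return words.map((w) => wordAdvances.get(w));
    }
    // Cold path: one shaping call for the whole paragraph, then record
    // the per-word advances for later lookups.
    shapeCalls++;
    const advances = shapeParagraph(paragraph);
    words.forEach((w, i) => wordAdvances.set(w, advances[i]));
    return advances;
  }

  return { measure, getShapeCalls: () => shapeCalls };
}

// Fake shaper for demonstration only: 500 units per character.
const fakeShape = (text) =>
  text.split(' ').filter((w) => w.length > 0).map((w) => w.length * 500);

const measurer = createMeasurer(fakeShape);
const first = measurer.measure('the quick fox');  // cold: calls the shaper
const second = measurer.measure('quick the fox'); // warm: served from cache
```

In this toy run the second call never reaches the shaper, which is the saving the comment above is after; a real implementation would still need to decide when cached advances are trustworthy.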

khaledhosny commented 1 year ago

Thanks, the use of TextEncoder here always felt like overkill.
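The change itself is not shown in this thread. As a rough illustration of the general idea, and assuming language and script tags such as 'en' and 'Latn' are plain ASCII, a C string can be written byte by byte without going through TextEncoder. The helper name `writeAsciiCString` and the `Uint8Array` standing in for WASM memory are hypothetical and are not taken from harfbuzzjs.

```javascript
// Hypothetical helper (not the harfbuzzjs implementation): copies an
// ASCII-only string into a byte buffer as a NUL-terminated C string.
function writeAsciiCString(heap, ptr, text) {
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    // Reject anything outside 7-bit ASCII instead of silently truncating.
    if (code > 0x7f) {
      throw new RangeError('non-ASCII character at index ' + i);
    }
    heap[ptr + i] = code;
  }
  heap[ptr + text.length] = 0; // NUL terminator
  return text.length + 1;      // bytes written, including the terminator
}

// A plain Uint8Array stands in for WASM linear memory in this sketch.
const heap = new Uint8Array(64);
const written = writeAsciiCString(heap, 8, 'Latn');
```

For short tags this avoids allocating an intermediate encoded array on every call; strings that may contain non-ASCII text would still need a real UTF-8 encoder.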