opentypejs / opentype.js

Read and write OpenType fonts using JavaScript.
https://opentype.js.org/
MIT License
4.37k stars 469 forks source link

Ligatures performance #286

Open axkibe opened 7 years ago

axkibe commented 7 years ago

As I've been working through the ligatures code, I noticed it recreates the ligatures collection on each and every call to stringToGlyphs()

Font.prototype.stringToGlyphs = function(s, options) {

....

    if (options.features) {
        var script = options.script || this.substitution.getDefaultScriptName();
        var manyToOne = [];
        if (options.features.liga) manyToOne = manyToOne.concat(this.substitution.getFeature('liga', script, options.language));
        if (options.features.rlig) manyToOne = manyToOne.concat(this.substitution.getFeature('rlig', script, options.language));
        for (i = 0; i < length; i += 1) {
            for (var j = 0; j < manyToOne.length; j++) {

as for a given language/script these created objects doesn't change, this ought to be created only once. It should not only safe the time to create these object but also remove the strain from the garbage collector.

PS: I also noted it makes a linear search over all enabled ligatures which also IMO should either be lookup tables or a binary search.

PPS: Also options.language and options.script are undocumented features used to enable certain ligatures, for example 'ffi' in DejaVuSans-Regular is only active when options.script is set to 'latn' as it isn't enable in 'DFLT' as this is a multi-language font.

I'm currently a little confused what the difference between options.language and options.script is, so I don't see myself fit to be the one to expand the README.

fpirsch commented 7 years ago

Hi @axkibe,

unfortunately the list of ligatures can change quite a lot, depending on the script, language or features selected. Maybe this list could be memoized, but checking the options to retrieve the right memoized set would not necessarily be cheaper. Memoization consumes memory, too.

The search is not optimized at all. The manyToOne list could be sorted to allow binary-search on the 1st glyph. At the time I wrote this, the list contained at most 10 ligatures on my test fonts, so there was no big win here. But I agree that sooner or later optimizing this will become necessary.

options.script is the font-defined script name, e.g. "DFLT" or "latn" options.language is the font-defined language name, e.g. "ENG " or "DEU ". (see Language system tags) The font.tables.gsub.scripts object gives you an overview of the available tags for a specific font.

axkibe commented 7 years ago

As I assume normal use* of stringToGlyphs() refers to a lot of calls with the same script/language/features after each other, I suppose caching the last result would be reasonable an recalculate it, whenever something new arrives, no need to cache various combos tough.

'* At least somebody that sets text. I suppose for font editors its also more or less the same to have more calls to stringToGlyphs with same features set. Would be interested in other use scenarios of opentype.js.

axkibe commented 7 years ago

Suggestion: How about we split the function into two?

one calculateOptionsSet() -- looking for a better name. and stringToGlyphs() as it is.

The API user is free to use either stringToGlyphs as it is, with calculating stuff on the fly, or provide instead of the plain 'options' object a precalculated one... and it's up to the API user to cache it.

yne commented 2 years ago

@axkibe can this issue be closed ?