yeslogic / allsorts

Font parser, shaping engine, and subsetter implemented in Rust
https://yeslogic.com/blog/allsorts-rust-font-shaping-engine/
Apache License 2.0

allsorts seems to ignore shaping rules on the Latin range #44

Closed by ctrlcctrlv 3 years ago

ctrlcctrlv commented 3 years ago

Comparison of hb-shape and allsorts against my font, FRB American Cursive:

allsorts

target/debug/allsorts shape -f ~/Workspace/FRBAmericanCursive/dist/FRBAmericanCursive-400-Regular.otf -s dflt -l DFLT "Shape me!"
glyphs: [
    Info {
        glyph: RawGlyph {
            unicodes: ['S'],
            glyph_index: 55,
            liga_component_pos: 0,
            glyph_origin: Direct,
            small_caps: false,
            multi_subst_dup: false,
            is_vert_alt: false,
            fake_bold: false,
            fake_italic: false,
            variation: Some(
                VS15,
            ),
            extra_data: (),
        },
        kerning: 0,
        placement: None,
        mark_placement: None,
        is_mark: false,
    },
    Info {
        glyph: RawGlyph {
            unicodes: ['h'],
            glyph_index: 314,
            liga_component_pos: 0,
            glyph_origin: Direct,
            small_caps: false,
            multi_subst_dup: false,
            is_vert_alt: false,
            fake_bold: false,
            fake_italic: false,
            variation: Some(
                VS15,
            ),
            extra_data: (),
        },
        kerning: 0,
        placement: None,
        mark_placement: None,
        is_mark: false,
    },
    Info {
        glyph: RawGlyph {
            unicodes: ['a'],
            glyph_index: 79,
            liga_component_pos: 0,
            glyph_origin: Direct,
            small_caps: false,
            multi_subst_dup: false,
            is_vert_alt: false,
            fake_bold: false,
            fake_italic: false,
            variation: Some(
                VS15,
            ),
            extra_data: (),
        },
        kerning: 0,
        placement: None,
        mark_placement: None,
        is_mark: false,
    },
    Info {
        glyph: RawGlyph {
            unicodes: ['p'],
            glyph_index: 356,
            liga_component_pos: 0,
            glyph_origin: Direct,
            small_caps: false,
            multi_subst_dup: false,
            is_vert_alt: false,
            fake_bold: false,
            fake_italic: false,
            variation: Some(
                VS15,
            ),
            extra_data: (),
        },
        kerning: 0,
        placement: None,
        mark_placement: None,
        is_mark: false,
    },
    Info {
        glyph: RawGlyph {
            unicodes: ['e'],
            glyph_index: 285,
            liga_component_pos: 0,
            glyph_origin: Direct,
            small_caps: false,
            multi_subst_dup: false,
            is_vert_alt: false,
            fake_bold: false,
            fake_italic: false,
            variation: Some(
                VS15,
            ),
            extra_data: (),
        },
        kerning: 0,
        placement: None,
        mark_placement: None,
        is_mark: false,
    },
    Info {
        glyph: RawGlyph {
            unicodes: ['e'],
            glyph_index: 393,
            liga_component_pos: 0,
            glyph_origin: Direct,
            small_caps: false,
            multi_subst_dup: true,
            is_vert_alt: false,
            fake_bold: false,
            fake_italic: false,
            variation: Some(
                VS15,
            ),
            extra_data: (),
        },
        kerning: 0,
        placement: None,
        mark_placement: None,
        is_mark: false,
    },
    Info {
        glyph: RawGlyph {
            unicodes: [' '],
            glyph_index: 386,
            liga_component_pos: 0,
            glyph_origin: Char(
                ' ',
            ),
            small_caps: false,
            multi_subst_dup: false,
            is_vert_alt: false,
            fake_bold: false,
            fake_italic: false,
            variation: Some(
                VS15,
            ),
            extra_data: (),
        },
        kerning: 0,
        placement: None,
        mark_placement: None,
        is_mark: false,
    },
    Info {
        glyph: RawGlyph {
            unicodes: ['m'],
            glyph_index: 331,
            liga_component_pos: 0,
            glyph_origin: Char(
                'm',
            ),
            small_caps: false,
            multi_subst_dup: false,
            is_vert_alt: false,
            fake_bold: false,
            fake_italic: false,
            variation: Some(
                VS15,
            ),
            extra_data: (),
        },
        kerning: 0,
        placement: None,
        mark_placement: None,
        is_mark: false,
    },
    Info {
        glyph: RawGlyph {
            unicodes: ['e'],
            glyph_index: 285,
            liga_component_pos: 0,
            glyph_origin: Direct,
            small_caps: false,
            multi_subst_dup: false,
            is_vert_alt: false,
            fake_bold: false,
            fake_italic: false,
            variation: Some(
                VS15,
            ),
            extra_data: (),
        },
        kerning: 0,
        placement: None,
        mark_placement: None,
        is_mark: false,
    },
    Info {
        glyph: RawGlyph {
            unicodes: ['e'],
            glyph_index: 393,
            liga_component_pos: 0,
            glyph_origin: Direct,
            small_caps: false,
            multi_subst_dup: true,
            is_vert_alt: false,
            fake_bold: false,
            fake_italic: false,
            variation: Some(
                VS15,
            ),
            extra_data: (),
        },
        kerning: 0,
        placement: None,
        mark_placement: None,
        is_mark: false,
    },
    Info {
        glyph: RawGlyph {
            unicodes: ['!'],
            glyph_index: 294,
            liga_component_pos: 0,
            glyph_origin: Char(
                '!',
            ),
            small_caps: false,
            multi_subst_dup: false,
            is_vert_alt: false,
            fake_bold: false,
            fake_italic: false,
            variation: Some(
                VS15,
            ),
            extra_data: (),
        },
        kerning: 0,
        placement: None,
        mark_placement: None,
        is_mark: false,
    },
]

hb-shape

hb-shape ~/Workspace/FRBAmericanCursive/dist/FRBAmericanCursive-400-Regular.otf "Shape me!"
[S.rhigh=0+689|h.high=1+507|a.low=2+565|p.low=3+658|e.low=4+309|tail.low=4+171|space=5+400|m=6+841|e.low=7+309|tail.low=7+171|exclam=8+276]

wezm commented 3 years ago

Which particular differences did you notice? Different glyphs in the output or positioning information (or something else)?

ctrlcctrlv commented 3 years ago

I didn't even check whether the positioning is correct, because it's choosing the wrong glyphs.

"Shape me!" (using glyphs [S|h|a|p|e|space|m|e|exclam])

image

"Shape me!" (using GSUB rules w/feature ccmp: [S.rhigh|h.high|a.low|p.low|e.low|tail.low|space|m|e.low|tail.low|exclam])

image

hb-shape command line which will just show you the glyphs:

hb-shape dist/FRBAmericanCursive-400-Regular.otf "Shape me!" --no-positions --ned

ctrlcctrlv commented 3 years ago

The thing about this font is that even a single Latin letter needs GSUB to render correctly, to get the tail.

$ hb-shape dist/FRBAmericanCursive-400-Regular.otf "a" --no-positions --ned
[a|tail.lowwide]
$ hb-shape dist/FRBAmericanCursive-400-Regular.otf "a" --no-positions --ned --features=-ccmp
[a]
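
To picture what ccmp is doing here, the following is a toy sketch of a one-to-many substitution, not the font's actual GSUB data and not Allsorts code. The names `example_rules` and `apply_multiple_subst` are hypothetical, and the single rule is hand-copied from the hb-shape output above; the real rules in the font may well be contextual. Each glyph produced by a rule keeps the cluster of the glyph it replaced, which is why hb-shape reports the same cluster number for a letter and its tail.

```rust
use std::collections::HashMap;

/// Hypothetical rule table mirroring the hb-shape output above; not read from the font.
fn example_rules() -> HashMap<&'static str, Vec<&'static str>> {
    HashMap::from([("a", vec!["a", "tail.lowwide"])])
}

/// Apply one-to-many substitutions to a run of glyph names.
/// Returns (glyph name, cluster) pairs; every glyph produced by a rule
/// inherits the cluster (input position) of the glyph it replaces.
fn apply_multiple_subst(
    input: &[&str],
    rules: &HashMap<&str, Vec<&str>>,
) -> Vec<(String, usize)> {
    let mut out = Vec::new();
    for (cluster, name) in input.iter().enumerate() {
        match rules.get(*name) {
            Some(replacement) => {
                for new_name in replacement {
                    out.push((new_name.to_string(), cluster));
                }
            }
            // No rule: the glyph passes through unchanged.
            None => out.push((name.to_string(), cluster)),
        }
    }
    out
}
```

Calling `apply_multiple_subst(&["a"], &example_rules())` yields the two glyphs `a` and `tail.lowwide`, both in cluster 0; with an empty rule table the input passes through unchanged, which corresponds to the `--features=-ccmp` run.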

Another font of mine that requires GSUB to render correctly is TT2020.

[fred@laptop FRBAmericanCursive]$ hb-shape ../TT2020/dist/TT2020StyleB-Regular.ttf "Shape me!" --no-positions --ned 
[S|h.2|a.3|p.4|e.5|space.6|m.7|e.8|exclam.9]

But the command ~/Workspace/allsorts-tools/target/debug/allsorts shape -f ../TT2020/dist/TT2020StyleB-Regular.ttf -s DFLT -l dflt "Shape me!" results in similarly bad output.

wezm commented 3 years ago

Thanks for the info. I'll look into it. I'm pretty confident that it will work one way or another. It's possibly a limitation of the allsorts binary as opposed to the library. If I feed this document into Prince (which uses Allsorts for font parsing and shaping) I get the right output.

<html>
  <head>
    <style>
    @font-face {
        font-family: "FRB American Cursive";
        font-weight: normal;
        font-style: normal;
        font-stretch: normal;
        src: url("/home/wmoore/Downloads/FRBAmericanCursive/dist/FRBAmericanCursive-400-Regular.otf")
    }
    body {
      font-family: "FRB American Cursive";
      font-size: 48pt;
    }
    </style>
  </head>
  <body>
    <p>Shape me!</p>
  </body>
</html>

Prince output

ctrlcctrlv commented 3 years ago

That is interesting. Could it be a failure to fall back when script/lang is DFLT/dflt?

wezm commented 3 years ago

OK, so I looked into it, and as far as I can see the Allsorts output matches HarfBuzz. It's just that the format of the output we generate is different from hb-shape's.

$ allsorts shape -f ~/Downloads/FRBAmericanCursive/dist/FRBAmericanCursive-400-Regular.otf -s dflt -l DFLT "Shape me" | grep glyph_index
            glyph_index: 55,
            glyph_index: 314,
            glyph_index: 79,
            glyph_index: 356,
            glyph_index: 285,
            glyph_index: 393,
            glyph_index: 386,
            glyph_index: 331,
            glyph_index: 285,
            glyph_index: 393,
$ hb-shape ~/Downloads/FRBAmericanCursive/dist/FRBAmericanCursive-400-Regular.otf "Shape me" --no-glyph-names --no-positions
[55=0|314=1|79=2|356=3|285=4|393=4|386=5|331=6|285=7|393=7]

Comparing these you can see that the same glyphs are present after shaping.
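
For anyone who wants to automate this comparison rather than eyeball it, here is a minimal sketch that pulls the glyph ids out of both text formats shown above. The function names `parse_hb_shape` and `parse_allsorts_dump` are made up for illustration and are not part of Allsorts or HarfBuzz. It assumes hb-shape was run with `--no-glyph-names --no-positions` and that the allsorts output is the Debug dump with `glyph_index: N,` lines.

```rust
/// Extract glyph ids from hb-shape output produced with
/// `--no-glyph-names --no-positions`, e.g. "[55=0|314=1]".
fn parse_hb_shape(line: &str) -> Vec<u16> {
    line.trim()
        .trim_start_matches('[')
        .trim_end_matches(']')
        .split('|')
        // Each item is "glyph_id=cluster"; keep the part before '='.
        .filter_map(|item| item.split('=').next())
        .filter_map(|id| id.trim().parse().ok())
        .collect()
}

/// Extract glyph ids from the `glyph_index: N,` lines of the allsorts Debug dump.
/// Lines that do not start with `glyph_index:` are ignored.
fn parse_allsorts_dump(dump: &str) -> Vec<u16> {
    dump.lines()
        .filter_map(|line| line.trim().strip_prefix("glyph_index:"))
        .filter_map(|rest| rest.trim().trim_end_matches(',').parse().ok())
        .collect()
}
```

If the two returned vectors are equal, both shapers chose the same glyphs in the same order. Cluster values and positions are deliberately ignored in this sketch.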

ctrlcctrlv commented 3 years ago

Would you accept a patch to output the glyph name and not just its index?

wezm commented 3 years ago

If the idea is to resolve the glyph name from the glyph id in the output glyphs within allsorts-tools then I think that would be fine. Currently the shaping output is just a Debug dump of the Vec of glyphs, so there's plenty of room for improvement.

There's already some glyph name related code present for the dump sub-command that might be a useful reference.
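
As a rough idea of the output side of such a patch, the sketch below renders a list of glyph ids in hb-shape's `[name|name|...]` style. It does not use any Allsorts API: `format_glyph_names` and `example_names` are hypothetical, and the lookup table is hand-built by lining up the hb-shape names and the allsorts glyph indexes posted earlier in this thread. A real implementation would read the names from the font (for example its `post` or CFF table) instead.

```rust
use std::collections::HashMap;

/// Hypothetical id-to-name table, hand-copied from the outputs in this thread.
/// A real tool would look names up in the font rather than hard-code them.
fn example_names() -> HashMap<u16, &'static str> {
    HashMap::from([
        (55, "S.rhigh"),
        (314, "h.high"),
        (79, "a.low"),
        (356, "p.low"),
        (285, "e.low"),
        (393, "tail.low"),
        (386, "space"),
        (331, "m"),
        (294, "exclam"),
    ])
}

/// Render glyph ids in hb-shape's `[name|name|...]` style.
/// Ids missing from `names` fall back to a `gidN` placeholder.
fn format_glyph_names(glyph_ids: &[u16], names: &HashMap<u16, &str>) -> String {
    let parts: Vec<String> = glyph_ids
        .iter()
        .map(|id| match names.get(id) {
            Some(name) => name.to_string(),
            None => format!("gid{}", id),
        })
        .collect();
    format!("[{}]", parts.join("|"))
}
```

Feeding it the eleven glyph indexes from the original "Shape me!" dump with `example_names()` gives a bracketed, pipe-separated list that can be compared directly against `hb-shape --no-positions --ned`.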

ctrlcctrlv commented 3 years ago

Thanks. I'll try to open a PR to do that. :)

Considering this closed because, as you showed, the glyph indexes are actually correct; I was just reading the output wrong.