Unable to build a matching Gulzar 1.002

MattMatic commented 1 month ago

Even after making adjustments to the makefile, the font does not match released Gulzar.

As an example: سائنس (science) creates a clash in the HAMZA_ABOVE and the adjacent Nukta.

Examining the debug build with Crowbar (super helpful!!), there is an additional GSUB rlig stage after "connections" that invokes "DotAvoidance_reverse_top" that appears to detect "BEmsd2" and "HAMZA_ABOVE" and incorrectly change to "HAMZA_ABOVE.one" - creating the clash.

This appears to originate from karakul's DotAvoidance.py - which detects "Old TE schema", and from what I understand triggers a dot_combinations rule (though I'm not 100% sure).

Is it possible I have a Python library mismatch since 2022-Nov-14?

I have tried with Gulzar 1.000, Gulzar 1.001 - all with the same results.
I've tried pulling the libraries back to specific commit points that were in place on 2022-Nov-14 - same results.

What have I missed?

MattMatic commented 1 month ago

Built Gulzar (debug): Screenshot 2024-06-10 093608

Released Gulzar 1.002: Screenshot 2024-06-10 093621

Crowbar for Built (debug): Screenshot 2024-06-10 093426

Crowbar for Released: Screenshot 2024-06-10 093444

Note the extra GSUB rlig stage before the GPOS that involves the DotAvoidance_reverse_top stage.

MomongaFont commented 1 month ago

@simoncozens

I'm interested in Gulzar, and I've created a ttf file by adjusting the Makefile. However, I've noticed that the font does not match between my generated ttf and the released Gulzar.ttf file.

For example, I've noticed that the HAMZA_ABOVE and the adjacent Nukta clash in the two words سائنس (science) and سائنسی (scientific). I think this might be caused by the DotAvoidance.py script in karakul, just like @MattMatic mentioned.

So, it would be appreciated if you could check the following points.

Is there anything I've missed?
Should I adjust the fez files based on Gulzar's engineering.md to avoid clashes?

Thank you.

simoncozens commented 1 month ago

I'm sorry about this; I should really have been more diligent about pinning Python dependency versions for the release build. In my defence I was building a lot of the tooling on the fly and wanted to make sure I always had the latest versions in the build...

I will try and take a look at this but it's going to take me a while to get around to it and then to get my head back into the Gulzar headspace - sorry...

MattMatic commented 1 month ago

I completely understand! Really appreciate any help at all. 😊 One nice side effect is that I'm beginning to get my head into the right space. 🤯

Am switching to build on macOS as well. It's highlighted some other issues, but when built it still produces the same issue as WSL.

FWIW, have installed libraries only under venv, and these seemed crucial:

rustup needs to be installed
HarfBuzz needs to be installed for the command line tools (I used Homebrew)
Created venv with python3 -m venv venv
Libraries installed in this order (IIRC):
-- fontbakery 0.8.13 (otherwise Maturin errors about config.toml)
-- babelfont 3.0.1 (unsure if this is a requirement)
-- fontFeatures 1.7.4
-- fez-language 1.3.4
-- then pip install -r requirements.txt while under venv
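
A minimal sketch of how the versions listed above could be checked inside the venv before building. The pins are copied from the list; the helper names (`installed_version`, `find_mismatches`) are hypothetical, and the package names are assumed to match what pip reports.

```python
from importlib import metadata

# Pins copied from the list above; everything else here is illustrative.
EXPECTED_PINS = {
    "fontbakery": "0.8.13",
    "babelfont": "3.0.1",
    "fontFeatures": "1.7.4",
    "fez-language": "1.3.4",
}


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package whose version differs from its pin to (expected, found)."""
    mismatches = {}
    for package, expected in pins.items():
        found = lookup(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in find_mismatches(EXPECTED_PINS).items():
        print(f"{package}: expected {expected}, found {found or 'not installed'}")
```

Run with the venv's interpreter, this prints nothing when every listed package matches its pin.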

Am currently working backwards from DotAvoidance to rules.csv and back to Gulzar.glyphs to understand what might have happened.

MomongaFont commented 4 weeks ago

@simoncozens

Thank you so much for your quick reply! I really appreciate it. I'm looking forward to hearing from you, and I'll also keep researching in the meantime.

Thank you.

MattMatic commented 3 weeks ago

@simoncozens I'm not sure if this is relevant, but I found it interesting...

I tapped into DotAvoidance.py of karakul to build an SVG output of the collisions, to get some insight into the test above. (The marks that are checked are shown alone on the left, and the full set of glyphs in the test sequence on the right.)

If I understand this correctly, the nastaliqConnections from Gulzar.glyphs is extracted into rules.csv, and then a sequence of all combinations of marks is tested for collisions - but only at the font's design positions. What's interesting is that the DotAvoidance appears to be checking for collision in the default Y position of the marks (i.e. at an early stage).

The SVG for HAMZA_ABOVE + sda is colliding by the smallest amount - wonder if this changed somewhere down the line with the Python libraries - but the later GPOS would've made them not collide anyway.

Is the early collision check done for the benefit of GSUB/GPOS building, or could the moving of the marks be handled in GPOS as a last stage? (Even thinking about either option is frying my head... so please discard if this is nonsense!)

MattMatic commented 2 weeks ago

Update: I believe collidoscope has an error in the scale_path method. It looks like each path is being centered on its own bounding box, rather than on the overall glyph bounding box.

Background: (Simon - I know you know this, but the debug process might help others)

I started working through Fez + karakul and trying to understand how everything fits together. In DotAvoidance.py there's position_glyphs, and I see that it's very neatly pulling out the anchors from the Glyphs font, working out exit-to-entry cursive points, and matching up the "top" anchor of a base glyph to the "_top" anchor of the mark glyph. And Fez has nicely created additional anchors for "top.one", "top.two" etc. Very cool. 😎

Wrote some rough code to output the state of the paths and glyphs and positions, and try to tie them all back to what should be output.

After a few hours of puzzling, I notice the "BEmsd2" glyph shown above in red is wrong. Really wrong. The 'tail' is almost separate from the 'body'. 😟

So, I manually check the paths in Gulzar.glyphs, and compare manually to B_E_msd2.glif - they seem to match almost 100%. I only have FontLab 8, but it pulls in the UFO format and shows pretty much the same glyph (Bezier issues notwithstanding), and I can clearly see the two paths that should overlap. I dig into DotAvoidance to spit out an SVG of just "BEmsd2" - and it's wrong. 😟

Then I work through and find that collidoscope is scaling each path. To test this theory, I hack into DotAvoidance to do this:

        paths = self.c.get_beziers("BEmsd2")
        self.report_txt.write("Beziers:%s\n" % (paths))
        self.report_txt.write("\t<svg preserveAspectRatio=\"xMidYMax\" width=150 height=100 viewBox=\"120 330 474 262\">")
        for p in paths:
            self.report_txt.write("<path d=\"%s\" fill=\"red\"/>\n" % (p.to_svg()))
        self.report_txt.write("</svg>\n")

After a bit of manual cut-n-paste into HTML, I get the SVG output:

So, then I add the scale call:

        paths = self.c.get_beziers("BEmsd2")
        paths = [self.c.scale_path(p) for p in paths]  # ----whoops!
        self.report_txt.write("Beziers:%s\n" % (paths))
        self.report_txt.write("\t<svg preserveAspectRatio=\"xMidYMax\" width=150 height=100 viewBox=\"120 330 474 262\">")
        for p in paths:
            self.report_txt.write("<path d=\"%s\" fill=\"red\"/>\n" % (p.to_svg()))
        self.report_txt.write("</svg>\n")

And get the dud output:

This is definitely a big part of the build mismatch issue.... I suspect even though karakul DotAvoidance is placing the individual glyphs in the right space, the paths have been wrongly constructed and are creating a collision that shouldn't happen.

@simoncozens - I hope this helps short-cut some of your dev time and checks!

MattMatic commented 2 weeks ago

...and setting the scale_factor = 1.00 in the call in DotAvoidance seems like it's working (though some extra tolerance gap would probably be a good idea - like the original 1.22 value) 😎

        self.c = Collidoscope(self.parser.font, { "marks": True, "bases": False, "faraway": True}, scale_factor = 1.00)

Will give some more thought to the scaling function of collidoscope, and will resume other tests later.

MattMatic commented 2 weeks ago

(Apologies if this is too much info)

By comparing the centroids of each step of the scale_factor, it appears that combining the path transformations in one line produces the wrong result.

(scale_path of collidoscope's __init__.py)

        transform = out * scale * in_
        return transform * p

This pushes the x,y origins further out, relative to the scale_factor. Showing 2.0x, 1.5x, and 1.0x:

Separate Transformations

Executing the transformations one at a time produces the expected centroid-centric result: (scale_path of collidoscope's __init__.py)

        p2 = out * p
        p2 = scale * p2
        p2 = in_ * p2
        return p2
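
The difference between the two snippets above comes down to the order in which the three steps are applied. Below is a minimal sketch in plain Python, not collidoscope or kurbopy code: `SimpleTransform`, `scale_about` and `scale_about_reversed` are hypothetical names and the centroid value is made up. It assumes nothing about which order the library's `*` operator uses; it only shows that composing the same steps in the reverse order leaves a 1.0x scale untouched but moves the centroid further from the origin as the factor grows.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class SimpleTransform:
    """Uniform scale followed by a translation: p -> factor * p + (tx, ty)."""
    factor: float
    tx: float
    ty: float

    def apply(self, p: Point) -> Point:
        return (self.factor * p[0] + self.tx, self.factor * p[1] + self.ty)

    def then(self, later: "SimpleTransform") -> "SimpleTransform":
        """Return one transform equal to applying self first, then `later`."""
        return SimpleTransform(
            later.factor * self.factor,
            later.factor * self.tx + later.tx,
            later.factor * self.ty + later.ty,
        )


def translate(dx: float, dy: float) -> SimpleTransform:
    return SimpleTransform(1.0, dx, dy)


def scaling(factor: float) -> SimpleTransform:
    return SimpleTransform(factor, 0.0, 0.0)


def scale_about(center: Point, factor: float) -> SimpleTransform:
    """Move `center` to the origin, scale, then move it back."""
    cx, cy = center
    return translate(-cx, -cy).then(scaling(factor)).then(translate(cx, cy))


def scale_about_reversed(center: Point, factor: float) -> SimpleTransform:
    """The same three steps applied in the opposite order."""
    cx, cy = center
    return translate(cx, cy).then(scaling(factor)).then(translate(-cx, -cy))


if __name__ == "__main__":
    centroid = (350.0, 450.0)  # made-up centroid for illustration
    for factor in (2.0, 1.5, 1.0):
        good = scale_about(centroid, factor).apply(centroid)
        bad = scale_about_reversed(centroid, factor).apply(centroid)
        print(factor, good, bad)
```

In the intended order the centroid stays where it is for every factor. In the reversed order a centroid c ends up at (2 * factor - 1) * c, which would be consistent with the paths drifting apart at 1.22 and looking correct at 1.00.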

I would argue that using a centroid is not equivalent to expanding the stroke outwards - which should be the behaviour for collision detection. Other geometry libraries refer to this as a "buffer", while photo editors might use the term "expand" (as if the path has been stroked by a larger pen).
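
A small worked example of why the two are not equivalent, using a made-up stroke-like rectangle 400 units long and 40 units thick (both function names are hypothetical). Scaling about the centre adds a margin proportional to the shape's extent in each direction, while a buffer adds the same distance along every straight side.

```python
def centroid_scale_margins(width, height, factor):
    """Growth per side when a width x height box is scaled about its centre."""
    grow = factor - 1.0
    return (grow * width / 2.0, grow * height / 2.0)


def buffer_margins(distance):
    """Growth per side for a uniform outward expansion (a 'buffer')."""
    return (distance, distance)


if __name__ == "__main__":
    # Hypothetical stroke, 400 units long and 40 units thick.
    print(centroid_scale_margins(400.0, 40.0, 1.22))
    print(buffer_margins(10.0))
```

At a factor of 1.22 the rectangle gains roughly 44 units at each end but only about 4.4 units above and below, so the collision tolerance depends on the glyph's shape rather than being a fixed gap.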

Summary

Likely issue: the transformation is producing unexpected results (in collidoscope)
Possible issue: the centroid scaling is not the same as stroke expansion
Possible issue: scale_factor of 1.22 might be too large? (but this seems to work with corrected geometry)

googlefonts / Gulzar

Unable to build a matching Gulzar 1.002 #125
