Proposal: maintain a list of text runs as OpenType shaping test cases

Lorp commented 3 years ago

We often hear, during informal conversations, of interesting text runs that illustrate potential issues in text shaping stacks, be it in Unicode, in a particular shaping engine, or in the building of OpenType Layout tables. This led me to propose during Font Text CG Meeting 7 (2021-01-12) that a list of interesting text runs be centrally maintained on GitHub for testing shaping engines, whether in production or development.

So in this issue I want to note down some initial ideas about how it could work.

Unicode maintains a GitHub repo containing a list of text runs (potentially 1000s).
Test fonts may be hosted, but common proprietary fonts need to be referenced too.
Each text run would be executed regularly on multiple platforms (hopefully all widely used platforms).
The results of each rendering would be published in the repo.
Results as PNG, SVG, JSON (glyphIDs and x,y offsets).
A web front end presents all text strings along with visual results by platform.
Platform vendors operate APIs such that new text strings are rendered and published at least daily.
Platform vendors set up a server that polls GitHub for new text strings and contributes renderings.
The above platform APIs render past, current and canary versions.
The above platform APIs allow for live usage for font software engineers such as people in FTCG.
Native format of the text runs to be decided. HTML? Should all test cases be little websites?
No fundamental need to limit this to OpenType, so potentially show ACE renderings too.
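
As a rough illustration of the JSON results mentioned in the list above (glyph IDs and x,y offsets), here is a minimal Python sketch of one result record. The field names (`text`, `font`, `platform`, `glyph_id`, `x_offset`, `y_offset`) and the overall layout are assumptions made for this example, not an agreed format.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class GlyphPlacement:
    """One shaped glyph: its glyph ID and x,y offsets in font units."""
    glyph_id: int
    x_offset: int
    y_offset: int


@dataclass
class ShapingResult:
    """Hypothetical record for one text run shaped on one platform."""
    text: str
    font: str
    platform: str
    glyphs: List[GlyphPlacement]


def to_json(result: ShapingResult) -> str:
    # asdict() recurses into the nested GlyphPlacement dataclasses.
    return json.dumps(asdict(result), ensure_ascii=False, sort_keys=True)


def from_json(payload: str) -> ShapingResult:
    data = json.loads(payload)
    glyphs = [GlyphPlacement(**g) for g in data.pop("glyphs")]
    return ShapingResult(glyphs=glyphs, **data)
```

Sorting the keys keeps the serialized records stable, so results published to a repo would diff cleanly between runs.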

There’s precedent in the OpenType Variations text rendering tests maintained by @brawer, which I understand were very useful.

I’d be interested to hear comments.

simoncozens commented 3 years ago

Your idea currently looks a bit end-to-end, in that you have both shaping and rendering. These are separate processes and it may make sense to test them separately. In terms of a repository of interesting fonts and strings which trip up the shaping engine, the Harfbuzz test suite is already fairly extensive. (I’m using it to test my own shaping engine.)

Lorp commented 3 years ago

Thanks for the comment. Can you link to the Harfbuzz test suite?

simoncozens commented 3 years ago

In-house shaping texts: https://github.com/harfbuzz/harfbuzz/tree/master/test/shaping/texts/in-house
Shaping tests with expectations: https://github.com/harfbuzz/harfbuzz/tree/master/test/shaping/data/in-house/tests
AOTS tests: https://github.com/harfbuzz/harfbuzz/tree/master/test/shaping/data/aots/tests

Lorp commented 3 years ago

Thanks for these Simon, that’s useful in itself. What I am proposing would give font makers and shaper authors an indication of why that text string is present in the list. What is desirable? What is bad? What do various implementations do with various fonts?

Lorp commented 3 years ago

Sorry – I see the second link does indeed define glyph positioning expectations for particular fonts and character sequences.

If shaper implementors find this format sufficient, and visual feedback is not useful for validation by human script experts, then I suppose my proposal is moot. Do implementors other than HarfBuzz implementors make use of these lists?

simoncozens commented 3 years ago

I use the Harfbuzz test suite as-is for testing the shaper library inside fontFeatures. YesLogic uses the aots suite, and then also checks its output against Harfbuzz's output.

simoncozens commented 3 years ago

I would also say that visual feedback doesn't scale. The Harfbuzz suite has over a thousand tests in it. It's much better to have an automated system than to have to wade through those and compare them visually.
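
To make the automated approach concrete, here is a minimal Python sketch that compares an expected glyph string against an observed one. It assumes the bracketed notation that `hb-shape` prints by default, `[name=cluster@x_offset,y_offset+x_advance|...]`, and handles horizontal advances only. The helper names `parse_glyph_string` and `first_mismatch` are hypothetical and not part of any test suite.

```python
import re
from typing import List, NamedTuple, Optional


class Glyph(NamedTuple):
    name: str
    cluster: int
    x_offset: int
    y_offset: int
    x_advance: int


# name=cluster, optional @dx,dy offsets, optional +advance (horizontal only).
_GLYPH = re.compile(
    r"(?P<name>[^=]+)=(?P<cluster>\d+)"
    r"(?:@(?P<dx>-?\d+),(?P<dy>-?\d+))?"
    r"(?:\+(?P<adv>-?\d+))?"
)


def parse_glyph_string(s: str) -> List[Glyph]:
    s = s.strip()
    if not (s.startswith("[") and s.endswith("]")):
        raise ValueError(f"not a bracketed glyph string: {s!r}")
    body = s[1:-1]
    if not body:
        return []
    glyphs = []
    for item in body.split("|"):
        m = _GLYPH.fullmatch(item)
        if m is None:
            raise ValueError(f"unrecognised glyph entry: {item!r}")
        glyphs.append(Glyph(
            m["name"], int(m["cluster"]),
            int(m["dx"] or 0), int(m["dy"] or 0), int(m["adv"] or 0),
        ))
    return glyphs


def first_mismatch(expected: str, observed: str) -> Optional[int]:
    """Return the index of the first differing glyph, or None if equal."""
    exp, obs = parse_glyph_string(expected), parse_glyph_string(observed)
    for i, (a, b) in enumerate(zip(exp, obs)):
        if a != b:
            return i
    if len(exp) != len(obs):
        return min(len(exp), len(obs))
    return None
```

Reporting the index of the first mismatch, rather than a bare pass or fail, gives a human reviewer a starting point without having to inspect every glyph.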

frivoal commented 3 years ago

@simoncozens Presumably engines other than HarfBuzz have equivalent test repositories. Are the other ones open / public as well? Could they be encouraged to become public if they are not yet? Could different engines start sharing their tests, similarly to how browser engines share their tests in https://github.com/web-platform-tests/wpt?

simoncozens commented 3 years ago

Here's the situation for the shaping engines that I'm aware of:

Harfbuzz: Open source test suite as described above.
fontFeatures.shaperLib: Uses Harfbuzz's test suite.
YesLogic Allsorts: Uses the aots suite and maintains a corpus of text; the test suite shapes the text with Allsorts and Harfbuzz and compares the output. Also has a small number of Rust-based shaping tests.
Uniscribe / DirectWrite: Closed source tests (presumably).
CoreText: Closed source tests (presumably).
FontForge: No shaping tests.

brawer commented 3 years ago

“I would also say that visual feedback doesn't scale.”

FWIW, the Unicode text rendering test suite has automated “visual” conformance tests. For each test case, the engine under test is asked to compute Bézier paths for the shaped glyphs. Then, the test suite compares the observed Bézier path against the expected outcome. The test suite still accepts a small deviation in control point positions, currently 2 font units, so that implementations may slightly differ in their rounding. Finally, the expected and observed Bézier paths get converted to SVG for inclusion in the test reports. For an example with lots of test failures, see this report here.
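
The tolerance check described above could be sketched roughly as follows in Python. This is not the test suite's actual code: the path representation (a list of command and coordinate tuples) and the per-coordinate interpretation of the 2-font-unit tolerance are assumptions made for illustration, and `paths_match` is a hypothetical name.

```python
from typing import Sequence, Tuple

# One path segment: an SVG-style command letter plus its coordinates,
# e.g. ("M", (0, 0)) or ("Q", (50, 100, 100, 0)).
Segment = Tuple[str, Tuple[float, ...]]


def paths_match(
    expected: Sequence[Segment],
    observed: Sequence[Segment],
    tolerance: float = 2.0,  # font units; per-coordinate is an assumption
) -> bool:
    """True if both paths have the same commands and nearby coordinates."""
    if len(expected) != len(observed):
        return False
    for (cmd_e, pts_e), (cmd_o, pts_o) in zip(expected, observed):
        # The structure must agree exactly; only coordinates may deviate.
        if cmd_e != cmd_o or len(pts_e) != len(pts_o):
            return False
        if any(abs(a - b) > tolerance for a, b in zip(pts_e, pts_o)):
            return False
    return True
```

Requiring identical command sequences while tolerating small coordinate differences lets implementations round differently without masking a genuinely different outline.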

By the way, it is also really useful to include tests such as GSUB-3 or MORX-34 whose only expected outcome is that the engine doesn’t crash or hang.
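
A "doesn't crash or hang" test of this kind might be driven as in the Python sketch below, which runs the engine as a child process with a time limit. The function name `survives`, the 10-second default, and the idea of invoking the engine as a command line are assumptions for illustration. The crash check relies on POSIX behaviour, where a negative return code means the process was killed by a signal.

```python
import subprocess
from typing import Sequence


def survives(command: Sequence[str], timeout_s: float = 10.0) -> bool:
    """Run a command; report False if it hangs or dies from a signal."""
    try:
        proc = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The child was killed after exceeding the time limit: a hang.
        return False
    # On POSIX, a negative return code means termination by a signal
    # (for example SIGSEGV). A non-zero exit status is still accepted,
    # since rejecting a malformed font cleanly is not a crash.
    return proc.returncode >= 0
```

Running the engine out of process means a crash or an infinite loop in the engine cannot take the test harness down with it.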

Contributions very welcome (actually, if someone wants to take over maintenance, do tell; I don’t have much time for this these days). Most existing tests are currently about shaping, but it doesn’t have to stay that way. To avoid the copyright problem with proprietary fonts (mentioned in the initial post on this issue): When coming across a font that triggered a bug in some shaping engine, we’ve either reverse engineered the problem into a small test font with custom OpenType tables and fresh glyph designs, or else we’ve asked the copyright owner to contribute a subsetted font to Unicode. The copyright owners were usually very responsive and helpful, since Unicode is considered a non-threatening organization.

w3c / font-text-cg

Proposal: maintain a list of text runs as OpenType shaping test cases #41