sillsdev / ptx2pdf

XeTeX based macro package for typesetting USFM formatted (Paratext output) scripture files

Warning: Harfbuzz library versions before 8.0.0 give rendering/layout issues with font Annapurna SIL (maybe others) #964

Open davidg-sil opened 4 months ago

davidg-sil commented 4 months ago

XeTeX uses the harfbuzz library "behind the scenes" when producing the text. We've identified a problem with the way that Annapurna is handled in harfbuzz versions 7.3.0 and lower, as found in Ubuntu 22.04 and 20.04 and on at least some Windows machines. Other fonts and scripts may also be affected. The bug seems to result in harfbuzz under-reporting to XeTeX how wide some words will be, and/or in extra or lost spaces, which causes excursions into the margins and layout changes. It seems to be triggered only a few times per page (maybe by certain words or letter combinations), but the effects on layout can be significant.

The good news for users of Ubuntu-based Linux distributions is that 24.04 has a significantly updated harfbuzz library which does not show the problem.

[ Edit: it used to be unclear which version between 8.3 and 6.0 had the issue. We now know that harfbuzz 8.0.0 is the first release that does not have the issue.]
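To see which side of that threshold a given machine is on, a small check along these lines may help. It is only a sketch: it assumes that `xetex --version` prints a line of the form "Compiled with HarfBuzz version X; using Y" (worth confirming on your own install), and the function names are made up for illustration.

```python
import re
import subprocess

# First harfbuzz release reported in this issue as not showing the problem.
FIRST_GOOD = (8, 0, 0)


def harfbuzz_in_use(version_text):
    """Return the harfbuzz version XeTeX reports using, as a tuple, or None.

    Assumes a line like 'Compiled with HarfBuzz version 7.3.0; using 7.3.0'.
    """
    match = re.search(
        r"HarfBuzz version\s+[\d.]+;\s*using\s+(\d+(?:\.\d+)*)", version_text
    )
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_affected(version):
    """True if the version tuple is older than the first known-good release."""
    padded = tuple(version) + (0,) * (3 - len(version))
    return padded < FIRST_GOOD


def xetex_version_text():
    """Run `xetex --version` and return its output ('' if xetex is not found)."""
    try:
        result = subprocess.run(
            ["xetex", "--version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return ""
    return result.stdout
```

Calling `harfbuzz_in_use(xetex_version_text())` and passing the result to `is_affected` (when it is not `None`) gives a quick yes/no. The "using" number is the one that matters here, since that is the library actually loaded at run time.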

It is also possible to compile and install a newer harfbuzz version; installing the resulting .deb packages solves the problem. (In tracking down the version I tried multiple versions of the library on Ubuntu 22.04; a couple of changes are needed, but it's not too tricky to build.)

Unfortunately, various dependencies make it hard to compile a version of harfbuzz without this bug on Ubuntu 20.04.

markpenny commented 4 months ago

The good news is that this issue doesn't affect any Windows users for a change! :-)

davidg-sil commented 3 months ago

Update to Mark's assertion... he's seeing it with Harfbuzz 7.0.1 on Windows.

davidg-sil commented 3 months ago

I've done some bisecting... Harfbuzz 7.3 still displays the issue, Harfbuzz 8.0.0 does not. I've updated the main issue to reflect this.
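For anyone repeating this with another font, the search can be organised as an ordinary binary search over releases. The sketch below is not the procedure actually used here: the release list is illustrative and incomplete, and `is_bad` is a hypothetical stand-in for "build that version, typeset a test page, and look for the problem".

```python
def first_good(versions, is_bad):
    """Return the first version for which is_bad() is False, or None.

    `versions` must be ordered oldest to newest, and the bug is assumed to be
    present in every release up to some point and absent afterwards.
    """
    lo, hi = 0, len(versions)  # the answer lies in versions[lo:hi], if anywhere
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            lo = mid + 1  # still broken: the first good release is later
        else:
            hi = mid  # works: the first good release is this one or earlier
    return versions[lo] if lo < len(versions) else None


# Illustrative, incomplete list of harfbuzz releases, oldest first.
releases = ["6.0.0", "7.0.1", "7.3.0", "8.0.0", "8.3.0"]


def looks_bad(version):
    # Stand-in for a real build-and-inspect step, using the result reported above.
    return tuple(int(p) for p in version.split(".")) < (8, 0, 0)


print(first_good(releases, looks_bad))
```

Each call to `is_bad` costs a full build, so halving the range each time keeps the number of builds small even with a long release list.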

davidg-sil commented 3 months ago

Instructions for the brave on Ubuntu 22.04 systems. (Unavailable dependencies mean this doesn't work on 20.04.)

  1. Have all the relevant build tools and -dev versions of libraries installed. (Sorry, I can't remember the full list.)
  2. From a relevant directory (e.g. ~/src), git clone https://salsa.debian.org/freedesktop-team/harfbuzz and cd into the harfbuzz directory.
  3. git checkout debian/8.3.0-2 (you may pick another version if you want to try... )
  4. Comment out the call to chafa_set_n_threads, if you don't have that function:
    
    diff --git a/util/hb-info.cc b/util/hb-info.cc
    index e514f9055..d0cdcaa12 100644
    --- a/util/hb-info.cc
    +++ b/util/hb-info.cc
    @@ -1250,7 +1250,7 @@ struct info_t :
     free (palette);
     palette = nullptr;

devosb commented 3 months ago

The current build of TeX Live 2024 (which is where the XeTeX binary for the Windows build of PTXprint comes from) uses HarfBuzz 8.3.0. I had not updated since I thought @mhosken said there was an issue with TeX Live 2024.

devosb commented 3 months ago

I wonder if the change in HarfBuzz was intentional. If not, then there would not be a test that covers this issue, and so there is a possibility that future versions of HarfBuzz could revert to the older, broken behaviour. I would suggest filing an issue with HarfBuzz, so that the commit that introduced the behaviour can be found and, more importantly, a test can be added.

mhosken commented 1 month ago

There are a number of things that, if changed, can cause layout differences in a project. The version of harfbuzz is one of these, even in Latin script!