learningequality / kolibri

Kolibri Learning Platform: the offline app for universal education
https://learningequality.org/kolibri/
MIT License

Khmer(km) font (and other scripts with ascenders + descenders) render badly broken in Firefox #7725

Closed: arky closed 3 years ago

arky commented 3 years ago

Observed behavior

Screenshot from 2020-12-14 21-36-58

Khmer rendering is incorrect in the latest Firefox on Ubuntu GNU/Linux. Somehow Google Chrome handles this much better.

The developer tools console in Firefox reports the following (not sure if this is related):

    [WARN: kolibri/core/assets/src/utils/setupAndLoadFonts.js] Could not load full font for 'km' setupAndLoadFonts.js:78:14

In Firefox, the Noto Khmer font seems to be rejected by the sanitizer:

    downloadable font: rejected by sanitizer (font-family: "noto-full" style:normal weight:400 stretch:100 src index:0) source: http://127.0.0.1:8009/static/assets/fonts/noto-full.NotoSansKhmer.400.woff

Expected behavior

This is how the font is rendered in the latest Google Chrome.

Screenshot from 2020-12-14 21-51-04

User-facing consequences

Khmer users cannot use Kolibri in Firefox …

Errors and logs

Steps to reproduce

Context

radinamatic commented 3 years ago

Seeing the same errors in Firefox on Windows.

2020-12-14_16-12-58

cc @jonboiser @nucleogenesis

arky commented 3 years ago

@radinamatic This problem also affects other South Asian languages. https://kolibri-demo.learningequality.org/hi-in/learn/#/topics

radinamatic commented 3 years ago

After testing all the locales on the demo server in Firefox, I'm not sure the "downloadable font: rejected by sanitizer" error will help, as it also appears for non-South Asian languages. Sometimes it's there on the first load of a new locale, and sometimes it appears only after reloading the page. So far the only locales where I didn't see the error are Chinese and Korean. Do they use the Noto font at all?

Also worth noting: the same console message appears in Chrome too, but it is treated as a warning there rather than as an error, as in Firefox.

However, the only locales where I managed to notice a visible font break in Firefox (not technically tofu artifacts, but rather a weird dotted circle) are:

jonboiser commented 3 years ago

I've been comparing 0.14.3 and 0.14.5 and here are some big differences I'm noticing.

0.14.3

0.14.5

image

So Chrome looks like it's trying to mix and match fonts to render the message. For native readers, the message will be legible, but the "copy-paste" quality is noticeable. Firefox might just be using a less aggressive strategy to salvage the message.

On Firefox, the analogous font tool shows a similar difference between messages that look "broken" (they attempt to use a system font) and those that do not (they use only the Noto font).

image

In a nutshell, the app is broken for multiple locales in both Chrome and Firefox; it's just more obvious in Firefox.

jonboiser commented 3 years ago

I looked at the code diff between 0.14.3 and 0.14.5 and there are no changes to our font code, so my hunch now is that Google or someone might have moved some of the online resources we use to build our fonts.

@indirectlylit

radinamatic commented 3 years ago
  1. First feedback from the user:

    On Wed, Dec 23, 2020 at 2:01 AM:

    We noticed something weird: Khmer fonts are displayed wrong. We didn't notice it before because it is OK in Kolibri Studio and on the previous hardware. Now that we are testing on devices other than the old tablet, it is displayed wrong, as in the following example:

    Right font from Kolibri Studio: image

    Wrong font from Kolibri server: image2

    I know you don't read Khmer, but I guess you can see that the + symbols should not be there.

  2. My reply:

    On Mon, Jan 4, 2021 at 12:47 PM Radina Matic radina@learningequality.org wrote:

    We have been tracking the font rendering issue in the Kolibri GitHub code repository, but thought it was limited to the Firefox browser. Would you confirm that you are using Firefox on the tablets that you mention, and whether the fonts are rendered correctly in Chrome? That information would help us debug the issue. We don't have in-house Khmer speakers, which makes it more challenging to notice problems in the localized version. For example, if we look at the attached image, we can see several instances of the '+' symbol that you mention in Firefox, but also some in Chrome. Are those also incorrect? It would be of great help if you could provide as many details as possible about your setup, such as the make and model of the tablet, the versions of the browsers used, the token of the channel you are testing on, etc.

    Khmer-on-Firefox-Chrome

  3. Their last reply:

    Date: Mon, Jan 4, 2021 at 8:25 AM

    The problem is quite strange: the rendering in Firefox is wrong on all platforms, but it is correct in some instances of Chrome and not in others. The one in your snapshot is wrong; the + symbol doesn't exist in Khmer. Debugging is not easy, as we have two different tablets with the same Chrome version, but only one of the two shows the fonts properly. Thanks for the GitHub link, I will forward it to our team to follow up on the issue.

indirectlylit commented 3 years ago

> I looked at the code diff between 0.14.3 and 0.14.5 and there are no changes to our font code, so my hunch now is that Google or someone might have moved some of the online resources we use to build our fonts.

The font files are pinned in this file:

https://github.com/learningequality/kolibri/blob/release-v0.14.x/build_tools/i18n/noto_source/manifest.json

The Khmer source files still download fine, and it shouldn't be possible for them to have changed:

    "NotoSansKhmer": {
      "bold_url": "https://raw.githubusercontent.com/googlei18n/noto-fonts/c30307083469f0c05e216ac75216fd454a517858/phaseIII_only/hinted/ttf/NotoSansKhmerUI/NotoSansKhmerUI-Bold.ttf",
      "reg_url": "https://raw.githubusercontent.com/googlei18n/noto-fonts/c30307083469f0c05e216ac75216fd454a517858/phaseIII_only/hinted/ttf/NotoSansKhmerUI/NotoSansKhmerUI-Regular.ttf"
    },
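One way to double-check the claim that the sources cannot have changed is to confirm that each URL embeds a full commit hash rather than a branch name. The sketch below assumes manifest entries have the shape shown in the excerpt (a font name mapping to `bold_url` and `reg_url`); the helper name `unpinned_urls` is hypothetical and is not part of Kolibri's build tools.

```python
import re

# A raw.githubusercontent.com URL is immutable only if the ref segment is a
# full 40-character commit SHA (a branch name such as "master" can move).
PINNED = re.compile(
    r"^https://raw\.githubusercontent\.com/[^/]+/[^/]+/[0-9a-f]{40}/"
)


def unpinned_urls(fonts):
    """Return (font_name, key, url) for every URL not pinned to a commit."""
    problems = []
    for name, entry in fonts.items():
        for key, url in entry.items():
            if key.endswith("_url") and not PINNED.match(url):
                problems.append((name, key, url))
    return problems


# Entry copied from the manifest excerpt quoted in this comment.
fonts = {
    "NotoSansKhmer": {
        "bold_url": "https://raw.githubusercontent.com/googlei18n/noto-fonts/c30307083469f0c05e216ac75216fd454a517858/phaseIII_only/hinted/ttf/NotoSansKhmerUI/NotoSansKhmerUI-Bold.ttf",
        "reg_url": "https://raw.githubusercontent.com/googlei18n/noto-fonts/c30307083469f0c05e216ac75216fd454a517858/phaseIII_only/hinted/ttf/NotoSansKhmerUI/NotoSansKhmerUI-Regular.ttf",
    }
}
print(unpinned_urls(fonts))  # an empty list means every URL is pinned
```

An empty result only shows that the inputs are fixed; it says nothing about whether the files produced from them by the build are intact.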

One thing worth comparing between cases that work and cases that don't is which of the following files are being loaded:

I can't explain why this would be a regression, but one hypothesis: in broken cases, it's using only the subset variant. Early on, I had trouble getting Hindi ligatures to display correctly for subset variants, which resulted in similar behavior to this:

image

jonboiser commented 3 years ago

> The Khmer source files still download fine, and it shouldn't be possible for them to have changed:

Where is the TTF to WOFF conversion happening? Between the versions, the WOFF files seem to have changed. The 0.14.5 WOFF files look corrupted because they aren't the right size.

Compare https://kolibri-training.learningequality.org/static/assets/fonts/noto-full.NotoSansKhmer.400.woff (200 B) with the full font file that you get when running 0.14.3.
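A size comparison like this can be turned into a quick structural check. The sketch below is not Kolibri code; it relies only on the WOFF 1.0 header layout (a 44-byte header whose first four bytes are the signature `wOFF` and whose bytes 8 to 11 hold the total file length, big-endian). The function name `check_woff` is hypothetical.

```python
import struct

WOFF_HEADER_SIZE = 44  # fixed header size defined by the WOFF 1.0 format


def check_woff(data):
    """Return a list of problems found in the raw bytes of a .woff file."""
    if len(data) < WOFF_HEADER_SIZE:
        return ["file is only %d bytes, shorter than a WOFF header" % len(data)]
    problems = []
    # Bytes 0-3: signature; bytes 8-11: total file length (big-endian uint32).
    signature = data[0:4]
    (declared_length,) = struct.unpack(">I", data[8:12])
    if signature != b"wOFF":
        problems.append("bad signature %r, expected b'wOFF'" % signature)
    elif declared_length != len(data):
        problems.append(
            "header declares %d bytes but file has %d" % (declared_length, len(data))
        )
    return problems
```

For a downloaded file this would be called as `check_woff(open(path, "rb").read())`. A file that is not really a font would be expected to fail the signature or length check, which is the kind of inconsistency a browser's font sanitizer is likely to reject.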

indirectlylit commented 3 years ago

FYI, the dotted circle is Unicode U+25CC, and is used as a replacement character when a ligature or diacritic-modified character cannot be fully rendered.

> Where is the TTF to WOFF conversion happening?
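To make the code point concrete, the following sketch uses Python's standard `unicodedata` module to name U+25CC and to list the characters in the Khmer word for "Khmer". Characters whose general category starts with `M` are combining marks that depend on a base letter; these are the ones a renderer may fall back to drawing on a dotted circle when it cannot attach them. The helper `describe` is a hypothetical name used only for illustration.

```python
import unicodedata

print(unicodedata.name("\u25cc"))  # DOTTED CIRCLE


def describe(text):
    """Return (code point, Unicode name, general category) per character."""
    return [
        ("U+%04X" % ord(ch), unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


# The word "Khmer" in Khmer script: three letters plus two dependent signs.
for row in describe("\u1781\u17d2\u1798\u17c2\u179a"):
    print(row)
```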

It happens in two stages using the fontTools library.

The TTF is loaded here:

https://github.com/learningequality/kolibri/blob/7d49bed730aa7e43cf01d61739ad1f6a3dc621e3/build_tools/i18n/fonts.py#L108

and the full woff is written here:

https://github.com/learningequality/kolibri/blob/7d49bed730aa7e43cf01d61739ad1f6a3dc621e3/build_tools/i18n/fonts.py#L288-L292

jonboiser commented 3 years ago

We suspect that this might be caused by an issue in the build system related to git LFS and the font files. We've made a temporary change to our build system that should address the issue, and I will tag a new 0.14.6 pre-release right now to do some testing.
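If Git LFS is involved, one plausible failure mode is that the build packaged LFS pointer files instead of the real fonts. The thread does not confirm this, so treat it as an assumption. Pointer files are small text files that begin with a fixed `version` line, which makes them easy to detect. The sketch below is not part of Kolibri's build system, and `parse_lfs_pointer` is a hypothetical helper name.

```python
LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1\n"


def parse_lfs_pointer(data):
    """Return {'oid': ..., 'size': ...} if data is a Git LFS pointer, else None."""
    # Pointer files are short text files; the LFS spec keeps them under 1024 bytes.
    if len(data) >= 1024 or not data.startswith(LFS_PREFIX):
        return None
    fields = {}
    for line in data.decode("utf-8", errors="replace").splitlines()[1:]:
        key, _, value = line.partition(" ")
        fields[key] = value
    # A valid pointer carries an object id and the size of the real file.
    if "oid" not in fields or not fields.get("size", "").isdigit():
        return None
    return {"oid": fields["oid"], "size": int(fields["size"])}
```

Running this over every file in the built `fonts` directory and failing the build on any non-`None` result would catch this class of problem before release.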

arky commented 3 years ago

Thanks for working on this, @jonboiser. Let me know when this fix is released so I can do some user testing.

CC @TukTuk-Charity

radinamatic commented 3 years ago

@arky Please download and test the latest 0.14.6alpha release, and let us know if it improves the issue you've seen.

arky commented 3 years ago

@radinamatic I have tested the 0.14.6alpha release on my development machine. It does fix the font issue for Khmer and Hindi.

Once the PPA repo packages are created for this release, I will push a build so our team can do further testing.

jredrejo commented 3 years ago

Hi @arky, I'm doing the PPA release. Which Ubuntu versions do you need supported?

jredrejo commented 3 years ago

Focal, Xenial, Bionic and Trusty are built and released. If you need another one, please let me know.

arky commented 3 years ago

Thanks, @jredrejo. Our deployments are done on Raspberry Pi OS Lite (based on Debian GNU/Linux Buster).

https://downloads.raspberrypi.org/raspios_lite_armhf/images/raspios_lite_armhf-2020-12-04/2020-12-02-raspios-buster-armhf-lite.zip

jredrejo commented 3 years ago

On Buster you can choose between different PPA series. I hope the ones I released cover your needs; if not, I can add the one you need.

jonboiser commented 3 years ago

This should be fixed in 0.14.6.