notofonts / balinese

Noto Balinese
SIL Open Font License 1.1

Balinese shaping issues #22

Closed jungshik closed 5 years ago

jungshik commented 8 years ago

Spun off from notofonts/noto-fonts#543

@kmansourMT gave us some test strings he used. I shaped them with harfbuzz using two versions of Noto Sans Balinese: the current version (with both DFLT and non-DFLT OT tables) and my local version with ONLY DFLT.

The latter looks better, but both have shaping issues. I'll add more details shortly.

We need to sort this out.

/cc @roozbehp @behdad

jungshik commented 8 years ago

I made a test file based on the VOLT screenshots sent by @kmansourMT.

test.uni.txt : test input (\u-escaped)

test.txt : test input in UTF-8

Left is the rendered result with the current font (1.03: both DFLT and non-DFLT) and right is the result with the DFLT-only font. When non-DFLT (Balinese) is present, harfbuzz routes it to USE.

image

jungshik commented 8 years ago

As you can see in the screenshot, both versions of the font have issues but often they have different issues.

@kmansourMT : The first syllable in the first line has U+1B13 U+1B00 U+1B38. Is it correct to have U+1B00 between U+1B13 (consonant) and U+1B38 (dependent vowel)? I couldn't find any information as to where in a syllable U+1B00 should go. If it's like signs in other Brahmi-derived scripts, it should go after a vowel, shouldn't it?

I'm gonna try it on Windows 10 + Edge. BTW, what layout engines were used to test Balinese?

jungshik commented 8 years ago

@kmansourMT : I have two questions in the above comment:

  1. Is "U+1B13 U+1B00 U+1B38" in Line 1 a valid sequence? Can U+1B00 come between a consonant and a dependent vowel? USE seems to consider it illegal. ( balinese-volt-proofing.PNG )
  2. Line 4: Can two dependent vowels come in a row? ( Bal_VOLT_reorder-final1.PNG )

jungshik commented 8 years ago

Below is the screenshot of IE 11 on Windows 10. The current version works better, but is not perfect. The two questions I asked in the previous comment are still applicable.

image

jungshik commented 8 years ago

This is Edge on Windows 10 rendering http://jungshik.github.io/noto/balinese/test.html. Interestingly, Edge's rendering/shaping is different from that of IE 11 on Windows 10. Edge is closer to harfbuzz (and browsers using harfbuzz - Chrome and Firefox) but is still different from harfbuzz.

screen shot 2015-11-19 at 10 13 38 am

jungshik commented 8 years ago

/cc @tiroj

jungshik commented 8 years ago

I updated http://jungshik.github.io/noto/balinese/test.html to have links to Volt proofing images from @kmansourMT

roozbehp commented 8 years ago

I just looked at Line 1 and Line 4. The order of the characters in both cases should be changed. See the USE spec at https://www.microsoft.com/typography/OpenTypeDev/USE/intro.htm:

For Line 1, USE expects Bindus (VM) to appear after vowels. So the test string should be <U+1B13, U+1B38, U+1B00> instead.

For Line 4, USE expects left-side vowels (VPre) to appear before above vowels (VAbv) and right-side vowels (VPst). So the test string should be <U+1B13, U+1B3E, U+1B36>, etc.
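
As a rough illustration of the ordering rule described above, the Python sketch below checks whether the marks in a test cluster follow the order VPre, VAbv, VBlw, VPst, then vowel modifiers (VM). The category table is hand-written for just the code points discussed in this thread and is an assumption for illustration (in particular the VBlw and VPst entries and the single collapsed VM rank); it is not the USE data itself, and `is_use_ordered` is a hypothetical helper name.

```python
# Simplified ranks: base first, then vowels in USE order, then vowel modifiers.
# Assumption for illustration only; the real USE cluster model has more classes.
RANK = {"B": 0, "VPre": 1, "VAbv": 2, "VBlw": 3, "VPst": 4, "VM": 5}

# Hand-written categories for the code points mentioned in this thread.
CATEGORY = {
    0x1B13: "B",     # BALINESE LETTER KA
    0x1B3E: "VPre",  # VOWEL SIGN TALING (left side)
    0x1B3F: "VPre",  # VOWEL SIGN TALING REPA (left side)
    0x1B36: "VAbv",  # VOWEL SIGN ULU (above)
    0x1B38: "VBlw",  # VOWEL SIGN SUKU (below)
    0x1B35: "VPst",  # VOWEL SIGN TEDUNG (right side)
    0x1B00: "VM",    # SIGN ULU RICEM (bindu-like sign)
}


def is_use_ordered(code_points):
    """Return True if the categories never go backwards in rank."""
    ranks = [RANK[CATEGORY[cp]] for cp in code_points]
    return ranks == sorted(ranks)


if __name__ == "__main__":
    tests = {
        "line 1 (original)": [0x1B13, 0x1B00, 0x1B38],
        "line 1a (fixed)": [0x1B13, 0x1B38, 0x1B00],
        "line 4 (original)": [0x1B13, 0x1B36, 0x1B3E],
        "line 4a (fixed)": [0x1B13, 0x1B3E, 0x1B36],
    }
    for name, cps in tests.items():
        print(name, is_use_ordered(cps))
```

This only compares ranks within one simple cluster; it does not handle conjuncts (U+1B44) or validate that a cluster has exactly one base.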

jungshik commented 8 years ago

Thanks, @roozbehp. I updated my test page. Line 1a and line 4 have updated sequences per @roozbehp's comment and they're shaped well.

Back to @kmansourMT : @roozbehp told me that left-vowel sign should come before other vowels (when two vowels come in a row) unless they form a 'diphthong'.

Have you seen <U+1B36, U+1B3E> and other pairs (where left-vowel sign such as U+1B3E or U+1B3F is put after another non-left vowel sign) in your test (line 4) form a diphthong? Is that why you're testing those sequences (the 2nd and 3rd syllables in http://jungshik.github.io/noto/balinese/Bal_VOLT_reorder-final1.PNG )?

@roozbehp : how does USE handle 'diphthong' cases (where visually 'left vowel sign' should come after another vowel sign )?

jungshik commented 8 years ago

/cc @NorbertLindenberg

BTW, a test adopted from @NorbertLindenberg's (#163) shows that reordering with U+1B3[EF], U+1B4[01], etc. does not happen with harfbuzz when non-DFLT OpenType tables are dropped. That is, reordering with dist in DFLT does not work in harfbuzz. OTOH, Edge on Win 10 seems to be fine with or without the non-DFLT table.

See the screenshots below taken of http://jungshik.github.io/noto/balinese/left_vowel.html (top in each cell uses Noto Sans Balinese with both DFLT and non-DFLT and bottom in each cell uses the font without non-DFLT).

kmansourMT commented 8 years ago

Regarding: <<Have you seen <U+1B36, U+1B3E> and other pairs (where left-vowel sign such as U+1B3E or U+1B3F is put after another non-left vowel sign) in your test (line 4) form a diphthong? Is that why you're testing those sequences (the 2nd and 3rd syllables in http://jungshik.github.io/noto/balinese/Bal_VOLT_reorder-final1.PNG )? >>

At the time of testing, our designer did not know details about vowel ordering to such an extent, but was primarily interested in verifying the reordering.

roozbehp commented 8 years ago

@roozbehp : how does USE handle 'diphthong' cases (where visually 'left vowel sign' should come after another vowel sign)?

I'm not sure I understand the question. But I'll try to answer anyway:

a) If the question is what happens if there's a left-side "I" vowel and a right-side "E" vowel and the language uses both to represent the "EI" linguistic diphthong and it's written visually as <I><CONSONANT><E>, USE basically says ignore the actual linguistic order and encode it as <CONSONANT><I><E>.

b) If the question is how one would represent the visual sequence <CONSONANT><I><E> where "I" is a left-side vowel and "E" is a right-side vowel, I've yet to see such a sequence on paper. But if it indeed exists, one needs to bring it to the attention of the Unicode and OpenType Layout communities.

jungshik commented 8 years ago

@roozbehp Thank you. My question was a) and you answered the question although I'd not like having to type <consonant><I><E> for <consonant><EI> if I were a speaker of such a hypothetical language ;-). Well, it's not likely that there's such a language-script pair.

jungshik commented 8 years ago

At the time of testing, our designer did not know details about vowel ordering to such an extent, but was primarily interested in verifying the reordering.

Ok. Are there any other strings in http://jungshik.github.io/noto/balinese/test.html that you know are not valid? We know that line 1 (the 1st syllable) is invalid.

jungshik commented 8 years ago

To sort out font issues and potential harfbuzz issues, I'm putting up the shaping result by Edge on Win 10 (with the latest test page http://jungshik.github.io/noto/balinese/test.html ) and Chrome (harfbuzz):

kmansourMT commented 8 years ago

Had I known of these errors, I would have told you.

jungshik commented 8 years ago

Given the result shared in https://github.com/googlei18n/noto-fonts/issues/572#issuecomment-158246038 (<consonant> <left-vowel> reordering does not work at all in harfbuzz with non-DFLT tables dropped), I'll focus on the results obtained with both DFLT and non-DFLT tables present (harfbuzz will use USE in that case for Balinese).

Line 1a: USE on Windows matches harfbuzz. Everything seems all right (except for the position of an 'above-base mark' related to notofonts/balinese#21 )

Line 2: ditto

Line 3: In both Windows USE and harfbuzz, the 2nd and 3rd syllables have two 'below-base marks' overlapping each other (related: notofonts/balinese#20).

Line 4a

Line 5

Line 6

Line 7: hb fails to reorder U+1B3E (left-vowel) wrt a conjunct in the 2nd syllable

Line 8 and 9 : Both USE and hb are fine.


@behdad, can you take a look at the following three? Thanks. (the test page :+1: http://jungshik.github.io/noto/balinese/test.html ).

behdad commented 8 years ago

@behdad, can you take a look at the following three? Thanks. (the test page :+1: http://jungshik.github.io/noto/balinese/test.html ).

For lines 4a, 5 and 7, hb disagrees with USE on Windows and USE is correct afaict. So, it's likely to be a harfbuzz issue. I'll file a bug against harfbuzz.

Line 3: Both USE and harfbuzz have the same bad shaping result. It's not yet known whether it's a font issue or a shaping engine issue.

Line 6: USE and harfbuzz do not agree, but both fail to reorder. Again, it can be either a font issue or an engine issue.

@jungshik please file a harfbuzz bug with just enough details to reproduce it. Jonathan and I will take a look at them in London in December. Thanks

jungshik commented 8 years ago

The first issue in my previous comment (lines 4a, 5 and 7) was filed against harfbuzz as shown above.

The 2nd and the 3rd issues are tricky because it's not clear who to blame, font or engine.

@kmansourMT, have you ever gotten the expected shaping result with any engine for line 3 and line 6?

NorbertLindenberg commented 8 years ago

Jungshik, I assume wherever you said "Javanese" in this issue, you really meant Balinese?

NorbertLindenberg commented 8 years ago

Is there any documentation from Microsoft on how the script-specific and DFLT feature lists interact with script engines in OpenType? I haven't been able to find such documentation. In particular, once the renderer has identified a run of, say, Balinese characters:

– If the renderer doesn't have a script engine supporting Balinese, does it still use feature lists for script "bali", or are those ignored and "DFLT" used instead?

– If the renderer does have a script engine supporting Balinese (the USE), but the font doesn't have feature lists for the script "bali", does the renderer still use the script engine supporting Balinese (the USE), or does it use a fallback engine that doesn't know anything about reordering Balinese vowels?

Answers to these questions might help me understand some of the behavior seen here.

NorbertLindenberg commented 8 years ago

Sequences of two or more dependent vowels are allowed by the USE and worth testing. Some sequences occur in real life: 1B3A 1B35, 1B3C 1B35, 1B3E 1B35, 1B3F 1B35, 1B42 1B35. As Roozbeh said, the USE requires them to appear in a specific order.
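
Each of the pairs listed above should also have a precomposed canonical equivalent in the Balinese block, so a test file probably wants to cover both spellings. A small sketch using only Python's standard `unicodedata` module to list them (the helper names `fmt` and `nfc_report` are made up for this example):

```python
import unicodedata

# The two-vowel sequences listed in the comment above.
PAIRS = [
    "\u1B3A\u1B35",
    "\u1B3C\u1B35",
    "\u1B3E\u1B35",
    "\u1B3F\u1B35",
    "\u1B42\u1B35",
]


def fmt(text):
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def nfc_report(pairs):
    """Return (decomposed, NFC-composed) code point strings for each pair."""
    return [(fmt(p), fmt(unicodedata.normalize("NFC", p))) for p in pairs]


if __name__ == "__main__":
    for decomposed, composed in nfc_report(PAIRS):
        print(f"{decomposed} -> {composed}")
```

A shaping test would then feed both the decomposed and the composed form of each sequence to the engine and compare the glyph output.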

jungshik commented 8 years ago

Sequences of two or more dependent vowels are allowed by the USE and worth testing.

Line 4a has a couple of cases for that. Line 4 has the order reversed and didn't work with USE/harfbuzz. Line 4a (order corrected) works with both. I can try more.

jungshik commented 8 years ago

Is there any documentation from Microsoft on how the script-specific and DFLT feature lists interact with script engines in OpenType?

In case of Harfbuzz, if 'bali' is present in a font, hb's implementation of USE is used. If not, the default shaping engine is used (with DFLT tables).
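
To make the rule above concrete, here is a toy Python model of the selection as described in this comment. It is not harfbuzz's actual code; `pick_shaper` and the returned labels are hypothetical names, and the set of script tags is assumed to come from the font's layout tables.

```python
def pick_shaper(font_script_tags, run_script="bali"):
    """Toy model: use USE only when the font declares the run's script tag.

    font_script_tags: script tags found in the font's OpenType layout
    tables, e.g. {"DFLT", "bali"} (assumed input for this sketch).
    """
    if run_script in font_script_tags:
        return "use"      # script-specific tables present: USE shaper
    return "default"      # DFLT only: default shaper, no USE reordering


if __name__ == "__main__":
    print(pick_shaper({"DFLT", "bali"}))  # current font (DFLT + bali)
    print(pick_shaper({"DFLT"}))          # local DFLT-only build
```

Under this model the DFLT-only build never reaches USE, which is consistent with the missing left-vowel reordering seen in the harfbuzz screenshots.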

jungshik commented 8 years ago

Multiple vowels in a row

Test file: multi_dep_vowels.txt (line 1 has two vowels in a row; line 2 uses the NFC form)

  1. hb's USE (Noto Sans Balinese with both DFLT and non-DFLT/bali present): multi_dep_vowels old
  2. hb's dflt (Noto Sans Balinese with DFLT only): multi_dep_vowels new

With both DFLT and bali, hb's USE is used and it works as expected. With only DFLT, reordering is broken.

kmansourMT commented 8 years ago

Based on all the exchanges above, the sequences that appear to be functional errors in the font are summarized by screen shot 2015-12-04 at 16 50 26

Basically, the sequences that need further verification/correction consist of the pattern: base consonant + subjoined consonant + {vowels u1B38–1B3D}

jungshik commented 8 years ago

@kmansourMT : There are more issues than what you referred to in the previous comment.

What you wrote about above is line 3 in my comment ( https://github.com/googlei18n/noto-fonts/issues/572#issuecomment-158562355 ).

In addition, line 1a and 2 also have issues ( bug notofonts/noto-fonts#380 ). So does line 6.

KrasnayaPloshchad commented 8 years ago

HarfBuzz's Balinese support has been improved, and the improvement landed in 1.1.3.

kmansourMT commented 8 years ago

In the forthcoming version of Noto Balinese, we have corrected the problems previously encountered with the pattern "base consonant + subjoined consonant + {vowels u1B3C,1B3D}". The following demonstrates the changes that have been applied.

183c-subscripts-corrected

jungshik commented 8 years ago

@kmansourMT : When do you plan to deliver the aforementioned update? The latest I have in phase 2 (TTF) is from Sep 2015. ( https://github.com/googlei18n/noto-source/tree/master/src for Phase 3 does not have Balinese source, either)

Anyway, I guess you fixed these and related sequences.

U+1b13 U+1b44 U+1b13 U+1b3c 
U+1b13 U+1b44 U+1b13 U+1b3d

Does your update handle the following sequence as well?

U+1b13 U+1b44 U+1b13 U+1b38

jungshik commented 8 years ago

Oh... U+1b13 U+1b44 U+1b13 U+1b38 appears to have been ok even in Sep 2015 version.
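
For anyone who wants to reproduce these, the sequences written in U+XXXX notation can be turned into test input with a few lines of Python. This is only a sketch; `parse_codepoints` and `to_u_escaped` are hypothetical helper names, and the \u-escaped form assumes BMP code points, which holds for the Balinese block.

```python
def parse_codepoints(text):
    """Turn 'U+1b13 U+1b44 ...' into the corresponding string."""
    chars = []
    for token in text.split():
        if not token.upper().startswith("U+"):
            raise ValueError(f"not a U+XXXX token: {token!r}")
        chars.append(chr(int(token[2:], 16)))
    return "".join(chars)


def to_u_escaped(text):
    """Render a BMP-only string as \\uXXXX escapes."""
    return "".join(f"\\u{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    seq = parse_codepoints("U+1b13 U+1b44 U+1b13 U+1b38")
    print(to_u_escaped(seq))          # \u-escaped form
    print(len(seq.encode("utf-8")))   # size of the UTF-8 form in bytes
```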

kmansourMT commented 8 years ago

Susan W. is in charge of scheduling releases and updates.

jungshik commented 8 years ago

Summary of this bug so far:

@waksmonskiMT, when do you plan to deliver the update @kmansourMT mentioned that fixes both "line 3" issue and lines 1a/2 issue (also bug notofonts/noto-fonts#380)? With the font update, I can close this bug after verifying the fix.

jungshik commented 8 years ago

off-line conversation with @waksmonskiMT: we'll get @kmansourMT's fix in upcoming phase 3 delivery.

behdad commented 7 years ago

off-line conversation with @waksmonskiMT: we'll get @kmansourMT's fix in upcoming phase 3 delivery.

Any updates?!? It has been over a year.