notofonts / batak

Noto Batak
SIL Open Font License 1.1

Incorrect glyph for 'u' #7

Open bennylin opened 1 year ago

bennylin commented 1 year ago

Title

Incorrect glyph for 'u'. Writing this bug report on behalf of https://incubator.wikimedia.org/wiki/User_talk:Surung_Simanullang

Font

NotoSansBatak-Regular.zip (uploaded)

Where the font came from, and when

Site: I believe from https://fonts.google.com/noto/specimen/Noto+Sans+Batak, which is where I usually download Noto fonts, but I'm not 100% sure. Date: 2021-04-01 (according to the file date in my folder)

Font Version

  • Win -- 3.1, August 2, 2020

OS name and version

This is especially important if the font came pre-installed.

Application name and version

If the issue is observed using a specific app.

Issue

Summarize the issue briefly -- one paragraph preferred

  1. Write ᯖᯪᯀᯮᯰ ᯉᯪ ᯔᯉᯮᯂ᯲ in Noto Sans Batak (image 1), translit 'tiung ni manuk'
  2. The glyph for ᯀᯮ is incorrect: it is not displayed as ᯀ (1BC0) + ᯮ (1BEE)
  3. Observed results (see image 1)
  4. Expected results: should look like ᯮ at the bottom right (see image 2)
  5. Additional information

    Example from Pustaha Laklak (image 3) Add MS 15678, f. 10r https://www.bl.uk/manuscripts/Viewer.aspx?ref=add_ms_15678_f001r

Character data

Please include real character data to illustrate your issue-- Unicode codepoints are helpful. This makes it possible for developers who don't know the language or script to copy/paste the text to reproduce the issue.

  • ᯀ (1BC0) + ᯮ (1BEE) = ᯀᯮ , translit 'u'
  • which, according to Batak speaker Surung_Simanullang, only occurs as 'ung'
  • ᯀ (1BC0) + ᯮ (1BEE) + ᯰ (1BF0) = ᯀᯮᯰ, translit 'ung'
  • ᯖ (1BD6) + ᯪ (1BEA) + ᯀ (1BC0) + ᯮ (1BEE) + ᯰ (1BF0) = ᯖᯪᯀᯮᯰ in ᯖᯪᯀᯮᯰ ᯉᯪ ᯔᯉᯮᯂ᯲ translit 'tiung ni manuk'
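
For anyone reproducing this without a Batak keyboard, here is a minimal Python sketch that builds the first word ('tiung') from the code points listed above and prints each character with its code point. The names `TIUNG` and `codepoints` are illustrative only, and the character names come from whatever Unicode database ships with your Python build.

```python
import unicodedata

# First word of the test phrase ('tiung'), built from the code points
# listed in this report: 1BD6 1BEA 1BC0 1BEE 1BF0.
TIUNG = "\u1BD6\u1BEA\u1BC0\u1BEE\u1BF0"


def codepoints(text):
    """Return a (U+XXXX, character name) pair for each character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


if __name__ == "__main__":
    # Print one line per character so the sequence can be checked by eye.
    for cp, name in codepoints(TIUNG):
        print(cp, name)
```

Pasting the resulting string into an application set to Noto Sans Batak should show the same rendering as image 1.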

Screenshot

If possible, include a screenshot or an image illustrating the issue. Annotations are also helpful.

Image 1

Image 2

Image 3

simoncozens commented 1 year ago

I see what has happened. The Unicode proposal has this chart:

Screenshot 2023-07-13 at 20 47 12

Notice how "hu" and "bu" keep the zig-zag shape of the ᯮ - but a+u does not. This may be a mistake; I am not sure, but I imagine that Noto Batak may have been implemented according to the Unicode proposal, not according to the manuscript evidence. We would probably need a little more research to see whether the current form is ever used, or if it is a mistake.

r12a commented 1 year ago

Fwiw, other -u ligatures also invert the direction of the -u vowel strokes. Besides the a+u and Mandailing hu above, they include gu, and wu (see https://r12a.github.io/scripts/batk/btk.html#u_ligatures).

> We would probably need a little more research to see whether the current form is ever used, or if it is a mistake.

I'm no expert, but given the propensity for -u to ligate in various ways in Batak, but also in certain other scripts (eg. Tamil), it's not surprising to me that these ligatures may look slightly different.

bennylin commented 1 year ago

> Fwiw, other -u ligatures also invert the direction of the -u vowel strokes. Besides the a+u and Mandailing hu above, they include gu, and wu (see https://r12a.github.io/scripts/batk/btk.html#u_ligatures).
>
> > We would probably need a little more research to see whether the current form is ever used, or if it is a mistake.
>
> I'm no expert, but given the propensity for -u to ligate in various ways in Batak, but also in certain other scripts (eg. Tamil), it's not surprising to me that these ligatures may look slightly different.

The same goes for 'lu', where the -u should be at the bottom-right corner. I was going to submit a new bug report, but since the matter has been brought up here, I will post the examples here as well.

image

All of these were brought to light by Surung Simanullang, who is the expert on this. Hopefully some day he will be able to join our conversation here (you just need to create an account on GitHub).

Exhibit 1: ADD MS 15678 p 8

  • Text 1: ᯑᯮᯰᯎᯮᯒᯬᯉ᯲
  • Translit 1: dungguron
  • Text 2: ᯑᯬᯂᯬᯖ᯲ ᯇᯬᯎᯮ ᯉᯪ ᯀᯞᯔᯉ᯲
  • Translit 2: dohot pogu ni alaman

image image

Exhibit 2: ADD MS 15678 pp 13-14

  • Text 1: ᯘᯎᯮ
  • Translit 1: sagu (continued on the next page; the complete word reads 'sagusagu', or 'ᯘᯎᯮᯘᯎᯮ')
  • Text 2:

image image image

simoncozens commented 1 year ago

OK, I'm unclear about the resolution here. I can see that the manuscript forms tend to keep the direction of -u, whereas the Unicode proposal and code charts have a bit more flexibility in how -u gets ligated; we use the forms from Unicode. Could this be a unification issue? Do we need stylistic sets?