notofonts / mongolian

Noto Mongolian
SIL Open Font License 1.1

Bugs with displaying Todo Mongolian correctly in NotoSansMongolian #21

Closed todbichig closed 5 years ago

todbichig commented 6 years ago

Defect Report

Title

Noto Sans Mongolian: Three bugs related to displaying Todo bichig

Font

NotoSansMongolian-Regular.ttf

Font Version

I have used this font on Mac, Linux and Android and have had the same issues on all platforms.

Issue

There are three main bugs for properly displaying Todo script.

  1. The ᡅ (i) should have an extra 'tooth' when it follows a vowel (including itself). For example, the word ᠨᠠᡏᡅᡅᠨ (namiin) should be displayed as in 1b but currently displays as in 1a. [Images: 1a, 1b]

  2. When ᡇ (u) is written twice in a row, the second instance should not keep its 'crown'. See the words ᡋᠠᡅᡎᡇᡇᠯᡇᠨ (baiguulun) and ᠨᠠᡅᠷᡇᡇᠯᡇᠨ (nairuulun): 2a and 3a should look like 2b and 3b respectively. (Note that the first bug is also fixed in these pictures.) [Images: 2a, 2b, 3a, 3b]

  3. ᡎ (g) should keep its circle only if it is followed by ᠠ, ᡆ, or ᡇ. If it is followed by a consonant, it should instead carry the mark that it currently carries only in its final form. See ᡍᡄᠯᡄᡎᠰᡄᠨ (kelegsen): 4a is incorrect and 4b shows how it should appear. The change of mark can also be seen by writing a word like ᠴᡆᠷᡆᡎ (zorog). [Images: 4a, 4b]
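The three rules in the list above can be restated as a small checker. This is only a sketch of the behaviour as described in this report, not how any font or shaping engine implements it: the function name `expected_forms`, the vowel set, and the wording of the notes are all assumptions made for illustration, and anything outside the vowel set is simply treated as a consonant.

```python
# Sketch of the three reported rules. All names here are hypothetical and are
# not part of any font toolchain.
TODO_I = "\u1845"
TODO_U = "\u1847"
TODO_GA = "\u184E"

# Assumption: Mongolian A plus the Todo vowels U+1844..U+1849 count as vowels;
# every other character is treated as a consonant by this sketch.
VOWELS = {"\u1820"} | {chr(cp) for cp in range(0x1844, 0x184A)}

# Vowels after which the report says g keeps its circle: a, o, u.
GA_CIRCLE_FOLLOWERS = {"\u1820", "\u1846", TODO_U}


def expected_forms(word: str) -> list[str]:
    """Return notes on which letters should change shape, per the report."""
    notes = []
    for i, ch in enumerate(word):
        prev = word[i - 1] if i > 0 else None
        nxt = word[i + 1] if i + 1 < len(word) else None
        if ch == TODO_I and prev in VOWELS:
            notes.append(f"{i}: i takes an extra tooth")
        elif ch == TODO_U and prev == TODO_U:
            notes.append(f"{i}: u drops its crown")
        elif ch == TODO_GA:
            if nxt in GA_CIRCLE_FOLLOWERS:
                notes.append(f"{i}: g keeps its circle")
            elif nxt is None or nxt not in VOWELS:
                notes.append(f"{i}: g takes its final-form mark")
    return notes


# baiguulun and kelegsen, written as code point escapes
BAIGUULUN = "\u184B\u1820\u1845\u184E\u1847\u1847\u182F\u1847\u1828"
KELEGSEN = "\u184D\u1844\u182F\u1844\u184E\u1830\u1844\u1828"

print(expected_forms(BAIGUULUN))
print(expected_forms(KELEGSEN))
```

The report does not say what g should do before the remaining vowels, so the sketch emits no note in that case rather than guessing.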

Character data

These issues affect the display of the following Unicode characters: Mongolian letter Todo I (U+1845), Mongolian letter Todo U (U+1847), and Mongolian letter Todo Ga (U+184E).
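A quick way to check which code points a sample string contains is Python's standard `unicodedata` module. The helper name `describe` below is hypothetical; the names printed come from the Unicode data bundled with the Python build in use.

```python
import unicodedata


def describe(text: str) -> list[str]:
    """List each character of `text` as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# The letters i, u and g discussed in this report, as code point escapes.
for line in describe("\u1845\u1847\u184E"):
    print(line)
```

Pasting one of the sample words from the report into `describe` shows the exact sequence a shaping engine receives, which helps when comparing renderings across platforms.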

todbichig commented 6 years ago

There is actually a fourth issue as well.

The ᠨ (n) should drop its dot wherever a non-initial n is followed by a consonant. (It should also drop its dot at the end of the word, as it already does.)

I believe the font is currently set up so that n drops its dot when followed by a standard Mongolian consonant, but when followed by a Todo-specific consonant it retains its dot, so the text does not display correctly.
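The rule described here can be sketched as a predicate that treats standard Mongolian and Todo consonants alike. The function names and the code point ranges used for "consonant" are assumptions based on a reading of the Unicode Mongolian block, not something taken from the font sources. The second test string is a constructed sequence, not an attested word.

```python
MONGOLIAN_NA = "\u1828"


def is_consonant(ch: str) -> bool:
    """Assumed ranges: standard consonants U+1828..U+1842, Todo U+184A..U+185C."""
    cp = ord(ch)
    return 0x1828 <= cp <= 0x1842 or 0x184A <= cp <= 0x185C


def n_drops_dot(word: str, i: int) -> bool:
    """Whether the n at index i should lose its dot, per this comment."""
    if word[i] != MONGOLIAN_NA:
        raise ValueError("character at index i is not U+1828")
    if i == 0:
        return False  # the comment only covers non-initial n
    if i == len(word) - 1:
        return True  # word-final n already drops its dot
    return is_consonant(word[i + 1])


NAMIIN = "\u1828\u1820\u184F\u1845\u1845\u1828"
print(n_drops_dot(NAMIIN, 0), n_drops_dot(NAMIIN, 5))
# Constructed sequence: a + n + Todo da + a
print(n_drops_dot("\u1820\u1828\u1851\u1820", 1))
```

The point of the sketch is that the consonant test should not distinguish between the two ranges, which is where the comment suspects the font diverges.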

todbichig commented 6 years ago

The fifth bug concerns ᡃ (Mongolian letter Todo long vowel sign, U+1843). When it follows the vowels a, o, or ü (ᠠ︐ ᡆ︐ ᡈ), it should not change the ending. For a, it makes the normal a ending come a little later, and o and ü don't have endings.

The sixth bug is specific to the case where these three characters follow each other: ᡎ ᠯ ᠠ (g + l + anything), i.e. U+184E followed by U+182F followed by another letter. The first letter should change its circle as already documented above, but it shouldn't change form completely when another letter is added to the string.

kmansourMT commented 5 years ago

> The fifth bug is with regards to ᡃ (Mongolian letter Todo long vowel sign) (U+1843). It follows the vowels a, o, or ü (ᠠ︐ ᡆ︐ ᡈ) it should not change the ending. for a it makes the normal a ending come a little later and o and ü don't have endings.

@todbichig Please clarify your question with a text sample and visuals.

kmansourMT commented 5 years ago

> The sixth bug is related specific when these three characters follow each other ᡎ ᠯ ᠠ (g + l + anything) (U+184E followed by U+182F followed by another letter). The first letter should change its circle as already documented above but it shouldn't change form completely when adding another letter to the string.

@todbichig When it is in initial position in the specified context, does Todo Ga (u184E) just lose its circular mark, or does it also gain a "hook mark" like the medial form?

lianghai commented 5 years ago

@kmansourMT, @todbichig:

I do urge you to look into L2/19-130, "Towards a well-formed Mongolian specification that allows interoperable implementations" (https://www.unicode.org/L2/L2019/19130-mwg3-8-mong-spec-r.pdf), and contribute your understanding of the Todo writing system (as well as other writing systems that use the Mongolian script) to it. This document is an initial draft of a Unicode Technical Note requested by the UTC, and is the current platform for collaboration between experts for achieving an agreement on the Mongolian script’s encoding and rendering behavior.

> The fifth bug is with regards to ᡃ (Mongolian letter Todo long vowel sign) (U+1843). It follows the vowels a, o, or ü (ᠠ︐ ᡆ︐ ᡈ) it should not change the ending. for a it makes the normal a ending come a little later and o and ü don't have endings.

> Please clarify your question with a text sample and visuals.

The encoding and rendering of U+1843 MONGOLIAN LETTER TODO LONG VOWEL SIGN is problematic at the standardization level. It’s largely supposed to be an Mn (Nonspacing_Mark) instead of Lm (Modifier_Letter), but its first form (final) (note an Mn would not have positional forms) as currently defined in the names list also appears to be attested and might need a differentiated encoding.
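For reference, the general category currently assigned to U+1843 can be read from Python's standard `unicodedata` module. This is only a lookup sketch: it reports whatever the Unicode data bundled with the Python build says and does not settle the standardization question raised here.

```python
import unicodedata

LONG_VOWEL_SIGN = "\u1843"

name = unicodedata.name(LONG_VOWEL_SIGN)
category = unicodedata.category(LONG_VOWEL_SIGN)    # "Lm" = Modifier_Letter
combining = unicodedata.combining(LONG_VOWEL_SIGN)  # 0 for non-combining characters

print(name, category, combining)
```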

As far as @todbichig is concerned though, the string “ᠨᠠᡃ ᠨᡆᡃ ᠨᡈᡃ” should be rendered like what Menk Qagan Tig has in the screenshot below:

[Screenshot: https://user-images.githubusercontent.com/343259/62940709-bc50bd80-be06-11e9-8015-31b5eea58966.png]

> The sixth bug is related specific when these three characters follow each other ᡎ ᠯ ᠠ (g + l + anything) (U+184E followed by U+182F followed by another letter). The first letter should change its circle as already documented above but it shouldn't change form completely when adding another letter to the string.

> When it is in initial position in the specified context, does Todo Ga (u184E) just lose its circular mark, or does it also gain a "hook mark" like the medial form?

Note that for consonant letters that are affected by their syllabic roles, it’s important to differentiate the mechanism of syllabic forms (.stray, .onset, and .coda) from the mechanism of cursive joining forms. These two mechanisms are a pair of orthogonal dimensions. See Syllabic variations on page 14 of the aforementioned L2/19-130 document.

Also, U+184E MONGOLIAN LETTER TODO GA has a third orthogonal dimension: gender forms (.masculine and .feminine; the .feminine form is not affected by its syllabic role).

Altogether there are at least the following attested forms:

Here what @todbichig was talking about is which form a G.init.feminine/masculine.stray takes. It’ll be helpful if @todbichig can simply provide a screenshot or refer to the names list, but it will not lead to a complete shaping spec unless we can analyze this case coherently.

Note that this is not a random special case for the .coda form (what you guys loosely call “the hook mark form”), but is about a specific condition when the consonant is not within a typical C?V+C? syllabic structure. For example, for Hudum, the .stray forms appear to align with .feminine forms.
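The three dimensions described in this comment (cursive joining form, syllabic role, gender) can be pictured as a cross product. The sketch below only enumerates the nominal name space, following the `G.init.feminine/masculine.stray` naming pattern used above. It is not the list of attested forms, and the joining tags (`isol`, `init`, `medi`, `fina`) and the function name are assumptions.

```python
JOINING = ("isol", "init", "medi", "fina")  # assumed cursive joining tags
SYLLABIC = ("stray", "onset", "coda")       # syllabic roles named in the comment


def nominal_ga_forms(letter: str = "G") -> list[str]:
    """Enumerate nominal glyph names for a letter with all three dimensions."""
    names = []
    for joining in JOINING:
        # Masculine forms vary with the syllabic role.
        for syllabic in SYLLABIC:
            names.append(f"{letter}.{joining}.masculine.{syllabic}")
        # The feminine form is described as unaffected by the syllabic role,
        # so it gets a single name per joining form.
        names.append(f"{letter}.{joining}.feminine")
    return names


forms = nominal_ga_forms()
print(len(forms))
print(forms[:4])
```

Keeping the dimensions separate in the names makes it easier to see that the case under discussion is one cell of this grid, not a special case of the .coda form.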

lianghai commented 5 years ago

Btw, there’s a huge spreadsheet prepared by another Todo user, Mingzai: MWG/3-N15.

Despite the daunting enumeration, it’s neither complete nor coherent though. (It’s in my plan to help him simplify the representation and complete the content.) Therefore I do not recommend it to readers who do not already understand the Todo orthography to some extent.

kmansourMT commented 5 years ago

Many thanks!



marekjez86 commented 5 years ago

fixed in https://github.com/googlefonts/noto-fonts/tree/master/phaseIII_only/unhinted/ttf/NotoSansMongolian