notofonts / syloti-nagri

Noto Syloti Nagri
SIL Open Font License 1.1

Syloti conjuncts #1

Open Sagir8453 opened 6 years ago

Sagir8453 commented 6 years ago

Defect Report

Hi, Sylheti conjuncts aren't visible and can't be created.

Title

Sylheti conjunct consonants

Font

NotoSansSyloti-Regular.ttf.

Where the font came from, and when

Site: https://noto-website-2.storage.googleapis.com/pkgs/NotoSansSyloti-hinted.zip

Date: 2017-07-11

OS name and version

Android 6

Application name and version

Only works on OpenOffice.

Issue

Conjunct consonants don't form.

marekjez86 commented 6 years ago

@Sagir8453 : could you write some text that could serve as a test for this?

brawer commented 6 years ago

@Sagir8453, do you know anyone who could translate the Universal Declaration of Human Rights from Bangla, English, Assamese, or some other existing translation to the Sylheti language written in the Sylheti Nagari script? (Sylheti in other scripts, e.g. Bangla, would also be useful.) At Google, we use this text a lot for testing, and so do other companies. So, if you could contribute a translation of the Human Rights Declaration to Unicode, it would be very helpful.

Sagir8453 commented 6 years ago

@marekjez86, @brawer Universal Declaration of Human Rights: ꠢꠇꠟ ꠝꠣꠘꠥ ꠀꠎꠣꠖ ꠅꠁꠀ ꠢꠝꠣꠘ ꠁꠎ꠆ꠎꠔ ꠀꠞ ꠢꠇ ꠟꠁꠀ ꠙꠄꠖꠣ ꠅꠄ। ꠔꠣꠞꠣꠞ ꠛꠤꠛꠦꠇ ꠀꠞ ꠀꠇꠟ ꠀꠍꠦ। ꠄꠞ ꠟꠣꠉꠤ ꠢꠇꠟꠦ ꠄꠇꠎꠘꠦ ꠀꠞꠇꠎꠘꠞ ꠟꠉꠦ ꠛꠤꠞꠣꠖꠞꠤꠞ ꠝꠘꠥꠜꠣꠛ ꠟꠁꠀ ꠀꠌꠞꠘ ꠇꠞꠔꠦ ꠟꠣꠉꠦ।

This has only two conjuncts, so I'm adding the conjuncts which are present in the Noto Sans Sylheti font but not supported: ꠇ꠆ꠇ, ꠈ꠆ꠔ, ꠌ꠆ꠌ, ꠌ꠆ꠍ, ꠎ꠆ꠎ, ꠔ꠆ꠔ, ꠘ꠆ꠔ, ꠘ꠆ꠖ, ꠘ꠆ꠘ, ꠛ꠆ꠛ, ꠝ꠆ꠛ, ꠝ꠆ꠝ, ꠞ꠆ꠟ, ꠡ꠆ꠇ, ꠡ꠆ꠛ, ꠡ꠆ꠍ, ꠡ꠆ꠔ, ꠡ꠆ꠛ, ꠟ꠆ꠟ

brawer commented 6 years ago

Thank you! Would you be able to translate the entire text? It’s admittedly some work, but it would be quite helpful.

Sagir8453 commented 6 years ago

@brawer

That's the 1st article of the Universal Declaration of Human Rights. Or did you mean the pronunciations of the conjuncts?

brawer commented 6 years ago

The preamble and the other 29 articles. With those, the Syloti translation would be in the same state as those for all other languages. See https://github.com/unicode-org/udhr/issues/12 — @Sagir8453, if you could help with that, it would be much appreciated; you’re the first and only Syloti speaker who’s contacted us so far.

spsmith57 commented 6 years ago

I too noticed that conjuncts do not form. Looking at the font, the main issue is that the conjuncts are being treated in a similar way to Latin-style ligatures, where a sequence of two or more characters is ligated when they occur next to each other (e.g., 'f' followed by 'i' forms the 'fi' ligature). But Indic conjuncting is not the same as ligation, and OpenType provides the cjct feature for this purpose, separate from the ligature-related features (liga, dlig, etc.).

Consonant characters in Syloti Nagri incorporate an inherent vowel. Conjuncts represent a consonant cluster, i.e., the first consonant in the sequence has the inherent vowel removed. This is done by inserting the hasant/halant/virama (U+A806) after it. Thus, the sequence ko + ko (U+A807 U+A807) represents the two syllables 'koko' and should never ligate. To get the cluster 'kko', one would enter the three characters U+A807 U+A806 U+A807, as in the bottom line of the post by Sagir8453 on 11 Nov 2017. If the font contains a conjunct for that particular cluster it will be displayed; otherwise the hasant itself is displayed as a circumflex (preferably overlapping both characters) to indicate the absence of the inherent vowel.
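To make the encoding difference concrete, here is a minimal Python sketch that builds both sequences from the code points mentioned above. The helper names `cluster` and `code_points` are made up for illustration and are not part of any font tooling.

```python
# Code points taken from the comment above.
KO = "\uA807"       # SYLOTI NAGRI LETTER KO
HASANTA = "\uA806"  # SYLOTI NAGRI SIGN HASANTA


def cluster(first: str, second: str) -> str:
    """Encode a two-consonant cluster as C1 + hasanta + C2 (hypothetical helper)."""
    return first + HASANTA + second


def code_points(text: str) -> list[str]:
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


koko = KO + KO         # two syllables, 'koko': must not ligate
kko = cluster(KO, KO)  # one cluster, 'kko': a conjunct candidate

print(code_points(koko))
print(code_points(kko))
```

Only the second string contains the hasanta, so only it should ever be a candidate for a conjunct glyph.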

I would suggest that the lookups in the font be changed so that they are triggered by C1 Hasant C2 rather than simply C1 C2, and that they be associated with a lookup which is on by default (such as cjct, or possibly ccmp).
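As a rough illustration of the suggested trigger, the following Python sketch models a substitution pass that fires only on C1 Hasant C2. It is a toy model, not how a shaping engine or the font's actual GSUB table works; the glyph names and the two-entry conjunct table are hypothetical.

```python
HASANTA = 0xA806  # SYLOTI NAGRI SIGN HASANTA

# Hypothetical conjunct table keyed by (C1, C2); glyph names are invented.
CONJUNCTS = {
    (0xA807, 0xA807): "ko_ko.conjunct",  # KO + hasanta + KO
    (0xA818, 0xA814): "no_to.conjunct",  # NO + hasanta + TO
}


def shape(code_points):
    """Toy pass: replace C1 HASANTA C2 with a conjunct glyph name when one exists."""
    out = []
    i = 0
    while i < len(code_points):
        window = code_points[i:i + 3]
        key = (window[0], window[2]) if len(window) == 3 else None
        if key is not None and window[1] == HASANTA and key in CONJUNCTS:
            out.append(CONJUNCTS[key])
            i += 3
        else:
            # No rule matched: emit the character's own glyph unchanged.
            out.append(f"uni{code_points[i]:04X}")
            i += 1
    return out


print(shape([0xA807, 0xA807]))          # plain C1 C2: stays as two letters
print(shape([0xA807, 0xA806, 0xA807]))  # C1 Hasant C2: becomes one conjunct
print(shape([0xA808, 0xA806, 0xA814]))  # no conjunct in the table: hasanta stays visible
```

The third call shows the fallback described above: when the table has no entry for a cluster, the hasanta is left in the output so it can be drawn explicitly.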

An example of a Syloti Nagri font which displays the conjuncts is the Surma font, which can be found at https://github.com/syltrans/surma . The conjuncts in that font display correctly on Windows 10 in applications that use Microsoft's Universal Shaping Engine. It also works with SIL's Graphite renderer. Conjuncts do not display with HarfBuzz, which leads me to suspect that in addition to the issues noted above with Noto, HarfBuzz itself may need updating, similar to Microsoft's USE.

dscorbett commented 6 years ago

U+A806 SYLOTI NAGRI SIGN HASANTA currently has Indic_Syllabic_Category=Pure_Killer, meaning it is not meant to form conjuncts. The Unicode Standard is characteristically vague: although it describes Syloti Nagri’s atypical ligatures, it does not explain how to encode them. Cf. L2/05-130, which goes into more detail but may not be what Unicode finally decided on. Cf. also L2/17-418, which proposes a new model for Syloti Nagri conjuncts.

I suggest not making any changes till Unicode clarifies the encoding model.

LornaSIL commented 3 years ago

As of Unicode 13.0 there is a new character U+A82C SYLOTI NAGRI SIGN ALTERNATE HASANTA which needs adding to the font. The properties for U+A806 and U+A82C are below:

A806 ; Virama # Mn SYLOTI NAGRI SIGN HASANTA
A82C ; Pure_Killer # Mn SYLOTI NAGRI SIGN ALTERNATE HASANTA
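The two property entries quoted above follow a `code ; value # comment` layout. As a small sketch, assuming that layout and single code points only (no ranges), they could be read into a lookup table like this; `parse_properties` and `SAMPLE` are illustrative names, not an existing API.

```python
import re

# Matches "A806 ; Virama # Mn SYLOTI NAGRI SIGN HASANTA"; the comment is optional.
LINE = re.compile(r"^\s*([0-9A-Fa-f]{4,6})\s*;\s*(\w+)\s*(?:#\s*(.*))?$")


def parse_properties(text):
    """Map each code point to its property value; lines that do not match are skipped."""
    props = {}
    for raw in text.splitlines():
        match = LINE.match(raw)
        if match:
            props[int(match.group(1), 16)] = match.group(2)
    return props


# The two entries quoted in the comment above.
SAMPLE = """\
A806 ; Virama # Mn SYLOTI NAGRI SIGN HASANTA
A82C ; Pure_Killer # Mn SYLOTI NAGRI SIGN ALTERNATE HASANTA
"""

props = parse_properties(SAMPLE)
for cp, value in sorted(props.items()):
    print(f"U+{cp:04X} -> {value}")
```

With these values, U+A806 is the character expected to form conjuncts, while U+A82C suppresses the inherent vowel without forming one.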

Is it possible to update the font to support the new character to get the correct conjunct behavior? TUS 15.1 also discusses it: http://www.unicode.org/versions/latest/ch15.pdf

Is this sufficient information?

marekjez86 commented 3 years ago

@LornaSIL : thank you for the update... Yes, we'll deal with it, but I'm not certain when (I need to figure out how it fits with other updates)

LornaSIL commented 3 years ago

@marekjez86 Great! Unless you already have someone working on the UDHR translation, we can work on getting someone in the language community to do that. Is your preference that it be submitted to unicode.org or somewhere else?

simoncozens commented 1 year ago

I think this is working now apart from two conjuncts in @Sagir8453's list: ꠈ꠆ꠔ and ꠟ꠆ꠟ. I don't believe these are present in the font, so they would need drawing. The alternate hasanta U+A82C is supported in the font now.