Invalid GSUB rules in Noto Sans Javanese

GoogleCodeExporter commented 9 years ago

The GSUB table for Noto Sans Javanese contains several rules that
- assume character sequences that are invalid according to the Unicode Standard 
and to the OpenType specification for Javanese,
- assume glyph sequences that can't occur when GSUB rules are processed because 
of prior OpenType reordering, or
- assume glyph sequences that can't occur because of earlier GSUB rules.

Specification references:
http://www.unicode.org/versions/Unicode7.0.0/ch17.pdf
http://www.microsoft.com/typography/OpenTypeDev/javanese/intro.htm

1) Lookup 1 defines a ligature for the sequence uniA9B8 uniA9BF. U+A9B8 is a 
vowel, U+A9BF a medial consonant, and according to the specifications vowels 
always follow medial consonants in well-formed syllables. The ligature can 
therefore never be formed from well-formed text.

2) Lookup 2 defines an alternate glyph for the pasangan form of uniA9B0 if it 
precedes one of the vowels uniA9BA or uniA9BB. These sequences are valid at the 
character level, but cannot occur at the stage where GSUB rules are processed 
because a shaper implementing the OpenType Javanese specification reorders 
these vowels to the beginning of the syllable.

Note: Lookup 4 similarly defines ligatures for the pasangan forms of uniA9A5, 
uniA9A6, uniA9B1, and uniA9B2 if they precede one of the vowels uniA9BA or 
uniA9BB; however, unlike Lookup 2 it is used only for the DFLT writing system, 
when Javanese-specific reordering rules aren't applied.
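The effect described in item 2 can be illustrated with a toy model. This is only a sketch, not the shaper's actual algorithm: it assumes a syllable is a flat list of glyph names, that reordering simply moves uniA9BA and uniA9BB to the front, and that the pasangan glyph is called `uniA9B0.pasangan` (a hypothetical name, not taken from the font).

```python
# Vowels that a Javanese-aware shaper moves to the start of the syllable.
PREBASE_VOWELS = {"uniA9BA", "uniA9BB"}


def reorder_syllable(glyphs):
    """Toy reordering: pre-base vowels first, everything else in original order."""
    pre = [g for g in glyphs if g in PREBASE_VOWELS]
    rest = [g for g in glyphs if g not in PREBASE_VOWELS]
    return pre + rest


def rule_can_match(glyphs, first, second):
    """True if `first` is immediately followed by `second` somewhere in the run."""
    return any(a == first and b == second for a, b in zip(glyphs, glyphs[1:]))


# Hypothetical glyph run: base Ka, pasangan of uniA9B0, then the vowel uniA9BA.
syllable = ["uniA98F", "uniA9B0.pasangan", "uniA9BA"]

# Without reordering (as under DFLT) the pasangan + vowel context exists.
matches_without_reordering = rule_can_match(syllable, "uniA9B0.pasangan", "uniA9BA")

# With reordering (as under the Javanese script tag) the vowel has moved away.
reordered = reorder_syllable(syllable)
matches_after_reordering = rule_can_match(reordered, "uniA9B0.pasangan", "uniA9BA")
```

Under these assumptions the context only survives when no reordering takes place, which is why the same kind of rule is dead under the Javanese script tag (Lookup 2) but can still match under DFLT (Lookup 4).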

3) Lookup 3 defines a ligature to replace either uniA981 uniA9BC or uniA9BC 
uniA981. Only uniA9BC uniA981 is a valid sequence according to the 
specifications.
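The ordering constraints behind items 1 and 3 can be sketched as a small check. This is a minimal sketch under stated assumptions: the classification covers only the code points named in this issue, and the class ranks (medial consonant, then vowel, then sign) are a simplification of the syllable structure given in the specifications.

```python
# Partial classification: only the code points named in this issue.
MEDIAL, VOWEL, SIGN = 1, 2, 3
RANK = {
    0xA9BF: MEDIAL,  # medial consonant
    0xA9B8: VOWEL,
    0xA9BA: VOWEL,
    0xA9BB: VOWEL,
    0xA9BC: VOWEL,
    0xA981: SIGN,
}


def marks_in_order(code_points):
    """Return True if the classified marks appear as medial, then vowel, then sign.

    Only the relative order of classes is checked; code points outside RANK
    (such as the base consonant) are ignored, and repeated marks of the same
    class are not rejected.
    """
    ranks = [RANK[cp] for cp in code_points if cp in RANK]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


# Item 1: vowel before medial consonant is out of order.
item1_font_rule = marks_in_order([0xA98F, 0xA9B8, 0xA9BF])
item1_valid_text = marks_in_order([0xA98F, 0xA9BF, 0xA9B8])

# Item 3: only uniA9BC followed by uniA981 is in order.
item3_valid_text = marks_in_order([0xA98F, 0xA9BC, 0xA981])
item3_reversed = marks_in_order([0xA98F, 0xA981, 0xA9BC])
```

A check like this could be run over the input sequences of each ligature rule to flag rules whose inputs cannot come from well-formed text.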

4) Lookup 6 includes glyph00119, which is a ligature for uniA994 and uniA9B8, 
as context for uniA9B8. A sequence of two U+A9B8 is not valid according to the 
Unicode standard (although the OpenType specification strangely allows it).

5) Lookup 11 defines ligatures for the pasangan forms of uniA994, uniA997, 
uniA99B, uniA99D, uniA9AE followed by uniA9B8; however, Lookup 6 already maps 
uniA9B8 to a below-pasangan form when following these pasangan forms, so the 
ligature rule can't match uniA9B8.
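The interaction in item 5 comes down to lookup order, which a toy simulation can show. This is a sketch, not the font's actual rules: the glyph names `uniA994.pas` and `uniA9B8.below` and the ligature naming are hypothetical, and each lookup is modelled as one left-to-right pass over a list of glyph names.

```python
# Hypothetical names for the pasangan glyphs listed in item 5.
PASANGAN = {"uniA994.pas", "uniA997.pas", "uniA99B.pas", "uniA99D.pas", "uniA9AE.pas"}


def lookup6(glyphs):
    """Toy Lookup 6: uniA9B8 becomes a below-pasangan form after a pasangan glyph."""
    out = []
    for i, g in enumerate(glyphs):
        if g == "uniA9B8" and i > 0 and glyphs[i - 1] in PASANGAN:
            out.append("uniA9B8.below")
        else:
            out.append(g)
    return out


def ligature_lookup(glyphs, second):
    """Toy Lookup 11: merge a pasangan glyph and `second` into one ligature glyph."""
    out = []
    i = 0
    while i < len(glyphs):
        if i + 1 < len(glyphs) and glyphs[i] in PASANGAN and glyphs[i + 1] == second:
            out.append(glyphs[i] + "_" + second)
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


run = ["uniA98F", "uniA994.pas", "uniA9B8"]
after6 = lookup6(run)

# Rule written against plain uniA9B8: nothing left to match after Lookup 6.
as_reported = ligature_lookup(after6, "uniA9B8")

# Rule written against the below-pasangan variant: the ligature forms.
with_variant = ligature_lookup(after6, "uniA9B8.below")
```

In this model the ligature can only form if its input names the glyph that the earlier lookup actually leaves in the run.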

Noto Sans Javanese 1.01

Original issue reported on code.google.com by googled...@lindenbergsoftware.com on 20 Jan 2015 at 11:38

GoogleCodeExporter commented 9 years ago

Original comment by stua...@google.com on 28 Feb 2015 at 12:31

GoogleCodeExporter commented 9 years ago

Original comment by roozbeh@google.com on 2 Apr 2015 at 4:02

behdad commented 9 years ago

cc @fontguy @roozbehp

bennylin commented 9 years ago

Please mark this as Script-Javanese. Thanks!

jungshik commented 9 years ago

@kmansourMT

kmansourMT commented 8 years ago

Sub-issue 1: uniA9B8 (vowel u), uniA9BF (medial Ra)

Whenever a consonant is followed by a medial Ra and a vowel, the Ra must precede the vowel. The previous version of the font allowed the sequence of ‘vowel u’ followed by ‘medial Ra’, as in Ka (A98F) + vowel u + medial Ra:

[image: ka u medra]

In the new version of the font, this same sequence is not accommodated, and results in: [image: ka u medra-new]

kmansourMT commented 8 years ago

Sub-issue 2: Under the Javanese-script tag, the order of sequences of consonant + e/ai-vowel should not be reversed because a Javanese-capable shaper should have already carried out this step. The following demonstrates the incorrect behavior; the two glyphs on the left show Ssa+vowel_e as an input sequence, while the resulting output is shown after the vertical bar:

[image: ssa vwl_e]

In the new version of the font, this substitution code has been removed.

kmansourMT commented 8 years ago

Sub-issue 3: In a sequence of marks, the vowel Ae (A9BC) should precede secondary signs such as the anusvara (A981); however, the previous version of the font also accepted the reverse sequence, as is evident in the following:

[image: ae-dot-bad]

In the new version of the font, the results are: [image: ae-d0t-good]

kmansourMT commented 8 years ago

Sub-issue 4: Lookup 6 includes the sequence of uniA994 and uniA9B8 (u vowel) as a preceding context for uniA9B8 (u vowel). A sequence of two U+A9B8 (u vowel) is not valid according to the Unicode standard.

The code has now been corrected by removing uniA9B8 (u vowel) from the preceding context.

kmansourMT commented 8 years ago

Sub-issue 5: Lookup 11 defines ligatures for the pasangan forms of uniA994, uniA997, uniA99B, uniA99D, uniA9AE followed by uniA9B8 (u vowel); however, Lookup 6 already maps uniA9B8 to a below-pasangan form when following these pasangan forms, so the ligature rule can't match uniA9B8.

The code has now been corrected by referring to the pasangan variant of uniA9B8 (u vowel).

marekjez86 commented 8 years ago

@jungshik is there a way to ask the original submitter to verify this (assuming we could give them/him/her the font file)? ... or is it up to us to verify it?

dougfelt commented 8 years ago

@NorbertLindenberg looks like the original submitter. Not sure if he gets github email.

dougfelt commented 8 years ago

I created some samples, one for each issue, and rendered them with hb-view. hb-view does script analysis, and I did not try running any default rules. The rendered samples seem ok to me.

issue 1: [image: javanese_ka_suku_cakra_hbv]
issue 2: [image: javanese_sa-mahaprana_taling_hbv]
issue 3: [image: javanese_pepet_order_hbv]
issue 4 (note that the second, illegal vowel just disappears completely in the example on the right): [image: javanese_uu_hbv]
issue 5: [image: javanese_passangan_u_hbv]

dougfelt commented 8 years ago

I'm going to close this. Perhaps Norbert will reopen it if he finds issues.

jungshik commented 8 years ago

@dougfelt said he'd close this one, but apparently forgot to press the right button. I'm closing.

notofonts / javanese

Invalid GSUB rules in Noto Sans Javanese #14