notofonts / nototools

Noto fonts support tools and scripts plus web site generation
Apache License 2.0

Invalid use of apostrophe character in sample texts. #146

Closed lemzwerg closed 8 years ago

lemzwerg commented 8 years ago

The following files use an apostrophe instead of the correct, language-specific character, as far as I can tell.

el-Grek-monoton_udhr.txt: 'O -> Ό (U+038C, GREEK CAPITAL LETTER OMICRON WITH TONOS)
el-Grek_udhr.txt: 'O -> Ό
mai-Deva_udhr.txt: ' -> ऺ (U+093A, DEVANAGARI VOWEL SIGN OE)

dougfelt commented 8 years ago

I suspect the Maithili text is correct in using an apostrophe, but U+02BC (MODIFIER LETTER APOSTROPHE) might be preferable. I'm trying to find out. http://www.unicode.org/L2/L2008/08197--bodo-dogri-maithili.pdf calls this character 'Latin Apostrophe', so it's a bit ambiguous.

I will make the suggested change to the Greek, and wait to hear if anyone can provide a definitive answer about the Maithili.
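
For illustration only, here is a minimal Python sketch of the Greek substitution described above. It assumes the sample text contains U+0027 followed by a capital omicron, either Greek (U+039F) or a Latin look-alike (U+004F); the function name `fix_greek_omicron_tonos` is hypothetical and not part of nototools.

```python
import re

# Assumption: the faulty sequence is U+0027 followed by a capital omicron,
# Greek (U+039F) or Latin (U+004F). The actual files may differ.
_APOSTROPHE_OMICRON = re.compile("'[\u039f\u004f]")


def fix_greek_omicron_tonos(text: str) -> str:
    """Replace apostrophe + capital omicron with U+038C (hypothetical helper)."""
    return _APOSTROPHE_OMICRON.sub("\u038c", text)
```

A blanket replacement like this would also rewrite a genuine opening single quote before a word starting with omicron, so the result should be reviewed by hand.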

moyogo commented 8 years ago

In that proposal “Latin apostrophe” should be “modifier letter apostrophe” since it’s not a punctuation sign but a modifier letter (or diacritic in the broad sense of the word). See TDIL-CDAC’s Script grammar for Maithili language for another reference attesting the use of U+02BC in Maithili.

I wonder why the apostrophe only occurs in the word आ' in http://www.unicode.org/udhr/d/udhr_mai.txt. This doesn’t seem to correspond with the uses described in the two documents mentioned.

dougfelt commented 8 years ago

I'll keep this open for a few days or until the apostrophe issue is resolved.

Thanks for the link to the CDAC PDF, @moyogo. The other paper also referred to these uses, and I also wondered about this since the use in the UDHR text didn't seem to match.

dougfelt commented 8 years ago

According to Anshuman Pandey, U+02BC is preferred to the apostrophe in the Maithili sample, so I will update it.
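
As a rough sketch of that update (not the actual change made in the repository), the following Python snippet replaces U+0027 with U+02BC only when it directly follows a character in the Devanagari block (U+0900 to U+097F). The function name `use_modifier_apostrophe` is hypothetical, and restricting the match to Devanagari context is an assumption made here to leave any Latin-script apostrophes alone.

```python
import re

# Assumption: only an apostrophe immediately after a Devanagari character
# (U+0900..U+097F) should become U+02BC MODIFIER LETTER APOSTROPHE.
_DEVANAGARI_THEN_APOSTROPHE = re.compile("(?<=[\u0900-\u097f])'")


def use_modifier_apostrophe(text: str) -> str:
    """Swap U+0027 for U+02BC after Devanagari characters (hypothetical helper)."""
    return _DEVANAGARI_THEN_APOSTROPHE.sub("\u02bc", text)


if __name__ == "__main__":
    sample = "\u0906'"  # the word from the UDHR sample, with an ASCII apostrophe
    fixed = use_modifier_apostrophe(sample)
    # Show the code points so the change is visible regardless of font.
    print([f"U+{ord(ch):04X}" for ch in fixed])
```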