notofonts / latin-greek-cyrillic

Noto Latin, Greek, Cyrillic
SIL Open Font License 1.1

Kpelle problems - Full support for latin characters? #169

Closed kmuncie closed 1 year ago

kmuncie commented 7 years ago

It is hard to work out exactly what is included in Noto Sans Regular. Is there a comprehensive list of every Unicode character covered by each font and variant?

I am trying to understand why the website implies that Kpelle is supported but it seems to have major holes in character support. To focus on one character as an example, though it is missing many:

This is the 'latin small letter open e' character (U+025B, ɛ), but it does not display when using Noto Sans Regular. I see this problem both on the Google Fonts example page and on the website where I have embedded the released Noto Sans Regular font. Am I missing something obvious here, or is support actually missing for large chunks of Latin characters? Thank you for your time.


marekjez86 commented 7 years ago

@kmuncie: We are working on extending the Noto fonts to cover Unicode up to Unicode 9. For the work in progress, see the fonts at https://github.com/googlei18n/noto-fonts/tree/master/unhinted

I ran the following as input to fontdiff:

<html lang="en">
    <p><span style="font-weight:700">“03B5 ε”</span></p>
    <p><span style="font-weight:700">“025B ɛ”</span></p>
    <p><span style="font-weight:700">“Kpɛlɛɛ”</span></p>
</html>

and generated the following PDFs. They look fine to me. Please advise whether the output looks OK to you with these new fonts (see NotoSans-.ttf, NotoSerif-.ttf, etc. in the link above). I am worried about your statement "... though it is missing many ...". If the fonts don't handle something, please reopen this issue. Thank you for bringing this to our attention.

en-808-NotoSans-Black.pdf en-808-NotoSans-Italic.pdf en-808-NotoSansDisplay-Black.pdf en-808-NotoSansDisplay-Italic.pdf en-808-NotoSerif-Black.pdf en-808-NotoSerif-Italic.pdf en-808-NotoSerifDisplay-Black.pdf en-808-NotoSerifDisplay-Italic.pdf
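For anyone who wants to build a similar specimen for other characters, here is a minimal Python sketch that produces HTML in the same shape as the test input quoted above. The function name `specimen_html` and its parameters are hypothetical, and the sketch only writes the markup; how the file is then passed to fontdiff is not covered here.

```python
def specimen_html(samples, lang="en", weight=700):
    """Build a small HTML specimen shaped like the test input quoted above.

    Single characters are labelled with their four-digit hex codepoint;
    longer strings (words) are emitted as they are.
    """
    lines = [f'<html lang="{lang}">']
    for text in samples:
        if len(text) == 1:
            label = f"{ord(text):04X} {text}"
        else:
            label = text
        lines.append(
            f'    <p><span style="font-weight:{weight}">“{label}”</span></p>'
        )
    lines.append("</html>")
    return "\n".join(lines)


# Greek epsilon, Latin open e, and the sample word from this thread.
html = specimen_html(["ε", "ɛ", "Kpɛlɛɛ"])
print(html)
```

Putting the Greek epsilon next to the Latin open e makes it easy to see whether a renderer is silently substituting one for the other.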

dougfelt commented 7 years ago

Um, well, sigh.

Unfortunately, the fonts that Google web fonts provides and the fonts that we provide are not the same. There are a number of issues. One is that their versions of the Noto fonts (those they do provide) are rather old. Another is that the font subsets they serve don't cover the full character repertoire of some of the fonts, most notably NotoSans. Either or both of those issues might be contributing to this problem. You might file a bug against them; bugs from external users of the API can sometimes get more attention.

Noto Sans contains the following codepoints, fyi: 0000 000d 0020-007e 00a0-036f 0374-0375 037a-037e 0384-038a 038c 038e-03a1 03a3-03ce 03d0-0527 1d00-1dca 1dfe-1e9b 1e9e 1ea0-1ef9 1f00-1f15 1f18-1f1d 1f20-1f45 1f48-1f4d 1f50-1f57 1f59 1f5b 1f5d 1f5f-1f7d 1f80-1fb4 1fb6-1fc4 1fc6-1fd3 1fd6-1fdb 1fdd-1fef 1ff2-1ff4 1ff6-1ffe 2000-200f 2012-2022 2026 202a-2030 2032-2034 2039-203a 203c 203e 2044 205e 206a-2070 2074-2079 207f 2090-2094 20a0-20a9 20ab-20b5 20b9-20ba 20f0 2105 2113 2116-2117 2122 2126 212e 214d-214e 2153-2154 215b-215e 2184 2190-2195 21a8 2202 2206 220f 2211-2212 2215 2219-221a 221e-221f 2229 222b 2248 2260-2261 2264-2265 2302 2310 2320-2321 2500 2502 250c 2510 2514 2518 251c 2524 252c 2534 253c 2550-256c 2580 2584 2588 258c 2590-2593 25a0-25a1 25aa-25ac 25b2 25ba 25bc 25c4 25ca-25cc 25cf 25d8-25d9 25e6 263a-263c 2640 2642 2660 2663 2665-2666 266a-266b 266f 29f5 2c60-2c6d 2c71-2c77 2e17 a717-a721 a788-a78c fb01-fb04 fe20-fe23 feff fffc-fffd
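A list in this form is easy to check mechanically. The following is a minimal Python sketch, not part of any Noto tooling: the helper names `parse_ranges` and `is_covered` are hypothetical, and `EXCERPT` holds only the first few entries of the list above, so a "not covered" answer for a higher codepoint would need the full list.

```python
def parse_ranges(spec):
    """Parse space-separated hex codepoints and lo-hi ranges into (lo, hi) tuples."""
    ranges = []
    for token in spec.split():
        lo, _, hi = token.partition("-")
        # A single codepoint such as "038c" is treated as a one-element range.
        ranges.append((int(lo, 16), int(hi or lo, 16)))
    return ranges


def is_covered(ch, ranges):
    """Return True if the character's codepoint falls inside any range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


# Excerpt only: the first entries of the codepoint list quoted above.
EXCERPT = (
    "0000 000d 0020-007e 00a0-036f 0374-0375 037a-037e "
    "0384-038a 038c 038e-03a1 03a3-03ce 03d0-0527"
)

ranges = parse_ranges(EXCERPT)
missing = [ch for ch in "Kpɛlɛɛ" if not is_covered(ch, ranges)]
print(missing)
```

U+025B (ɛ) falls inside the 00a0-036f entry, so the sample word from this thread is covered by the list as far as codepoints go.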

Our descriptions of supported languages are somewhat, shall we say, "aspirational". If the font handles the CLDR exemplar character set (for Kpelle this is 'seed' data and not fully vetted) and the sample text (in most cases, when present, it is a translation of Article 1 of the Universal Declaration of Human Rights from the data set at unicode.org/udhr), we claim that we support the language. In actuality, though, our support for most of these languages has not been vetted by a native speaker or expert, so particular glyph variants used by a language, or special forms used for various sequences, might not be present.

In those cases we rely on bug reports from knowledgeable users. I expect Kpelle has not been vetted. If you do decide to install NotoSans and try rendering some text locally with it (not using web fonts, which we know will fail based on the unicode-ranges Google web fonts supports for Noto Sans), we'd like to hear about anything that fails to render properly.
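To illustrate why a served subset can fail even when the underlying font has the glyph: a browser only uses a web font face for characters inside that face's CSS `unicode-range`. The sketch below is a rough Python illustration under stated assumptions. The descriptor in `HYPOTHETICAL_RANGE` is made up for the example and is not the range Google web fonts actually serves, and the helper names are hypothetical.

```python
def parse_unicode_range(descriptor):
    """Parse a CSS unicode-range value such as 'U+0000-00FF, U+0131' into (lo, hi) tuples."""
    ranges = []
    for part in descriptor.split(","):
        token = part.strip()
        if not token:
            continue
        body = token[2:]  # drop the leading "U+"
        if "?" in body:
            # Wildcard form, e.g. U+4?? means U+400 through U+4FF.
            lo, hi = body.replace("?", "0"), body.replace("?", "F")
        else:
            lo, _, hi = body.partition("-")
            hi = hi or lo
        ranges.append((int(lo, 16), int(hi, 16)))
    return ranges


def unserved(text, descriptor):
    """List the characters of text that fall outside the given unicode-range."""
    ranges = parse_unicode_range(descriptor)
    return [ch for ch in text if not any(lo <= ord(ch) <= hi for lo, hi in ranges)]


# Assumption: an illustrative descriptor, NOT the real served range.
HYPOTHETICAL_RANGE = "U+0000-00FF, U+0131, U+0152-0153"

print(unserved("Kpɛlɛɛ", HYPOTHETICAL_RANGE))
```

With a descriptor like this one, every ɛ is left to a fallback font, which matches the symptom reported at the top of this thread.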

dougfelt commented 7 years ago

@jungshik fyi.

andjc commented 7 years ago

Fonts served by GFD will not support Kpelle, since key characters fall outside the repertoire it supports.

Also, are we discussing Kpelle as used in Liberia or in Guinea? Like many other West African languages used in more than one country, Kpelle has a different orthography in each country, and as a consequence a different alphabet.

Looking at codepoints to determine whether a language is supported is insufficient; it is necessary to look at the glyphs needed for that language. The glyph for Eng is a case in point. The appropriate glyph in Noto Sans is probably Eng.alt1 rather than Eng. So it becomes a question of how that glyph is exposed in the font. From memory, the Noto Sans family uses the aalt feature to expose the glyph.

This is useful for applications such as InDesign that provide glyph pickers. But not suitable for web browsers.

Same issue in notofonts/latin-greek-cyrillic#162.

But it is likely that other characters may also require alternative glyphs.

brawer commented 7 years ago

Regarding glyph alternates (different letterforms than today), it would be great to know what needs to be changed. Can you tell specifics? As in: Which glyphs look wrong, and how should they look instead? Thanks in advance, your help with finding this difficult-to-research information would be much appreciated.

Regarding the covered characters (missing codepoints), I’m confused by this bug. The current Noto fonts cover Unicode 6, and Unicode 9 is work in progress; so what’s missing? According to the data in Unicode CLDR, the following codepoints are needed for Kpelle, and these are all covered by Noto. (The digraphs in {braces} don’t matter for font rendering, so we can ignore them here).

moyogo commented 7 years ago

@brawer With African languages, the general assumption is that the n-form Ŋ is the preferred form, as it is the form used in several reference documents. The N-form Ŋ is the preferred form in Sami languages. There are also American languages and Australian languages that use the letter, the former group seem to prefer the n-form and the latter group the N-form. Some of these general assumptions might be wrong (it may not be true for every language of the mentioned categories) or overstated (some users may not care — I know publishers who don’t care in one country while publishers in the same language in another country do —, or historically the nuance was blurry) but in general experts are happier if followed. There are more languages (and more speakers) that seem to prefer the N-form, so it may make more sense to have that as the default form. There can also be preferences for a n-form on the baseline instead of with a descender or n-form on the baseline with raised left stem — however these might better match font style than locale preference. It is also better if Ɲ matches Ŋ with n-form in those languages that prefer n-form Ŋ.

For Kpelle Liberia or other Liberian languages (Dan and Gio, but not in Côte d’Ivoire), Ɓ has also been documented with a Ƃ-form (see http://www.unicode.org/wg2/docs/n3481.pdf). This was also the historical form in the 1928 Africa Alphabet.

moyogo commented 7 years ago

For Guinean languages, it may make more sense to have g and ɠ have the same basic form.

andjc commented 7 years ago

@brawer As @moyogo indicates, there is a range of characters used in Africa that differ from the reference glyphs in the Unicode chart. It is necessary to look at each language and determine its typographic norms.

brawer commented 7 years ago

Here’s a PDF file with the Universal Declaration of Human Rights in Guinea Kpelle, rendered with the current draft for the next version of Noto Sans. The PDF contains the same text in normal casing, in small caps, and in uppercase. Can you check each of these three, and tell whether the letterforms are correct for Guinea Kpelle, and if not, how exactly things should look different?

From this bug, it sounds like for Guinea Kpelle we should change the shape of Ŋ from the default Eng to the glyph Eng.alt1, but it might also be Eng.alt3 instead; does anyone know? See https://github.com/googlei18n/noto-fonts/issues/911 for pictures.

I’d love to make a similar rendering for Liberia Kpelle, but Unicode currently doesn’t have a translation. If you know anyone who could make such a translation, it would be much appreciated (just attach it to this bug and I’ll get it added to the Unicode collection).

andjc commented 7 years ago

@brawer, unlikely to be Eng.alt3, more likely to be Eng.alt1 or the Eng variant not supported by Noto Sans, see comment in notofonts/latin-greek-cyrillic#161. Will check sources I have access to.

I need to double check which Ɲ glyph should be used, ie N-form or n-form.

brawer commented 7 years ago

Smallcaps looks very uneven; see https://github.com/googlei18n/noto-fonts/issues/913.

andjc commented 7 years ago

@Brawer, I suspect what is happening is that it is a mix of small caps and minuscules ... it looks like the extended latin characters were not converted to small caps. But I may be wrong in that assumption.

andjc commented 7 years ago

@Brawer, looking back through the thread, @moyogo discusses the preferred variants for Ŋ and Ɲ above.

moyogo commented 7 years ago

For n-form Ɲ, see for example Paamanta Demmba Abubakaar, ''SIDA : kelu-cuudi!'', Bamako : Éditions Le Figuier, 2002, http://dlir.aiys.org/ALMA/alma_ebooks/pulaar_013.pdf

page 4: [screenshot]

andjc commented 7 years ago

And in http://www.dlir.org/docs/alma_ebooks/Pulaar_015.pdf, page 18.

[screenshot]

There are possibly two variants of the n-form, as there are with Eng, although they may just be stylistic choices.

marekjez86 commented 7 years ago

Here are PDF files with the Universal Declaration of Human Rights in Guinea Kpelle, rendered with the current draft for the next version of Noto {Sans,SansMono,SansDisplay,Serif,SerifDisplay} {Regular,Italic} -- all possible Noto fonts. More languages are at https://github.com/googlei18n/noto-fonts-alpha/tree/master/udhr-test/basic-width-weights

gkp_udhr_Sans_Italic.pdf gkp_udhr_Sans_Regular.pdf gkp_udhr_SansDisplay_Italic.pdf gkp_udhr_SansDisplay_Regular.pdf gkp_udhr_SansMono_Regular.pdf gkp_udhr_Serif_Italic.pdf gkp_udhr_Serif_Regular.pdf gkp_udhr_SerifDisplay_Italic.pdf gkp_udhr_SerifDisplay_Regular.pdf

simoncozens commented 1 year ago

To summarise where we are with this issue, to the best of my understanding:

If that's correct, I don't think there's anything unique about this issue and we can track the other issues linked above.