Open vv-monsalve opened 4 weeks ago
@moyogo, could you please take a look at this list and help us determine whether any of them should be listed in the Latin African glyphs? E.g. Mhook, rfishhook.
Hi Viviana, you are asking very hard questions :)
Your list features a mix of things, and I cannot easily know which languages use which glyphs. It spans characters present in Latin Extended A, which we suggest supporting almost fully; Pinyin; characters that we identified as used in African languages back in February this year (we submitted all of the documentation back then and later on directly to you); and a few glyphs I am personally not familiar with.
I will try to reply to your specific questions and add any notes we can. And of course I am happy to remove anything you do not want in the font.
Under the strategy of using a single source file to create different script families, a subsetting process is essential. This requires us to specify which elements will be included in each family.
I began this process by utilizing our Glyphset definitions, as they outline our font requirements. However, I noticed that many glyphs were omitted when compared with the previous fonts.
The purpose of this review is to gain a clear understanding of what is in the source/font and the reasons behind those choices. This way, we can make informed and strategic decisions about what to include in the final output fonts.
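The comparison described above (glyphs present in the font but not required by any Glyphset) can be sketched as a set difference. This is a minimal illustration only: `unassigned_glyphs` is a hypothetical helper, the glyph names come from this thread, and the sample Glyphset contents are placeholders rather than the real definitions.

```python
def unassigned_glyphs(font_glyphs, glyphsets):
    """Return glyph names in the font that no glyphset definition covers."""
    # Union of every glyph name required by any glyphset.
    covered = set().union(*glyphsets.values())
    return sorted(set(font_glyphs) - covered)


# Illustrative data only, not the actual Glyphset definitions.
font_glyphs = ["Gcircumflex", "gcircumflex", "mhook", "commaturnedmod", "longs"]
glyphsets = {
    "GF Latin Beyond": {"Gcircumflex", "gcircumflex"},
    "GF Phonetics IPAStandard": {"mhook"},
}

leftovers = unassigned_glyphs(font_glyphs, glyphsets)
print(leftovers)
```

The leftover names are the ones that need a decision: assign them to a Glyphset, keep them for another reason, or remove them.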
I did not manage to find answers for all of them, but I think you will find some of the information you need here: Lista viviana.numbers.zip
Thank you for the information. I've added it to the table above as well as additional info from our languages database.
> characters present in Latin Extended A
We have detached from the Unicode blocks and moved to a language-centered approach to define our Glyphsets.
> Pinyin, characters that we identified as used in African languages back in February this year (we submitted all of the documentation back then and later on directly to you)
When we discussed revising the Glyphsets, I mentioned that the Cyrillic Plus set needed to be reviewed and redefined. Therefore, the information you provided would be taken into account for that task (which we did). However, the Latin African glyph set is considered defined after @moyogo's extensive work over the past year and early this year. Nevertheless, I'm also asking him to review the list and provide feedback.
I double-checked some of these since the language data added to gflanguages was much smaller than what I would have liked to cover, so there are likely gaps for less common cases.
In short, I’d recommend keeping:
Here are long comments with short comments in the table at the end.
While /Abreve/abreve is used in Tuareg languages (Tamashek [tmh] and Tamasheq [taq_Latn] or Tawallammat Tamajaq [ttq_Latn]), the other vowels with breve /Ebreve/ebreve/Ibreve/ibreve/Obreve/obreve/Ubreve/ubreve are not used in current African orthographies. They have been used in really old African orthographies but those are out of scope. /Ubreve/ubreve is used in Esperanto. /Abreve/abreve/Ebreve/ebreve/Ibreve/ibreve/Obreve/obreve/Ubreve/ubreve are used in Jarai in Vietnam.
/firsttonechinese may be used to illustrate the use of /macroncomb, for example "ˉ is used to indicate middle tone on vowels".
/lowlinecomb may be used instead of /macronbelowcomb in particular when practical orthographies use underlining, like in Gabon.
/commaturnedmod is used in Hawaiian and other Austronesian languages like Tongan, Wallisian and sometimes in Tahitian. It can also be used in Uzbek. Transliteration systems for Semitic languages use it as well, for example the transliteration of Arabic or Ge’ez (Gəʻəz). For Hawaiian, it occurs in toponyms in English in Hawaii, for example on Apple Maps or Google Maps, and in several documents. It may be used in Gawwada or other languages of the Dullay language cluster in Ethiopia (alternatively ʕ may be used), but it’s not clear if that is in a practical orthography or only in linguistic works; further investigation is needed. Design-wise, it’s a turned /apostrophemod, so it’s low-hanging fruit.
/Ezhcaron/ezhcaron is used in Skolt Sami mostly in Finland. It is used in Laz, for example in the orthography used in school manuals in Turkey.
/Gcircumflex/gcircumflex is used in Esperanto. It is also used in Aleut in the USA. In Guinea, Ĝ ĝ is sometimes used as an ad hoc substitute for Ɠ ɠ /Ghook/ghook (the same is true for b̂, d̂, ĥ and ɓ, ɗ, ɦ, or others).
/hmod is used in several North American orthographies, or phonetic transcriptions like IPA.
/uniA7AE/Ismall is used in one of the Koulango orthographies in Côte d’Ivoire.
Ĵ ĵ /Jcircumflex/jcircumflex is used in Esperanto.
ĸ /kgreenlandic was used in Kalaallisut (Greenlandic) in the old orthography and was replaced by q in the 1973 orthography. It’s not clear whether Inuttitut still uses it, as K is used instead in practice and in documents using the 1980 orthography; however, the Labrador Inuttitut Heritage Bible uses ᴋ /Ksmall in the digitized version.
Ɬ ɬ /Lbelt/lbelt is used in several North American orthographies, sometimes as an alternate of /Lslash/lslash or Ƚ ƚ /Lbar/lbar or Ɫ ɫ /Lmiddletilde/lmiddletilde.
/Ecedillabreve/ecedillabreve is used in the ISO 259 transliteration of Hebrew and may appear in library catalogs or academic works. ISO 259 also uses /abreve/obreve; other transliteration systems of Hebrew may use those and other vowels with breves.
Ɱ ɱ /Mhook/mhook are not used in African orthographies as far as we know. ɱ is used in various phonetic transcriptions including IPA; Ɱ is used in some Americanist phonetic transcription systems for a voiceless equivalent. Until we find attestations of use, it can be removed and kept for a phonetic glyphset.
/Otildedieresis/otildedieresis isn’t used in orthographies as far as I know. It seems to be used in some academic phonetic transcriptions. If tilde and dieresis are used together in orthographies, they’d likely be in a different order, ö̃, instead.
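To make the point about mark order concrete, here is a small sketch using Python's standard `unicodedata` module. It only illustrates Unicode normalization behavior and says nothing about the font sources; the helper name `to_nfc` is made up for this example.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Compose a sequence into its NFC form."""
    return unicodedata.normalize("NFC", text)


# o + COMBINING TILDE + COMBINING DIAERESIS (the order encoded by /otildedieresis)
tilde_first = "o\u0303\u0308"
# o + COMBINING DIAERESIS + COMBINING TILDE (the order expected in orthographies)
dieresis_first = "o\u0308\u0303"

# Tilde-first composes to the single code point U+1E4F.
composed_tilde_first = to_nfc(tilde_first)
# Dieresis-first has no precomposed form: it stays as U+00F6 plus a combining tilde.
composed_dieresis_first = to_nfc(dieresis_first)

print([hex(ord(c)) for c in composed_tilde_first])
print([hex(ord(c)) for c in composed_dieresis_first])
```

Both marks sit above the base letter, so normalization keeps their order and the two sequences are not equivalent. Text typed dieresis-first would therefore never map to /otildedieresis and needs /odieresis plus /tildecomb instead.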
ɾ /rfishhook is used in the IPA, not in orthographies.
/Rmacronbelow/rmacronbelow is used in Pitjantjatjara in Australia and in Mitla Zapotec in Mexico, and in some transliteration systems of Tamil.
/Udieresiscaron/udieresiscaron and /Udieresisgrave/udieresisgrave are used in Hanyu Pinyin transcription of Chinese, and can be used in Southern Tutchone in Canada. /Udieresismacron/udieresismacron was used in Southern Tutchone but is now written with ü̂ /udieresis/circumflexcomb.
/verticallinebelowcomb is an alternate to /dotbelowcomb. Some Yoruba works use that shape for the dotted letters, for example. Having a stylistic set that substitutes the /dotbelowcomb (after decomposing composite glyphs) by the /verticallinebelowcomb would be interesting, but it’s not clear if that’s easily accessible to users, who could just use the character directly if they want to display that shape.
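As an illustration of the "just use the character" route, the sketch below performs the same decompose-then-substitute step at the text level rather than in a font feature. It assumes the usual Unicode mappings (U+0323 COMBINING DOT BELOW for /dotbelowcomb, U+0329 COMBINING VERTICAL LINE BELOW for /verticallinebelowcomb); `use_vertical_line_below` is a hypothetical helper, not something that exists in the project.

```python
import unicodedata

DOT_BELOW = "\u0323"            # COMBINING DOT BELOW
VERTICAL_LINE_BELOW = "\u0329"  # COMBINING VERTICAL LINE BELOW


def use_vertical_line_below(text: str) -> str:
    """Decompose precomposed letters, then swap dot below for vertical line below."""
    # NFD splits e.g. U+1EB9 (e with dot below) into "e" + U+0323.
    decomposed = unicodedata.normalize("NFD", text)
    # The result is left in decomposed form.
    return decomposed.replace(DOT_BELOW, VERTICAL_LINE_BELOW)


converted = use_vertical_line_below("\u1eb9")        # e with dot below
with_tone = use_vertical_line_below("\u1ecd\u0301")  # o with dot below + acute
print([hex(ord(c)) for c in converted])
print([hex(ord(c)) for c in with_tone])
```

A stylistic set would do the equivalent substitution on glyphs inside the font, which is why the composite glyphs need to be decomposed first.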
/yturned is an IPA symbol. It is not used in orthographies. It may appear in text in Heiltsuk or Kwakwala (both North American languages) as a substitute for λ /lambda or the Unicode 16.0 /lambda-latin. A lambda was also used in the 1982 African Reference Alphabet proposed by Michael Mann and David Dalby, but there’s no evidence it has been used in orthographies with the turned y shape or with the lambda shape.
/Zcircumflex/zcircumflex: The 1966 Niger national alphabet and the 1978 African Reference Alphabet have a z with right mid hook for Tamasheq; it was replaced by ẓ in the 1999 Niger national alphabet. In some documents hooked letters are substituted with letters with circumflex, so it’s possible ẑ was used. Otherwise it is used in Chilcotin (Canada) or in the ISO 9 transliteration of Cyrillic.
/Zmacronbelow/zmacronbelow is used in Tahltan (Canada) and Yatee Zapotec (Mexico), or some transliteration systems like for Persian or Hebrew.
Glyph Name | Comment |
---|---|
acutecomb_macroncomb | Can be removed. |
breve-cy | |
brevecomb-cy | |
bulletoperator | Can be removed. |
commaaboverightcomb | Can be removed. |
commaturnedmod | Recommended to keep. Used in Hawaiian, Tongan, Samoan. Transformed composite of /apostrophemod (or /quoteright, but better rounder and slightly larger). |
dieresistonoscomb | Greek, may be the same as dieresiscomb_acutecomb in some fonts |
divisionslash | |
Ebreve | Can be removed. Old orthographies, transliterations, phonetic transcripts or Jarai (Vietnam) are out of scope. |
ebreve | "" |
Ecedillabreve | ISO 259 transliteration of Hebrew |
ecedillabreve | "" |
Ezhcaron | Used in Skolt Sami (Finland) and Laz (Turkey) |
ezhcaron | "" |
firsttonechinese | May be used to illustrate use of /macroncomb. Simple composite of /macroncomb or duplicate of /macron. |
Gcircumflex | Esperanto. Aleut (USA). Ad hoc substitution for Ɠ ɠ in Guinea. |
gcircumflex | |
glottalstopreversed | Can be removed. North American languages out of current scope. |
graphemejoinercomb | Can be removed. |
gravecomb_macroncomb | Can be removed. |
hmod | Can be removed. |
Ibreve | Can be removed. Old orthographies, transliterations, phonetic transcripts or Jarai (Vietnam) are out of scope. |
ibreve | "" |
Ismall | Used in one of the Koulango orthographies in Côte d’Ivoire, along with capital /uniA7AE. |
Jcircumflex | Esperanto |
jcircumflex | "" |
kgreenlandic | Old Kalaallisut orthography, maybe still in Inuttitut (both North American) |
lameddageshholam-hb | |
lamedholam-hb | |
Lbelt | North American orthographies |
lbelt | "" |
Ldot | Legacy encoding |
ldot | "" |
longs | Historical |
lowlinecomb | Recommended to keep. May be used instead of /macronbelowcomb or for orthographies that used underline on typewriters. Some orthographies in Gabon might use it as such. |
macroncomb_acutecomb | Can be removed. |
macroncomb_gravecomb | Can be removed. |
Mhook | Can be removed. Phonetic symbol. |
mhook | Can be removed. Phonetic symbol. |
Obreve | Can be removed. See Ebreve |
obreve | "" |
Otildedieresis | Can be removed. May be used in some phonetic transcriptions. |
otildedieresis | "" |
rfishhook | Can be removed. IPA Symbol. |
Rmacronbelow | Pitjantjatjara (Australia), Mitla Zapotec (Mexico), or transliteration of Tamil. |
rmacronbelow | "" |
Ubreve | Can be removed. See Ebreve |
ubreve | "" |
Udieresiscaron | Southern Tutchone (Canada), Hanyu Pinyin |
udieresiscaron | "" |
Udieresisgrave | Southern Tutchone (Canada), Hanyu Pinyin |
udieresisgrave | "" |
Udieresismacron | Southern Tutchone (Canada), Hanyu Pinyin |
udieresismacron | "" |
verticallinebelowcomb | Variant of /dotbelowcomb |
wordjoiner | Likely not used in Latin script. |
yturned | Can be removed. IPA. |
Zcircumflex | Can be removed. Chilcotin (Canada) and ISO 9 transliteration. |
zcircumflex | "" |
zerowidthspace | Likely not used in Latin script. |
Zmacronbelow | Tahltan (Canada) and Yatee Zapotec (Mexico), or transliterations. |
zmacronbelow | "" |
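
One way to turn the table above into a removal list is sketched below. It assumes that a `""` comment is a ditto mark repeating the comment of the row above, and that removable glyphs are exactly those whose comment starts with "Can be removed"; both are readings of the table, not confirmed rules. The function names are hypothetical.

```python
def resolve_comments(rows):
    """Expand ditto marks ('""') so every glyph carries an explicit comment."""
    resolved = {}
    previous = ""
    for glyph, comment in rows:
        if comment == '""':
            comment = previous  # assumption: '""' repeats the row above
        resolved[glyph] = comment
        previous = comment
    return resolved


def removable(resolved):
    """List glyphs whose resolved comment starts with 'Can be removed'."""
    return [g for g, c in resolved.items() if c.startswith("Can be removed")]


# A few rows copied from the table above.
rows = [
    ("Ebreve", "Can be removed. Old orthographies, transliterations, "
               "phonetic transcripts or Jarai (Vietnam) are out of scope."),
    ("ebreve", '""'),
    ("Ecedillabreve", "ISO 259 transliteration of Hebrew"),
    ("ecedillabreve", '""'),
    ("Mhook", "Can be removed. Phonetic symbol."),
    ("Obreve", "Can be removed. See Ebreve"),
    ("obreve", '""'),
]

to_remove = removable(resolve_comments(rows))
print(to_remove)
```

Rows with an empty comment stay out of the removal list, which matches the fact that several glyphs are still awaiting a decision.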
This is great!
Please let me double check the following: zerowidthspace wordjoiner graphemejoinercomb ...as well as the two Hebrew glyphs
For the rest, @vv-monsalve please confirm and I will remove glyphs following the above instructions.
@josescaglione As part of the subsetting job for the per-script families, I've identified some glyphs included in the font that are not part of the required Glyphsets or are not used in the font. Please review them and indicate in which language or context they are used or if they can be ignored/deleted.
Glyph Name | Language code | Language | Glyphset | Notes |
---|---|---|---|---|
acutecomb_macroncomb | | | | |
breve-cy | | | | brevecomb-cy, could only one be used? |
brevecomb-cy | | | | |
bulletoperator | | | | |
commaaboverightcomb | | | | |
commaturnedmod | | | | |
dieresistonoscomb | | | | |
divisionslash | | | | |
Ebreve | | | GF TransLatin Pinyin | |
ebreve | | | GF TransLatin Pinyin | |
Ecedillabreve | | | | |
ecedillabreve | | | | |
Ezhcaron | | | | |
ezhcaron | | | | |
firsttonechinese | | | | |
Gcircumflex | eo_Latn | Esperanto | GF Latin Beyond | |
gcircumflex | eo_Latn | Esperanto | GF Latin Beyond | |
glottalstopreversed | nuk_Latn | Nuuchahnulth | GF_Latin Beyond for the Americas | |
graphemejoinercomb | | | | |
gravecomb_macroncomb | | | | |
hmod | zap_Latn | Zapotec | GF Phonetics IPAStandard | |
Ibreve | | | GF TransLatin Pinyin | |
ibreve | | | GF TransLatin Pinyin | |
Ismall | | | GF Latin African | present in prod names but not in nice names list |
Jcircumflex | eo_Latn | Esperanto | GF Latin Beyond | |
jcircumflex | eo_Latn | Esperanto | GF Latin Beyond | |
kgreenlandic | kl_Latn | Kalaallisut | GF Latin Beyond | |
lameddageshholam-hb | | | | |
lamedholam-hb | | | | |
Lbelt | hur_Latn | Halkomelem | | not part of any glyphset |
lbelt | hur_Latn | Halkomelem | GF Latin Beyond | |
Ldot | | | | periodcentered.loclCAT.case |
ldot | | | | periodcentered.loclCAT |
longs | | | | |
lowlinecomb | | | | |
macroncomb_acutecomb | | | | |
macroncomb_gravecomb | | | | |
Mhook | | | | |
mhook | | | GF Phonetics IPAStandard | |
Obreve | | | GF TransLatin Pinyin | |
obreve | | | GF TransLatin Pinyin | |
Otildedieresis | | | | |
otildedieresis | | | | |
rfishhook | | | GF Phonetics IPAStandard | |
Rmacronbelow | | | | |
rmacronbelow | | | | |
Ubreve | | | GF TransLatin Pinyin | |
ubreve | | | GF TransLatin Pinyin | |
Udieresiscaron | zap_latn | Zapotec, aux | GF TransLatin Pinyin | |
udieresiscaron | zap_latn | Zapotec, aux | GF TransLatin Pinyin | |
Udieresisgrave | zap_latn | Zapotec, aux | GF TransLatin Pinyin | |
udieresisgrave | zap_latn | Zapotec, aux | GF TransLatin Pinyin | |
Udieresismacron | | | GF TransLatin Pinyin | |
udieresismacron | | | GF TransLatin Pinyin | |
verticallinebelowcomb | | | | |
wordjoiner | | | | |
yturned | | | GF Phonetics IPAStandard | |
Zcircumflex | | | GF Latin Beyond | |
zcircumflex | | | GF Latin Beyond | |
zerowidthspace | | | | |
Zmacronbelow | | | GF Latin Beyond | |
zmacronbelow | | | GF Latin Beyond | |