adobe-fonts / source-sans

Sans serif font family for user interface environments
https://adobe-fonts.github.io/source-sans
SIL Open Font License 1.1

Researching a list of languages supported by Source Sans font and its spin-offs and looking for the missing glyphs #225

Open NikitaGamer64 opened 2 years ago

NikitaGamer64 commented 2 years ago

Can you please provide a full list of the languages supported at this time? Of course, spin-offs such as the Source Han Sans series, which supports Chinese, Japanese and Korean, are also included in my question. I need this list to know which missing or buggy glyphs to open pull requests for.

pauldhunt commented 2 years ago

@NikitaGamer64 I’m afraid you will have to audit the fonts yourself to find if they support the languages that you need. I’m not sure it’s even possible for me to tell you every language the fonts support as any set of languages audited for will only be a subset of all the possible languages that can be covered. If you (or anyone else) would like to do this work and contribute your findings back to this project, I am open to including that information as a part of this project.

frankrolf commented 2 years ago

To add to Paul’s comment – the “list of languages supported” depends a lot on the user’s interpretation of what “support” constitutes. Source Sans fully supports Adobe’s AL1 through AL4 character sets. More information about those (including the language support of each character set) is here: https://github.com/adobe-type-tools/adobe-latin-charsets

For curiosity’s sake, I compared Source Sans’ charset support to AL5 (the largest Latin character set we specify) – Source Sans is short of AL5 by 285 code points:

U+0187 U+0188 U+0189 U+0191 U+0198 U+0199 U+019C U+019D U+019F U+01A4 U+01A5 U+01A9 U+01AC U+01AD U+01AE U+01B2 U+01B3 U+01B4 U+01B5 U+01B6 U+01B8 U+01B9 U+01C4 U+01C5 U+01C6 U+01C7 U+01C8 U+01C9 U+01CA U+01CB U+01CC U+01DE U+01DF U+01E0 U+01E1 U+01E8 U+01E9 U+01EC U+01ED U+01F1 U+01F2 U+01F3 U+0202 U+0203 U+0206 U+0207 U+020A U+020B U+020E U+020F U+0210 U+0211 U+0212 U+0213 U+0214 U+0215 U+0216 U+0217 U+0228 U+0229 U+022A U+022B U+022C U+022D U+022E U+022F U+0230 U+0231 U+0232 U+0233 U+024A U+024B U+024C U+024D U+024E U+024F U+027C U+0286 U+0293 U+0296 U+0297 U+029A U+02A0 U+02A9 U+02AA U+02AB U+02AC U+02AD U+02AE U+02AF U+02B5 U+02B6 U+02BA U+02C2 U+02C3 U+02C4 U+02C5 U+02CD U+02CE U+02CF U+02DF U+02E5 U+02E6 U+02E7 U+02E8 U+02E9 U+02EA U+02EB U+02EC U+02EE U+02EF U+02F0 U+02F1 U+02F2 U+02F3 U+02F4 U+02F5 U+02F6 U+02F7 U+02F8 U+02F9 U+02FA U+02FB U+02FC U+02FD U+02FE U+02FF U+030E U+0314 U+0316 U+0317 U+0321 U+0322 U+032B U+032D U+0333 U+0335 U+0336 U+033E U+033F U+0340 U+0341 U+0346 U+0347 U+0348 U+0349 U+034A U+034B U+034C U+034D U+034E U+0350 U+0351 U+0352 U+0353 U+0354 U+0355 U+0356 U+0359 U+035A U+035B U+035D U+0360 U+0362 U+0363 U+0364 U+0365 U+0366 U+0367 U+0368 U+0369 U+036A U+036B U+036C U+036D U+036E U+036F U+1D05 U+1D5A U+1D5D U+1D5E U+1D60 U+1D61 U+1D7C U+1D7D U+1D9E U+1DA8 U+1DC4 U+1DC5 U+1DC6 U+1DC7 U+1E00 U+1E01 U+1E04 U+1E05 U+1E08 U+1E09 U+1E12 U+1E13 U+1E14 U+1E15 U+1E18 U+1E19 U+1E1A U+1E1B U+1E1C U+1E1D U+1E2C U+1E2D U+1E2E U+1E2F U+1E30 U+1E31 U+1E3C U+1E3D U+1E4A U+1E4B U+1E4C U+1E4D U+1E4E U+1E4F U+1E50 U+1E51 U+1E54 U+1E55 U+1E64 U+1E65 U+1E68 U+1E69 U+1E70 U+1E71 U+1E74 U+1E75 U+1E76 U+1E77 U+1E78 U+1E79 U+1E7A U+1E7B U+1E7C U+1E7D U+1E86 U+1E87 U+1E88 U+1E89 U+1E8A U+1E8B U+1E8C U+1E8D U+1E99 U+200C U+200D U+200E U+200F U+201F U+20A8 U+20AA U+20AD U+20B0 U+20B3 U+266C U+267E U+2C63 U+2C64 U+2C6D U+2C6E U+2C6F U+2C70 U+2C72 U+2C73 U+A78A U+A78B U+A78C U+A78D U+A78E U+A7AA U+A7AB U+A7AC U+A7B0 U+A7B1 U+A7B2 U+A7B4 U+A7B6 U+A7B7 U+FFFD

To round up, here are two language-support testing tools which may or may not give you different results:

Underware’s Latin Plus: https://underware.nl/latin_plus/
Rosetta’s Hyperglot: https://hyperglot.rosettatype.com
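A comparison like the AL5 one above comes down to a set difference between the code points a font maps and the code points a character set requires. The following is a minimal sketch of that idea, not the method actually used in this thread. The helper names `font_codepoints` and `missing_codepoints` are made up for illustration, and reading the font assumes the third-party fontTools package is installed.

```python
from typing import Iterable, List, Set


def font_codepoints(font_path: str) -> Set[int]:
    """Return the Unicode code points mapped by a font file.

    Assumes the third-party fontTools package is available; it is imported
    lazily so the comparison helper below works without it.
    """
    from fontTools.ttLib import TTFont

    font = TTFont(font_path)
    try:
        # getBestCmap() returns a {code point: glyph name} dict, or None
        # if the font has no Unicode cmap subtable.
        return set(font.getBestCmap() or {})
    finally:
        font.close()


def missing_codepoints(supported: Iterable[int], required: Iterable[int]) -> List[str]:
    """List the required code points absent from `supported`, as U+XXXX strings."""
    missing = set(required) - set(supported)
    return [f"U+{cp:04X}" for cp in sorted(missing)]


if __name__ == "__main__":
    # Toy data: a "font" that only covers A, checked against three code points.
    supported = {0x0041}
    required = {0x0041, 0x019D, 0x1E12}
    print(" ".join(missing_codepoints(supported, required)))
```

With a real font file, `missing_codepoints(font_codepoints(path), required)` would produce a list in the same format as the one above, given a `required` set built from whichever character-set definition is being audited.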

NikitaGamer64 commented 2 years ago

Let me look up all possible languages and compile the list. I may also suggest additions for any missing glyphs.

NikitaGamer64 commented 2 years ago

Is it possible to add an enhancement label to this issue without removing the question label? I ask because I found some symbols that need to be added: U+037F Ϳ U+03F3 ϳ

NikitaGamer64 commented 2 years ago

I wonder where I can find Source Sans or similar fonts for more scripts like Arabic, Devanagari and more

NikitaGamer64 commented 2 years ago

Found more missing glyphs for Venda: U+1E12 Ḓ U+1E13 ḓ U+1E3C Ḽ U+1E3D ḽ U+1E4A Ṋ U+1E4B ṋ U+1E70 Ṱ U+1E71 ṱ U+032D ̭
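Reports in the "U+XXXX character" form used above are easy to mistype, so it can help to check each pair against the Unicode database before filing it. The sketch below is one possible way to do that with only the Python standard library. The function name `parse_report` and the report format it accepts are assumptions based on the comments in this thread.

```python
import re
import unicodedata
from typing import List, Tuple

# Matches "U+1E12 Ḓ": a hex code point, whitespace, then the character shown.
PAIR = re.compile(r"U\+([0-9A-Fa-f]{4,6})\s+(\S+)")


def parse_report(text: str) -> List[Tuple[str, str, bool]]:
    """Return (code point label, Unicode name, shown character matches) per pair."""
    rows = []
    for hex_value, shown in PAIR.findall(text):
        cp = int(hex_value, 16)
        name = unicodedata.name(chr(cp), "<unnamed>")
        rows.append((f"U+{cp:04X}", name, chr(cp) == shown))
    return rows


if __name__ == "__main__":
    # Two of the Venda code points from the comment above, written with
    # escapes so the sample does not depend on file encoding.
    report = "U+1E12 \u1E12 U+032D \u032D"
    for label, name, ok in parse_report(report):
        print(label, name, "ok" if ok else "MISMATCH")
```

A `MISMATCH` row means the character pasted next to the code point is not the one that code point encodes, which usually points to a typo in the hex value.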

NikitaGamer64 commented 2 years ago

U+019D Ɲ U+024C Ɍ U+024D ɍ are missing as well

frankrolf commented 2 years ago

What is your goal? Suggestions for glyph additions will immediately have more traction if there’s a solid reason for their addition.

NikitaGamer64 commented 2 years ago

> What is your goal? Suggestions for glyph additions will immediately have more traction if there’s a solid reason for their addition.

I want to make sure that more languages will be supported by this font in the next version

pauldhunt commented 2 years ago

@NikitaGamer64 Thanks for your concern. I’m hoping that the next version of Source Sans will address adding more characters for African languages, but this is something that I have not looked at in a comprehensive manner yet. So in addition to Venda specifically, I'll be doing my best to cover all the languages of Africa that use Latin orthographies, but it will be a little while before I get around to that.

moyogo commented 2 years ago

U+019D Ɲ should have the same behaviour as U+014A Ŋ, meaning the default form is based on a scaled-up lowercase or based on uppercase N.

NikitaGamer64 commented 2 years ago

> @NikitaGamer64 Thanks for your concern. I’m hoping that the next version of Source Sans will address adding more characters for African languages, but this is something that I have not looked at in a comprehensive manner yet. So in addition to Venda specifically, I'll be doing my best to cover all the languages of Africa that use Latin orthographies, but it will be a little while before I get around to that.

Actually, I’d also like all the languages of Russia that use Cyrillic orthographies covered as well

JimEBlevins commented 3 weeks ago

> To round up, here are two language-support testing tools which may or may not give you different results:

As suggested by @frankrolf, I used Rosetta's Hyperglot on the latest regular variable-font release:

  1. Latin script: 399 languages (3.1 billion speakers) of 496 languages

Abron Acheron Achinese Acholi Achuar-Shiwiar Adangme Afar Afrikaans Aghem Aguaruna Ahtna Akoose Alekano Aleut Alutiiq Amahuaca Amarakaeri Amis Anaang, Andaandi Dongolawi, Angas Anufo Anuta Arabela Aragonese Arbëreshë Albanian Asháninka Ashéninka Perené Asturian Atayal Awa-Cuaiquer Awing Baatonum Bafia Balante-Ganja Balinese, Balkan Romani, Baoulé Bari Basque, Batak Dairi, Batak Karo, Batak Mandailing, Batak Simalungun, Batak Toba, Bemba (Zambia), Bena (Tanzania), Biali Bikol Bini Bislama, Boko (Benin), Bora, Borana-Arsi-Guji Oromo, Bosnian Breton Buginese Bushi Candoshi-Shapra Caquinte Caribbean Hindustani Cashibo-Cacataibo Cashinahua Catalan Cebuano, Central Alaskan Yupik, Central Atlas Tamazight, Central Aymara, Central Kurdish, Cerma Chachi Chamorro Chavacano Chayahuita Chickasaw Chiga Chiltepec Chinantec Chokwe Chuukese Cimbrian Cofán Comox, Cook Islands Māori, Cornish Corsican Creek Crimean Tatar Croatian Czech Dagbani Danish Dehu Dimli Dinka Duala Dutch, Eastern Arrernte, Eastern Oromo, Efik English Ewondo Fanti Faroese Fijian Filipino Finnish Foodo French Friulian Ga Gagauz Galician Ganda Garifuna German, Gheg Albanian, Gilbertese Gonja Gooniyandi Gourmanchéma, Guadeloupean Creole French, Guinea Kpelle, Gusii, Gwichʼin, Haitian Halkomelem Hani Hawaiian Hiligaynon Hopi Huastec Hungarian Hän Ibibio Icelandic Idoma Igbo Iloko, Inari Sami, Indonesian Irish, Istro Romanian, Italian, Ixcatlán Mazatec, Jamaican Creole English, Japanese Javanese, Jola-Fonyi, K'iche' Kabuverdianu Kabyle Kaingang Kako, Kala Lagaw Ya, Kalaallisut Kalenjin, Kamba (Kenya), Kaonde Kaqchikel, Kara-Kalpak Karelian Kashubian Kekchí Kenzi, Mattokki Khasi Khoekhoe Kikuyu Kimbundu Kinyarwanda Kirmanjki, Kituba (DRC), Kom (Cameroon), Kongo Konzo Koonzime Krio, Kven Finnish, Kwak’wala Kölsch Ladin Ladino Lakota Lamnso' Langi Latgalian Lingala Lithuanian Lombard, Low German, Lower Sorbian, Lozi, Luba-Lulua, Lukpa, Lule Sami, Luo (Kenya and Tanzania), Luxembourgish, Macedo-Romanian, Madurese 
Makonde Malagasy Malaysian Maltese Mandinka Mandjak Mankanya Manx Maore Comorian Maori Mapudungun Marshallese Masai Matsés, Mauritian Creole, Megleno Romanian, Mende (Sierra Leone), Meriam Mir, Meru Metlatónoc Mixtec, Mezquital Otomi, Mi'kmaq Minangkabau Mirandese Mizo Mohawk Montagnais Montenegrin Munsee, Murrinh-Patha, Murui Huitoto, Muslim Tat, Mwani Mískito, Naga Pidgin, Navajo Ndonga Neapolitan, Ngazidja Comorian, Ngiemboon, Nigerian Fulfulde, Niuean Nobiin Nomatsiguenga, North Azerbaijani, North Marquesan, North Ndebele, Northeastern Dinka, Northern Kissi, Northern Kurdish, Northern Qiandong Miao, Northern Sami, Northern Uzbek, Norwegian Nuer Nuuchahnulth Nyamwezi Nyanja Nyankole Nyemba Nzima Occitan, Ojitlán Chinantec, Omaha-Ponca, Orma Oroqen Otuho Palauan Pampanga, Papantla Totonac, Papiamento, Paraguayan Guaraní, Pedi Picard, Pichis Ashéninka, Piemontese, Pijin, Pintupi-Luritja, Pipil, Pite Sami, Pohnpeian Polish, Pontic Greek, Portuguese Potawatomi Prussian Purepecha Páez Quechua Romanian Romansh Rotokas Rundi Samoan Sango, Sangu (Tanzania), Saramaccan Sardinian Scots, Scottish Gaelic, Secoya Sena Serbian Seri, Seselwa Creole French, Sharanahua Shawnee Shilluk Shipibo-Conibo Shona Shuar Sicilian Silesian Siona Slovak Slovenian Soga Somali Soninke, South Azerbaijani, South Marquesan, South Ndebele, Southern Aymara, Southern Dagaare, Southern Qiandong Miao, Southern Sami, Southern Samo, Southern Sotho, Spanish, Sranan Tongo, Standard Estonian, Standard Latvian, Standard Malay, Sukuma Sundanese Swahili Swedish, Swiss German, Tachelhit Tagalog Tahitian Talysh Tawallammat Tamajaq, Tedim Chin, Tetum, Tetun Dili, Thompson Ticuna Timne Tlingit Toba, Tok Pisin, Tokelau, Tonga (Tonga Islands), Tonga (Zambia), Tosk Albanian, Tsafiki Tsakhur Tumbuka Turkish Turkmen Tuvalu Twi Tzeltal Tzotzil, Uab Meto, Umbundu, Ume Sami, Upper Guinea Crioulo, Upper Sorbian Urarina, Venetian Veps Vietnamese, Vlax Romani, Võro Wallisian Walloon Walser, Waray (Philippines), Warlpiri 
Wasa Wayuu Welsh, West Central Oromo, West-Central Limba, Western Abnaki, Western Frisian, Wiradjuri, Wolof Xhosa Yagua Yanesha' Yao Yom Yoruba Yucateco Zapotec Zulu Zuni Záparo

  2. Cyrillic script: 59 languages (258 million speakers) of 86

Abaza Adyghe Aghul Archi Avaric Bashkir Belarusian Bezhta Budukh Bulgarian Chamalal Chechen, Chinese Buriat, Chuvash, Crimean Tatar, Dargwa Dido Dungan Erzya, Evenki Halh Mongolian, Ingush, Judeo-Tat, Kabardian Kalmyk, Karachay-Balkar, Karata Kazakh Ket Khinalugh Kirghiz, Komi-Permyak, Koryak Kumyk Lak Lezghian Macedonian Mansi Moksha, Mongolian Buriat, Montenegrin, Muslim Tat, Nanai Nganasan, Nogai Ossetian, Pontic Greek, Russian Buriat, Russian [russian, or muscovite], Rusyn Rutul Serbian Shughni Tabassaran Tajik Tatar Tsakhur Tuvinian Ukrainian

  3. Greek script: 2 languages (14 million speakers) of 2

Modern Greek, Pontic Greek

Commas were added to distinguish names with at least two words. I should be happy to correct any errors.

> @NikitaGamer64 Actually, I’d also like all the languages of Russia that use Cyrillic orthographies covered as well

ParaType has been producing typefaces with Cyrillic orthographies with the support of the russian state to commemorate Peter the Great and russian colonialism. Paratype's fonts also enable russian computers to function without non-russian fonts during EU & US sanctions. ParaType's Moscow leadership has also celebrated the current invasion of Ukraine.