rosettatype / hyperglot

Hyperglot: a database and tools for detecting language support in fonts
http://hyperglot.rosettatype.com
GNU General Public License v3.0

Status of transcription alphabets #49

Closed: skurzinz closed this issue 5 months ago

skurzinz commented 3 years ago

What is a transcription alphabet?

Transcription alphabets may be used for several languages, yet they themselves are not languages.

Example: The IJMES transcription alphabet (#31, #48) is used for Arabic, Ottoman Turkish, and Farsi. In itself, it is just one way of representing these languages' glyphs in a Latinized environment, specifically in one journal.

Would it make sense to define 'pseudo languages' that real languages could inherit from whenever they can be transcribed with that alphabet?
Upside: the beauty of it
Downside: the complexity of it
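
To make the 'pseudo language' idea concrete, here is a rough sketch in Python. Everything in it is hypothetical: the `TRANSCRIPTION_ALPHABETS` and `LANGUAGES` tables, the `ijmes` key, the `transcribed_with` field, and the `transcriptions_for` helper are invented for illustration and are not Hyperglot's data format or API. The characters shown are a placeholder subset, not the full IJMES set.

```python
# Hypothetical sketch: a transcription alphabet stored once as a
# "pseudo language" and referenced by the real languages that use it.
# None of these names or fields come from Hyperglot itself.

TRANSCRIPTION_ALPHABETS = {
    # Placeholder subset only; not the complete IJMES character set.
    "ijmes": {"script": "Latin", "base": "ā ī ū"},
}

LANGUAGES = {
    "ara": {"transcribed_with": ["ijmes"]},
    "ota": {"transcribed_with": ["ijmes"]},
    "fin": {"transcribed_with": []},
}


def transcriptions_for(iso):
    """Return the transcription alphabets referenced by a language."""
    names = LANGUAGES[iso]["transcribed_with"]
    return {name: TRANSCRIPTION_ALPHABETS[name] for name in names}
```

The upside and downside are both visible here: the alphabet is defined in one place, but every lookup now has to follow a reference to a second kind of record.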

Adding an IPA character class to all languages just because most languages (excluding Braille, and likely other edge cases such as undeciphered historical languages whose phonetic values are unknown) are representable using IPA may be overkill =)

This may be conceptually related to #32.

AFAICS, apart from ota, currently there are transliteration alphabets present in the jpn, cmn, gan, hak languages, with an additional hint that san may also be transcribed in Latin.

kontur commented 2 years ago

Looking at the data and the example mentioned by @skurzinz, I think Hyperglot can already deal with this via inheritance. It would make sense to me to list the transliteration orthography in the most prominent/umbrella language. Assuming that language also has other orthographies in the transliteration's script, the "other" languages that share the transliteration can inherit from it.

E.g. Arabic ara has a Latin transliteration. per and ota could inherit from ara with script Arabic. The status: transliteration would simply be inherited, or, if the inherited orthography is not itself a transliteration, the status can be set explicitly where it is inherited.
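
One possible reading of this inheritance scheme, sketched in Python. This is only an illustration under stated assumptions: the `DATA` dictionary, its field names, the placeholder character strings, the `qaa` language code (an ISO 639 code reserved for local use), and the `resolve` function are all hypothetical and are not taken from Hyperglot's actual database or API.

```python
# Hypothetical sketch of orthography inheritance with a status override.
# Field names, codes, and character data are assumptions for illustration.

DATA = {
    "ara": [
        {"script": "Arabic", "status": "primary", "base": "ا ب ت"},
        {"script": "Latin", "status": "transliteration", "base": "ā ī ū"},
    ],
    "ota": [
        # Inherits the Latin orthography from ara; the status comes along.
        {"script": "Latin", "inherit": "ara"},
    ],
    "qaa": [
        # Placeholder language: the inherited orthography is not itself a
        # transliteration, so the status is set explicitly here.
        {"script": "Arabic", "inherit": "ara", "status": "transliteration"},
    ],
}


def resolve(iso, script):
    """Return the orthography for iso/script with inheritance applied."""
    for ortho in DATA[iso]:
        if ortho["script"] != script:
            continue
        if "inherit" not in ortho:
            return dict(ortho)
        parent = resolve(ortho["inherit"], script)
        # Local keys (e.g. an explicit status) override inherited ones.
        merged = {**parent, **ortho}
        del merged["inherit"]
        return merged
    raise KeyError(f"{iso} has no {script} orthography")
```

In this sketch, `resolve("ota", "Latin")` picks up both the characters and the `transliteration` status from `ara`, while `resolve("qaa", "Arabic")` keeps the inherited characters but reports the explicitly set status.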

skurzinz commented 2 years ago

@kontur thanks. To push your argument to its conclusion: the original questions, "What is an alphabet?" and "How to deal with transcription?", remain conceptually unanswered; instead, a case-by-case judgment is in place that allows for easy handling.

Suppose Kontur is also written/transliterated as Контур or كونتور with the surrounding language not changing.

But for the scope of hyperglot this is perhaps truly conceptual overload, so 👍

kontur commented 5 months ago

Good reminder and discussion. Let's close this for now and open specific, actionable issues for specific transliterations (we have already included some where clear transliteration orthographies exist for specific languages).