rosettatype / hyperglot

Hyperglot: a database and tools for detecting language support in fonts
http://hyperglot.rosettatype.com
GNU General Public License v3.0

Incorrect languages output. #100

Closed fabiocaccamo closed 1 year ago

fabiocaccamo commented 1 year ago

Hello, thank you for this great library.

I tested this library against many fonts and I noticed that the languages output has some errors:


This is the summary of my implementation:

chars = parse_font_chars(font_path)
langs = Languages()
supported = langs.supported(chars=chars, validity=VALIDITYLEVELS[-1])

moyogo commented 1 year ago

@fabiocaccamo The distinction between dialects and languages is often political or sociological. It’s generally accepted among linguists that Sicilian, Venetian and others are languages and not dialects of Italian when mutual intelligibility, vocabulary, grammar and phonology are taken into account. For example, Ethnologue.org considers Sicilian "distinct enough from Standard Italian to be considered a separate language" and an ISO 639-3 language code has been assigned to it. Works in Sicilian also typically use an orthography distinct from Italian, with additional letters such as ḍ, which is not used in Italian.

For English, its validity is preliminary.
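
A minimal sketch of why the validity setting matters, assuming validity levels form an ordered list and that a filter keeps only entries at or above the requested level. This is not Hyperglot's code: `LEVELS`, `SAMPLE` and `at_least` are hypothetical names, the level names other than "preliminary" are assumptions, and "Language A" and "Language B" are placeholder entries.

```python
# Hypothetical ordering of validity levels, from least to most reviewed.
# Only "preliminary" is named in this thread; the other names are assumptions.
LEVELS = ["todo", "draft", "preliminary", "verified"]

# Hypothetical sample data: language name -> validity of its database entry.
SAMPLE = {
    "English": "preliminary",   # as stated in this thread
    "Language A": "verified",   # placeholder entry
    "Language B": "draft",      # placeholder entry
}


def at_least(languages, minimum):
    """Return the sorted names whose validity is at or above `minimum`."""
    threshold = LEVELS.index(minimum)
    return sorted(
        name for name, level in languages.items()
        if LEVELS.index(level) >= threshold
    )


strict = at_least(SAMPLE, LEVELS[-1])       # only the highest level passes
relaxed = at_least(SAMPLE, "preliminary")   # preliminary entries pass too
print(strict, relaxed)
```

Under these assumptions, asking for the last (strictest) level drops any entry marked preliminary, while lowering the threshold to "preliminary" brings it back.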

fabiocaccamo commented 1 year ago

Thank you very much for the explanation and the suggestion on the validity.