typemytype / drawbot

http://www.drawbot.com

Improve `language()` support / implementation #514

Closed roberto-arista closed 9 months ago

roberto-arista commented 1 year ago

Hey there!

I would like to improve the documentation and feedback for the language() feature, both for the text functions and for formatted strings.

According to the Apple Developer documentation, it looks like CoreText accepts any ISO 639-1 or ISO 639-2 language code: https://developer.apple.com/documentation/foundation/nslocale/1418015-isolanguagecodes?language=objc

It would be cool to warn the user if the provided string is not in that list of languages. Right now, code like this:

language("blablabla")

does not provide any feedback.

@justvanrossum @typemytype what do you think about this?

justvanrossum commented 1 year ago

A warning sounds useful. Are these strings case sensitive or not? If not: make sure to take that into account when checking.
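A case-insensitive check could be sketched roughly like this. The set of codes is a small hypothetical sample (the real list would come from AppKit.NSLocale.ISOLanguageCodes() on macOS), and the helper name isKnownLanguageCode is made up for illustration:

```python
# Hypothetical sample of ISO 639 codes; the real list would come from
# AppKit.NSLocale.ISOLanguageCodes() on macOS.
SAMPLE_LANGUAGE_CODES = {"ab", "abk", "en", "fr", "nl"}


def isKnownLanguageCode(code, knownCodes=SAMPLE_LANGUAGE_CODES):
    """Return True if code matches one of knownCodes, ignoring case."""
    # Normalize both sides so "AB", "Ab" and "ab" compare equal.
    normalized = code.strip().lower()
    return normalized in {known.lower() for known in knownCodes}


for code in ["ab", "AB", "abz"]:
    print(code, isKnownLanguageCode(code))
```

Lower-casing both sides keeps the check independent of whatever casing the system list happens to use.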

typemytype commented 1 year ago

for now there is already a warning when a language is set which is not available: *** DrawBot warning: Language 'foo' has no hyphenation available. *** (only while drawing text)

language("foo")
hyphenation(True)
text("bar", (10, 10))

a check is maybe a bit more complex than just looking if the language tag is in that list, e.g. nl-be or fr-CA. this is the full spec: language-extlang-script-region-variant-extension-privateuse see

so language() could draw some dummy text somewhere that produces this warning..
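To illustrate why a plain list lookup is not enough, here is a minimal sketch that splits a tag into its subtags. It is deliberately simplified: it only handles language, script and region, and ignores the extlang, variant, extension and privateuse parts of the full spec. The name splitLanguageTag and the regular expression are assumptions for illustration, and the sketch only checks the shape of a tag, not whether the codes actually exist:

```python
import re

# Simplified pattern: language, optional script, optional region.
# Extlang, variant, extension and private-use subtags are NOT handled.
TAG_PATTERN = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:[-_](?P<script>[A-Za-z]{4}))?"
    r"(?:[-_](?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def splitLanguageTag(tag):
    """Split a tag into (language, script, region), or return None."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        return None
    language = match.group("language").lower()
    script = match.group("script")
    region = match.group("region")
    # Conventional casing: language lower, script Title, region UPPER.
    return (
        language,
        script.title() if script else None,
        region.upper() if region else None,
    )


for tag in ["nl-be", "fr-CA", "sr-cyrl-me", "blablabla"]:
    print(tag, splitLanguageTag(tag))
```

Each subtag would then still have to be validated on its own, which is where the system-provided lists come in.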

typemytype commented 1 year ago

had to look it up, this could be added while setting language: https://github.com/typemytype/drawbot/blob/6697d569d5a6b2c5b446083ea6cf79609c31e745/drawBot/drawBotDrawingTools.py#L1519

justvanrossum commented 1 year ago

> had to look it up, this could be added while setting language:

But that checks whether there is hyphenation for that language, which may be distinct from "we have a valid language".

typemytype commented 1 year ago

this lists all possible options for language(..)

import AppKit
print(AppKit.NSLocale.availableLocaleIdentifiers())

justvanrossum commented 1 year ago

It may have to be something in this direction:

import AppKit

def isLanguageCodeOk(langCode):
    parsedLoc = AppKit.NSLocale.componentsFromLocaleIdentifier_(langCode)
    lang = parsedLoc[AppKit.kCFLocaleLanguageCodeKey]
    country = parsedLoc.get(AppKit.kCFLocaleCountryCodeKey)
    return lang in AppKit.NSLocale.ISOLanguageCodes() and (
        country is None or country in AppKit.NSLocale.ISOCountryCodes()
    )

for langCode in ["ab", "abk", "abz", "AB", "en-us", "en_us", "EN-US", "en-zz"]:
    print(langCode, isLanguageCodeOk(langCode))

Output:

ab True
abk True
abz False
AB True
en-us True
en_us True
EN-US True
en-zz False

typemytype commented 1 year ago

looks good!

strange that en_us, en-us and EN-US are all valid

justvanrossum commented 1 year ago

Or maybe something like this after all:

import AppKit

def canonicalLocaleCode(localeCode):
    parsedLoc = AppKit.NSLocale.componentsFromLocaleIdentifier_(localeCode)
    parts = [
        parsedLoc[AppKit.kCFLocaleLanguageCode],
        parsedLoc.get(AppKit.kCFLocaleScriptCode),
        parsedLoc.get(AppKit.kCFLocaleCountryCode),
    ]
    return "_".join(part for part in parts if part)

def isLanguageCodeOk(localeCode):
    localeCode = canonicalLocaleCode(localeCode)
    return localeCode in AppKit.NSLocale.availableLocaleIdentifiers()

for langCode in [
    "ab",
    "abk",
    "abz",
    "AB",
    "en-us",
    "en_us",
    "EN-US",
    "en-zz",
    "sr-cyrl-me",
    "en",
]:
    print(langCode, isLanguageCodeOk(langCode))

Output:

ab False
abk False
abz False
AB False
en-us True
en_us True
EN-US True
en-zz False
sr-cyrl-me True
en True
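
A check like this could then feed the warning discussed earlier in the thread. The following is only a sketch under assumptions: it uses Python's standard warnings module as a stand-in for DrawBot's own warning mechanism, the name warnIfUnknownLanguage is hypothetical, and sampleValidator uses a hard-coded set of identifiers in place of a real validator such as isLanguageCodeOk:

```python
import warnings


def warnIfUnknownLanguage(code, isValid):
    """Emit a warning when isValid(code) is false; return the verdict."""
    ok = bool(isValid(code))
    if not ok:
        warnings.warn(f"Language '{code}' is not a recognized locale identifier.")
    return ok


# Stand-in validator with a hypothetical, hard-coded set of identifiers.
def sampleValidator(code):
    return code.replace("-", "_").lower() in {"en", "en_us", "sr_cyrl_me"}


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnIfUnknownLanguage("en-US", sampleValidator)
    warnIfUnknownLanguage("blablabla", sampleValidator)

print([str(w.message) for w in caught])
```

Passing the validator in as an argument keeps the warning logic separate from the question of which validity check is ultimately chosen.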