carpedm20 / emoji

emoji terminal output for Python

Partially missing languages #272

Closed ManuelSchneid3r closed 1 year ago

ManuelSchneid3r commented 1 year ago

Tried to write a generic plugin and found that there are a lot of languages missing:

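from emoji import EMOJI_DATA

# debug() is presumably the logging helper of the host application (Albert)
for e, d in EMOJI_DATA.items():
    if 'de' not in d:  # no German translation for this emoji
        debug(e)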
Output

02:42:45 [debg:albert] 🐦‍⬛ 02:42:45 [debg:albert] 👨‍❤‍👨 02:42:45 [debg:albert] 👨🏿‍❤‍👨🏿 02:42:45 [debg:albert] 👨🏿‍❤‍👨🏻 02:42:45 [debg:albert] 👨🏿‍❤‍👨🏾 02:42:45 [debg:albert] 👨🏿‍❤‍👨🏼 02:42:45 [debg:albert] 👨🏿‍❤‍👨🏽 02:42:45 [debg:albert] 👨🏻‍❤‍👨🏻 02:42:45 [debg:albert] 👨🏻‍❤‍👨🏿 02:42:45 [debg:albert] 👨🏻‍❤‍👨🏾 02:42:45 [debg:albert] 👨🏻‍❤‍👨🏼 02:42:45 [debg:albert] 👨🏻‍❤‍👨🏽 02:42:45 [debg:albert] 👨🏾‍❤‍👨🏾 02:42:45 [debg:albert] 👨🏾‍❤‍👨🏿 02:42:45 [debg:albert] 👨🏾‍❤‍👨🏻 02:42:45 [debg:albert] 👨🏾‍❤‍👨🏼 02:42:45 [debg:albert] 👨🏾‍❤‍👨🏽 02:42:45 [debg:albert] 👨🏼‍❤‍👨🏼 02:42:45 [debg:albert] 👨🏼‍❤‍👨🏿 02:42:45 [debg:albert] 👨🏼‍❤‍👨🏻 02:42:45 [debg:albert] 👨🏼‍❤‍👨🏾 02:42:45 [debg:albert] 👨🏼‍❤‍👨🏽 02:42:45 [debg:albert] 👨🏽‍❤‍👨🏽 02:42:45 [debg:albert] 👨🏽‍❤‍👨🏿 02:42:45 [debg:albert] 👨🏽‍❤‍👨🏻 02:42:45 [debg:albert] 👨🏽‍❤‍👨🏾 02:42:45 [debg:albert] 👨🏽‍❤‍👨🏼 02:42:45 [debg:albert] 🧑🏿‍❤‍🧑🏻 02:42:45 [debg:albert] 🧑🏿‍❤‍🧑🏾 02:42:45 [debg:albert] 🧑🏿‍❤‍🧑🏼 02:42:45 [debg:albert] 🧑🏿‍❤‍🧑🏽 02:42:45 [debg:albert] 🧑🏻‍❤‍🧑🏿 02:42:45 [debg:albert] 🧑🏻‍❤‍🧑🏾 02:42:45 [debg:albert] 🧑🏻‍❤‍🧑🏼 02:42:45 [debg:albert] 🧑🏻‍❤‍🧑🏽 02:42:45 [debg:albert] 🧑🏾‍❤‍🧑🏿 02:42:45 [debg:albert] 🧑🏾‍❤‍🧑🏻 02:42:45 [debg:albert] 🧑🏾‍❤‍🧑🏼 02:42:45 [debg:albert] 🧑🏾‍❤‍🧑🏽 02:42:45 [debg:albert] 🧑🏼‍❤‍🧑🏿 02:42:45 [debg:albert] 🧑🏼‍❤‍🧑🏻 02:42:45 [debg:albert] 🧑🏼‍❤‍🧑🏾 02:42:45 [debg:albert] 🧑🏼‍❤‍🧑🏽 02:42:45 [debg:albert] 🧑🏽‍❤‍🧑🏿 02:42:45 [debg:albert] 🧑🏽‍❤‍🧑🏻 02:42:45 [debg:albert] 🧑🏽‍❤‍🧑🏾 02:42:45 [debg:albert] 🧑🏽‍❤‍🧑🏼 02:42:45 [debg:albert] 👩‍❤‍👨 02:42:45 [debg:albert] 👩🏿‍❤‍👨🏿 02:42:45 [debg:albert] 👩🏿‍❤‍👨🏻 02:42:45 [debg:albert] 👩🏿‍❤‍👨🏾 02:42:45 [debg:albert] 👩🏿‍❤‍👨🏼 02:42:45 [debg:albert] 👩🏿‍❤‍👨🏽 02:42:45 [debg:albert] 👩🏻‍❤‍👨🏻 02:42:45 [debg:albert] 👩🏻‍❤‍👨🏿 02:42:45 [debg:albert] 👩🏻‍❤‍👨🏾 02:42:45 [debg:albert] 👩🏻‍❤‍👨🏼 02:42:45 [debg:albert] 👩🏻‍❤‍👨🏽 02:42:45 [debg:albert] 👩🏾‍❤‍👨🏾 02:42:45 [debg:albert] 👩🏾‍❤‍👨🏿 02:42:45 [debg:albert] 👩🏾‍❤‍👨🏻 02:42:45 [debg:albert] 👩🏾‍❤‍👨🏼 02:42:45 [debg:albert] 👩🏾‍❤‍👨🏽 02:42:45 [debg:albert] 👩🏼‍❤‍👨🏼 02:42:45 [debg:albert] 
👩🏼‍❤‍👨🏿 02:42:45 [debg:albert] 👩🏼‍❤‍👨🏻 02:42:45 [debg:albert] 👩🏼‍❤‍👨🏾 02:42:45 [debg:albert] 👩🏼‍❤‍👨🏽 02:42:45 [debg:albert] 👩🏽‍❤‍👨🏽 02:42:45 [debg:albert] 👩🏽‍❤‍👨🏿 02:42:45 [debg:albert] 👩🏽‍❤‍👨🏻 02:42:45 [debg:albert] 👩🏽‍❤‍👨🏾 02:42:45 [debg:albert] 👩🏽‍❤‍👨🏼 02:42:45 [debg:albert] 👩‍❤‍👩 02:42:45 [debg:albert] 👩🏿‍❤‍👩🏿 02:42:45 [debg:albert] 👩🏿‍❤‍👩🏻 02:42:45 [debg:albert] 👩🏿‍❤‍👩🏾 02:42:45 [debg:albert] 👩🏿‍❤‍👩🏼 02:42:45 [debg:albert] 👩🏿‍❤‍👩🏽 02:42:45 [debg:albert] 👩🏻‍❤‍👩🏻 02:42:45 [debg:albert] 👩🏻‍❤‍👩🏿 02:42:45 [debg:albert] 👩🏻‍❤‍👩🏾 02:42:45 [debg:albert] 👩🏻‍❤‍👩🏼 02:42:45 [debg:albert] 👩🏻‍❤‍👩🏽 02:42:45 [debg:albert] 👩🏾‍❤‍👩🏾 02:42:45 [debg:albert] 👩🏾‍❤‍👩🏿 02:42:45 [debg:albert] 👩🏾‍❤‍👩🏻 02:42:45 [debg:albert] 👩🏾‍❤‍👩🏼 02:42:45 [debg:albert] 👩🏾‍❤‍👩🏽 02:42:45 [debg:albert] 👩🏼‍❤‍👩🏼 02:42:45 [debg:albert] 👩🏼‍❤‍👩🏿 02:42:45 [debg:albert] 👩🏼‍❤‍👩🏻 02:42:45 [debg:albert] 👩🏼‍❤‍👩🏾 02:42:45 [debg:albert] 👩🏼‍❤‍👩🏽 02:42:45 [debg:albert] 👩🏽‍❤‍👩🏽 02:42:45 [debg:albert] 👩🏽‍❤‍👩🏿 02:42:45 [debg:albert] 👩🏽‍❤‍👩🏻 02:42:45 [debg:albert] 👩🏽‍❤‍👩🏾 02:42:45 [debg:albert] 👩🏽‍❤‍👩🏼 02:42:45 [debg:albert] #⃣ 02:42:45 [debg:albert] *⃣ 02:42:45 [debg:albert] 0⃣ 02:42:45 [debg:albert] 1⃣ 02:42:45 [debg:albert] 2⃣ 02:42:45 [debg:albert] 3⃣ 02:42:45 [debg:albert] 4⃣ 02:42:45 [debg:albert] 5⃣ 02:42:45 [debg:albert] 6⃣ 02:42:45 [debg:albert] 7⃣ 02:42:45 [debg:albert] 8⃣ 02:42:45 [debg:albert] 9⃣ 02:42:45 [debg:albert] 👨‍❤‍💋‍👨 02:42:45 [debg:albert] 👨🏿‍❤‍💋‍👨🏿 02:42:45 [debg:albert] 👨🏿‍❤‍💋‍👨🏻 02:42:45 [debg:albert] 👨🏿‍❤‍💋‍👨🏾 02:42:45 [debg:albert] 👨🏿‍❤‍💋‍👨🏼 02:42:45 [debg:albert] 👨🏿‍❤‍💋‍👨🏽 02:42:45 [debg:albert] 👨🏻‍❤‍💋‍👨🏻 02:42:45 [debg:albert] 👨🏻‍❤‍💋‍👨🏿 02:42:45 [debg:albert] 👨🏻‍❤‍💋‍👨🏾 02:42:45 [debg:albert] 👨🏻‍❤‍💋‍👨🏼 02:42:45 [debg:albert] 👨🏻‍❤‍💋‍👨🏽 02:42:45 [debg:albert] 👨🏾‍❤‍💋‍👨🏾 02:42:45 [debg:albert] 👨🏾‍❤‍💋‍👨🏿 02:42:45 [debg:albert] 👨🏾‍❤‍💋‍👨🏻 02:42:45 [debg:albert] 👨🏾‍❤‍💋‍👨🏼 02:42:45 [debg:albert] 👨🏾‍❤‍💋‍👨🏽 02:42:45 [debg:albert] 👨🏼‍❤‍💋‍👨🏼 02:42:45 [debg:albert] 👨🏼‍❤‍💋‍👨🏿 02:42:45 [debg:albert] 👨🏼‍❤‍💋‍👨🏻 
02:42:45 [debg:albert] 👨🏼‍❤‍💋‍👨🏾 02:42:45 [debg:albert] 👨🏼‍❤‍💋‍👨🏽 02:42:45 [debg:albert] 👨🏽‍❤‍💋‍👨🏽 02:42:45 [debg:albert] 👨🏽‍❤‍💋‍👨🏿 02:42:45 [debg:albert] 👨🏽‍❤‍💋‍👨🏻 02:42:45 [debg:albert] 👨🏽‍❤‍💋‍👨🏾 02:42:45 [debg:albert] 👨🏽‍❤‍💋‍👨🏼 02:42:45 [debg:albert] 🧑🏿‍❤‍💋‍🧑🏻 02:42:45 [debg:albert] 🧑🏿‍❤‍💋‍🧑🏾 02:42:45 [debg:albert] 🧑🏿‍❤‍💋‍🧑🏼 02:42:45 [debg:albert] 🧑🏿‍❤‍💋‍🧑🏽 02:42:45 [debg:albert] 🧑🏻‍❤‍💋‍🧑🏿 02:42:45 [debg:albert] 🧑🏻‍❤‍💋‍🧑🏾 02:42:45 [debg:albert] 🧑🏻‍❤‍💋‍🧑🏼 02:42:45 [debg:albert] 🧑🏻‍❤‍💋‍🧑🏽 02:42:45 [debg:albert] 🧑🏾‍❤‍💋‍🧑🏿 02:42:45 [debg:albert] 🧑🏾‍❤‍💋‍🧑🏻 02:42:45 [debg:albert] 🧑🏾‍❤‍💋‍🧑🏼 02:42:45 [debg:albert] 🧑🏾‍❤‍💋‍🧑🏽 02:42:45 [debg:albert] 🧑🏼‍❤‍💋‍🧑🏿 02:42:45 [debg:albert] 🧑🏼‍❤‍💋‍🧑🏻 02:42:45 [debg:albert] 🧑🏼‍❤‍💋‍🧑🏾 02:42:45 [debg:albert] 🧑🏼‍❤‍💋‍🧑🏽 02:42:45 [debg:albert] 🧑🏽‍❤‍💋‍🧑🏿 02:42:45 [debg:albert] 🧑🏽‍❤‍💋‍🧑🏻 02:42:45 [debg:albert] 🧑🏽‍❤‍💋‍🧑🏾 02:42:45 [debg:albert] 🧑🏽‍❤‍💋‍🧑🏼 02:42:45 [debg:albert] 👩‍❤‍💋‍👨 02:42:45 [debg:albert] 👩🏿‍❤‍💋‍👨🏿 02:42:45 [debg:albert] 👩🏿‍❤‍💋‍👨🏻 02:42:45 [debg:albert] 👩🏿‍❤‍💋‍👨🏾 02:42:45 [debg:albert] 👩🏿‍❤‍💋‍👨🏼 02:42:45 [debg:albert] 👩🏿‍❤‍💋‍👨🏽 02:42:45 [debg:albert] 👩🏻‍❤‍💋‍👨🏻 02:42:45 [debg:albert] 👩🏻‍❤‍💋‍👨🏿 02:42:45 [debg:albert] 👩🏻‍❤‍💋‍👨🏾 02:42:45 [debg:albert] 👩🏻‍❤‍💋‍👨🏼 02:42:45 [debg:albert] 👩🏻‍❤‍💋‍👨🏽 02:42:45 [debg:albert] 👩🏾‍❤‍💋‍👨🏾 02:42:45 [debg:albert] 👩🏾‍❤‍💋‍👨🏿 02:42:45 [debg:albert] 👩🏾‍❤‍💋‍👨🏻 02:42:45 [debg:albert] 👩🏾‍❤‍💋‍👨🏼 02:42:45 [debg:albert] 👩🏾‍❤‍💋‍👨🏽 02:42:45 [debg:albert] 👩🏼‍❤‍💋‍👨🏼 02:42:45 [debg:albert] 👩🏼‍❤‍💋‍👨🏿 02:42:45 [debg:albert] 👩🏼‍❤‍💋‍👨🏻 02:42:45 [debg:albert] 👩🏼‍❤‍💋‍👨🏾 02:42:45 [debg:albert] 👩🏼‍❤‍💋‍👨🏽 02:42:45 [debg:albert] 👩🏽‍❤‍💋‍👨🏽 02:42:45 [debg:albert] 👩🏽‍❤‍💋‍👨🏿 02:42:45 [debg:albert] 👩🏽‍❤‍💋‍👨🏻 02:42:45 [debg:albert] 👩🏽‍❤‍💋‍👨🏾 02:42:45 [debg:albert] 👩🏽‍❤‍💋‍👨🏼 02:42:45 [debg:albert] 👩‍❤‍💋‍👩 02:42:45 [debg:albert] 👩🏿‍❤‍💋‍👩🏿 02:42:45 [debg:albert] 👩🏿‍❤‍💋‍👩🏻 02:42:45 [debg:albert] 👩🏿‍❤‍💋‍👩🏾 02:42:45 [debg:albert] 👩🏿‍❤‍💋‍👩🏼 02:42:45 [debg:albert] 👩🏿‍❤‍💋‍👩🏽 02:42:45 [debg:albert] 👩🏻‍❤‍💋‍👩🏻 02:42:45 [debg:albert] 
👩🏻‍❤‍💋‍👩🏿 02:42:45 [debg:albert] 👩🏻‍❤‍💋‍👩🏾 02:42:45 [debg:albert] 👩🏻‍❤‍💋‍👩🏼 02:42:45 [debg:albert] 👩🏻‍❤‍💋‍👩🏽 02:42:45 [debg:albert] 👩🏾‍❤‍💋‍👩🏾 02:42:45 [debg:albert] 👩🏾‍❤‍💋‍👩🏿 02:42:45 [debg:albert] 👩🏾‍❤‍💋‍👩🏻 02:42:45 [debg:albert] 👩🏾‍❤‍💋‍👩🏼 02:42:45 [debg:albert] 👩🏾‍❤‍💋‍👩🏽 02:42:45 [debg:albert] 👩🏼‍❤‍💋‍👩🏼 02:42:45 [debg:albert] 👩🏼‍❤‍💋‍👩🏿 02:42:45 [debg:albert] 👩🏼‍❤‍💋‍👩🏻 02:42:45 [debg:albert] 👩🏼‍❤‍💋‍👩🏾 02:42:45 [debg:albert] 👩🏼‍❤‍💋‍👩🏽 02:42:45 [debg:albert] 👩🏽‍❤‍💋‍👩🏽 02:42:45 [debg:albert] 👩🏽‍❤‍💋‍👩🏿 02:42:45 [debg:albert] 👩🏽‍❤‍💋‍👩🏻 02:42:45 [debg:albert] 👩🏽‍❤‍💋‍👩🏾 02:42:45 [debg:albert] 👩🏽‍❤‍💋‍👩🏼

cvzi commented 1 year ago

The translations always lag behind and are usually incomplete. These emoji probably weren't translated yet when we last updated the database. I will check whether the translations are available now.

ManuelSchneid3r commented 1 year ago

jfyi https://github.com/unicode-org/cldr-json/tree/main/cldr-json/cldr-annotations-derived-full/annotationsDerived

ManuelSchneid3r commented 1 year ago

Since the missing emoji are mostly skin-tone (POC) variants, I assume the derived annotations are missing.

cvzi commented 1 year ago

I wasn't aware that these derived annotations exist. I will add them to our script that generates the database. However, I don't think that will solve the problem, because we also use the translations from https://emojiterra.com/de/kopieren/ and emojiterra.com already seems to include most of the derived annotations.

I didn't check all the missing translations but at least for the variations of 👨🏽‍❤️‍👨🏻 , there are two versions for each emoji, the "fully qualified" form and the "minimally qualified" form. The unicode repository lists only the "fully qualified" forms in the derived annotations.

For example our current database has a translation for 👨🏽‍❤️‍👨🏻 but only for the "fully qualified" version:

    '\U0001F468\U0001F3FD\U0000200D\U00002764\U0000FE0F\U0000200D\U0001F468\U0001F3FB': {  # 👨🏽‍❤️‍👨🏻
        'en': ':couple_with_heart_man_man_medium_skin_tone_light_skin_tone:',
        'status': fully_qualified,
        'E': 13.1,
        'de': ':liebespaar_mann_mann_mittlere_hautfarbe,helle_hautfarbe:',
        'es': ':pareja_enamorada_hombre_hombre_tono_de_piel_medio_tono_de_piel_claro:',
        'fr': ':couple_avec_cœur__homme_homme_peau_légèrement_mate_et_peau_claire:',
        'ja': ':カップルとハート_男性_男性_中間の肌色_薄い肌色:',
        'ko': ':연인_남자_남자_갈색_피부_하얀_피부:',
        'pt': ':casal_apaixonado_homem_homem_pele_morena_e_pele_clara:',
        'it': ':coppia_con_cuore_uomo_uomo_carnagione_olivastra_e_carnagione_chiara:',
        'fa': ':زوج_عاشق_مرد_مرد_پوست_طلایی_و_پوست_سفید:',
        'id': ':pasangan_dengan_hati_pria_pria_warna_kulit_sedang_warna_kulit_cerah:',
        'zh': ':情侣_男人男人中等肤色较浅肤色:'
    },
    '\U0001F468\U0001F3FD\U0000200D\U00002764\U0000200D\U0001F468\U0001F3FB': {  # 👨🏽‍❤‍👨🏻
        'en': ':couple_with_heart_man_man_medium_skin_tone_light_skin_tone:',
        'status': minimally_qualified,
        'E': 13.1
    },

So the script needs to copy "fully qualified" translations to the "minimally qualified" forms. I remember implementing something like that already, but I think it went the opposite way: copying "minimally qualified" translations to the "fully qualified" forms.
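
A minimal sketch of that copy step, not the project's actual generator script. It assumes that the two forms differ only by the variation selector U+FE0F (as in the two entries quoted above), and it uses a toy `SAMPLE_DATA` dict with string status values as a stand-in for the real database. The names `copy_translations` and `NON_LANGUAGE_KEYS` are hypothetical.

```python
VARIATION_SELECTOR_16 = "\uFE0F"

# Keys of an entry that are metadata rather than translations (assumed set).
NON_LANGUAGE_KEYS = {"status", "E", "alias", "variant"}

# Toy stand-in for the database, shaped like the two entries quoted above.
SAMPLE_DATA = {
    "\U0001F468\U0001F3FD\u200D\u2764\uFE0F\u200D\U0001F468\U0001F3FB": {
        "en": ":couple_with_heart_man_man_medium_skin_tone_light_skin_tone:",
        "status": "fully_qualified",
        "E": 13.1,
        "de": ":liebespaar_mann_mann_mittlere_hautfarbe,helle_hautfarbe:",
    },
    "\U0001F468\U0001F3FD\u200D\u2764\u200D\U0001F468\U0001F3FB": {
        "en": ":couple_with_heart_man_man_medium_skin_tone_light_skin_tone:",
        "status": "minimally_qualified",
        "E": 13.1,
    },
}


def copy_translations(data):
    """Fill missing translations of non-fully-qualified entries.

    Returns the number of translations that were copied.
    """
    # Index every fully qualified entry by its sequence without U+FE0F.
    fully_qualified = {}
    for sequence, entry in data.items():
        if entry["status"] == "fully_qualified":
            key = sequence.replace(VARIATION_SELECTOR_16, "")
            fully_qualified[key] = entry

    copied = 0
    for sequence, entry in data.items():
        if entry["status"] == "fully_qualified":
            continue
        source = fully_qualified.get(sequence.replace(VARIATION_SELECTOR_16, ""))
        if source is None:
            continue  # no fully qualified counterpart found
        for key, value in source.items():
            # Copy only translations, and never overwrite an existing one.
            if key not in NON_LANGUAGE_KEYS and key not in entry:
                entry[key] = value
                copied += 1
    return copied


if __name__ == "__main__":
    print(copy_translations(SAMPLE_DATA), "translation(s) copied")
```

Existing translations on the minimally qualified entry are left alone, so running the step twice copies nothing the second time.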

cvzi commented 1 year ago

It should be complete now, i.e. all translations are available.

Regardless of the specific problem in this issue, the translations will probably become incomplete again when Unicode 16 is released with new emoji. I think I should add a hint to the readme or documentation that explains this, maybe a table similar to https://carpedm20.github.io/emoji/docs/api.html#emoji-version but showing the translation status.
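
One way the numbers for such a status table could be computed, as a rough sketch only. It assumes a database shaped like the entries quoted earlier in this thread; `SAMPLE_DATA`, `translation_coverage` and `format_table` are hypothetical names, and the sample holds just two entries.

```python
# Toy stand-in for the database: one entry with translations, one without.
SAMPLE_DATA = {
    "\U0001F468\U0001F3FD\u200D\u2764\uFE0F\u200D\U0001F468\U0001F3FB": {
        "en": ":couple_with_heart_man_man_medium_skin_tone_light_skin_tone:",
        "status": "fully_qualified",
        "E": 13.1,
        "de": ":liebespaar_mann_mann_mittlere_hautfarbe,helle_hautfarbe:",
        "es": ":pareja_enamorada_hombre_hombre_tono_de_piel_medio_tono_de_piel_claro:",
    },
    "\U0001F468\U0001F3FD\u200D\u2764\u200D\U0001F468\U0001F3FB": {
        "en": ":couple_with_heart_man_man_medium_skin_tone_light_skin_tone:",
        "status": "minimally_qualified",
        "E": 13.1,
    },
}


def translation_coverage(data, languages):
    """Return the share of entries (0.0 to 1.0) that have each language."""
    total = len(data)
    if total == 0:
        return {lang: 0.0 for lang in languages}
    return {
        lang: sum(1 for entry in data.values() if lang in entry) / total
        for lang in languages
    }


def format_table(coverage):
    """Render the coverage as plain-text rows, one language per row."""
    return [f"{lang:<4}{share:>7.1%}" for lang, share in coverage.items()]


if __name__ == "__main__":
    for row in format_table(translation_coverage(SAMPLE_DATA, ["en", "de", "es"])):
        print(row)
```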