Language lookup helper functions

learningequality / le-utils

Utilities and constants shared across Kolibri, Ricecooker, and Kolibri Studio

MIT License


Language lookup helper functions #28

Closed ivanistheone closed 7 years ago

ivanistheone commented 7 years ago

Summary of changes:

Added getlang_by_name for lookup by name, e.g. 'English'. It also works for languages with subcodes, e.g. "pt-BR":{"name":"Portuguese, Brazil", .. (comma-separated), and for multiple languages separated by semicolons, e.g. "ca":{"name":"Catalan; Valencian", ..
Added getlang_by_alpha2 for use in YouTubeSubtitleFile, using a language lookup strategy inspired by David Hu's TE chef https://github.com/fle-internal/te-sushi-chef/blob/master/te_chef.py#L158
This PR introduces a dependency on the pycountry Python package
Bumped version number to 0.1.0 in preparation for PyPI release
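
To illustrate the name lookup described above, here is a minimal sketch, not the code from this PR. The LANGUAGES table, the build_name_index helper, and the rule that exact names take priority over partial names are all assumptions made for the example; the sketch also returns only the language code rather than whatever object the real function returns.

```python
# Hypothetical miniature of the language table; the real le-utils data is
# much larger and may be structured differently.
LANGUAGES = {
    "en": {"name": "English"},
    "pt": {"name": "Portuguese"},
    "pt-BR": {"name": "Portuguese, Brazil"},
    "ca": {"name": "Catalan; Valencian"},
}


def build_name_index(languages):
    """Map every searchable name to a language code (assumed strategy)."""
    index = {}
    # Pass 1: the full name as stored always wins.
    for code, info in languages.items():
        index[info["name"]] = code
    # Pass 2: partial names only fill gaps left by pass 1.
    for code, info in languages.items():
        name = info["name"]
        # "Catalan; Valencian" -> "Catalan" and "Valencian"
        parts = [part.strip() for part in name.split(";")]
        # "Portuguese, Brazil" -> "Portuguese"
        parts.append(name.split(",")[0].strip())
        for part in parts:
            index.setdefault(part, code)
    return index


NAME_INDEX = build_name_index(LANGUAGES)


def getlang_by_name(name):
    """Return the language code for a name, or None if it is unknown."""
    return NAME_INDEX.get(name)
```

With this table, 'Portuguese, Brazil' resolves to pt-BR, plain 'Portuguese' resolves to pt because the exact name wins, and both 'Catalan' and 'Valencian' resolve to ca.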

rtibbles commented 7 years ago

Just to note that English also has subcodes :)

ivanistheone commented 7 years ago

English, Country-where-they-put-vinegar-on-their-fries ;)

rtibbles commented 7 years ago

English, country where they put gravy on their chips

ivanistheone commented 7 years ago

@divad12 What use case do you have in mind for getlang_closest?

I think something like getlang_closest_by_name could be useful: it would ignore locale info, so e.g. zh-Hans, Chinese, Mandarin, Chinese (mainland), etc. would all match zh = Chinese.

Also, do you have any ideas on how to handle extensible languages, for example a language that we have never seen? I think that one might be out of scope for this PR...
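
A rough sketch of what such a getlang_closest_by_name could look like follows. It is only an illustration of the idea: the function does not exist in this PR, and the BASE_LANGUAGES and ALIASES tables are hypothetical (in particular, mapping "Mandarin" to zh needs some alias data that has to come from somewhere).

```python
import re

# Hypothetical data for illustration only; not taken from le-utils.
BASE_LANGUAGES = {"zh": "Chinese", "en": "English", "pt": "Portuguese"}
ALIASES = {"mandarin": "zh"}  # assumed alias table

_NAME_TO_CODE = {name.lower(): code for code, name in BASE_LANGUAGES.items()}


def getlang_closest_by_name(text):
    """Return the base language code for a loosely written name or code."""
    cleaned = text.strip()
    # Drop parenthetical qualifiers such as "(mainland)".
    cleaned = re.sub(r"\s*\(.*?\)", "", cleaned)
    # Drop anything after a comma, e.g. "Portuguese, Brazil".
    key = cleaned.split(",")[0].strip().lower()
    if key in _NAME_TO_CODE:
        return _NAME_TO_CODE[key]
    if key in ALIASES:
        return ALIASES[key]
    # Otherwise treat the input as a locale code and keep the primary subtag.
    primary = re.split(r"[-_]", key)[0]
    if primary in BASE_LANGUAGES:
        return primary
    return None
```

Under these assumptions, 'zh-Hans', 'Chinese', 'Mandarin' and 'Chinese (mainland)' all return 'zh', and an unknown name returns None.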

divad12 commented 7 years ago

@ivanistheone The use case I had in mind was for Ricecooker's new smarter YouTubeSubtitleFile that you wrote: https://github.com/learningequality/ricecooker/blob/578a497a2b2cd6f9cb41c37ebfc7753ea148ed8d/ricecooker/classes/files.py#L434

So, e.g. languages.getlang_closest('en'), languages.getlang_closest('zu'), languages.getlang_closest('zul')
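
For concreteness, the calls above could behave roughly like the sketch below. getlang_closest is only a proposal in this thread, so everything here is an assumption: the KNOWN_CODES and ALPHA3_TO_ALPHA2 tables are hypothetical stand-ins for the internal language list and for the ISO 639 data that the PR gets from pycountry, and the fallback order is one possible choice.

```python
# Hypothetical stand-ins for the le-utils language list and ISO 639 data.
KNOWN_CODES = {"en": "English", "zu": "Zulu", "pt-BR": "Portuguese, Brazil"}
ALPHA3_TO_ALPHA2 = {"eng": "en", "zul": "zu"}


def getlang_closest(code):
    """Return the closest known language code, or None (assumed behaviour)."""
    # 1. An exact match on the internal code wins.
    if code in KNOWN_CODES:
        return code
    # 2. A three-letter code is mapped to its two-letter equivalent.
    alpha2 = ALPHA3_TO_ALPHA2.get(code.lower())
    if alpha2 in KNOWN_CODES:
        return alpha2
    # 3. A locale such as "en-GB" or "en_GB" falls back to its primary subtag.
    primary = code.replace("_", "-").split("-")[0].lower()
    if primary in KNOWN_CODES:
        return primary
    return None
```

Here getlang_closest('en') and getlang_closest('zu') match directly, getlang_closest('zul') falls back to 'zu' through the three-letter table, and a code that matches nothing returns None.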

That's OK though, I think what you have is fine.

I'm not sure about a language that we have never seen ... I think you're good to say it's out of scope for now.

divad12 commented 7 years ago

FYI this is approved. Looks great, merge when ready @ivanistheone