Closed by pdrhlik 5 years ago
In case this escalates quickly, there is one more option in between, the ISOcodes package:
> ISO_639_2 %>% select(name=Name, code=Alpha_2) %>%
+   filter(code %in% c("en", "cs", "pl", "ro"))
                           name code
1                         Czech   cs
2                       English   en
3                        Polish   pl
4 Romanian; Moldavian; Moldovan   ro
@maciejkasinski I would like to keep the package with as few dependencies as possible. Using ISOcodes seems like a bit of an overkill right now. I'd start to think about it if this starts getting out of hand.
I think using an underscore instead of a dash would save us problems in the long run.
We should probably just use one language name per code. I'd say that there is a major language for each of the codes.
For example, the wiki page on the ISO 639-1 code ro says Romanian is the preferred name. The codes for the Moldovan language are deprecated. I guess this will be the same for other language groups?
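If we ever do pull names from an ISOcodes-style list, a rough sketch of reducing each entry to a single name could look like this. It assumes the first name before a semicolon is the one we want, which matches the ro case above but is not guaranteed for every code. The `iso_names` vector is just copied from the output earlier in this thread.

```r
# Names in the format shown in the ISOcodes output earlier in this thread.
iso_names <- c("Czech", "English", "Polish", "Romanian; Moldavian; Moldovan")

# Assumption: the first alternative listed is the one to keep.
# Everything from the first semicolon onwards is dropped.
primary_name <- trimws(sub(";.*$", "", iso_names))

print(primary_name)
```

Codes where the first listed name is not the preferred one would still need a manual override.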
Because of #30 (language dialects), language files should look like this: en_English, cs_Czech, fr-CA_French (Canada).
Template: {LANG_CODE}_{LANG_NAME}
LANG_NAME may contain spaces, as in fr-CA_French (Canada). An underscore will be used as the separator between code and name. If we encounter any problems with spaces or parentheses in the language names, we'll make appropriate changes.
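To check that the naming scheme is easy to take apart, here is a minimal base R sketch. `parse_lang_file` is a hypothetical helper, not something that exists in the package. It assumes the first underscore is always the separator (so the code itself never contains one), and it strips a trailing file extension if there is one, since the extension is not specified here.

```r
# Hypothetical helper: split a language file name into code and name.
# Assumes the first underscore separates the code from the name, and that
# the code never contains an underscore (dialects use a dash, e.g. fr-CA).
parse_lang_file <- function(file_name) {
  # Drop any directory part and a trailing extension, if present (assumption).
  base <- sub("\\.[A-Za-z0-9]+$", "", basename(file_name))
  # Position of the first underscore in each name (-1 if there is none).
  sep <- as.integer(regexpr("_", base, fixed = TRUE))
  if (any(sep < 1)) {
    stop("File name does not follow {LANG_CODE}_{LANG_NAME}: ",
         paste(base[sep < 1], collapse = ", "))
  }
  data.frame(
    code = substr(base, 1, sep - 1),
    name = substr(base, sep + 1, nchar(base)),
    stringsAsFactors = FALSE
  )
}

print(parse_lang_file(c("en_English", "cs_Czech", "fr-CA_French (Canada)")))
```

Splitting on the first underscore only means spaces and parentheses in the name pass through untouched.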
Now:
Proposal:
We could then parse both the language code and the language name from the file name. It would help automate the process of building the README files. Right now we need to modify a data frame in there by hand for a language to appear in it.
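A rough sketch of what that automation might look like, in base R only. The `lang_files` vector is a stand-in for a directory listing such as `list.files()` on the languages folder, whose real path is not given in this thread, and the Markdown table layout is an assumption about what the README needs.

```r
# Stand-in for a directory listing of the language files (hypothetical values
# following the naming scheme proposed in this thread).
lang_files <- c("cs_Czech", "en_English", "fr-CA_French (Canada)")

# Split each name at the first underscore only, so names keep their spaces.
parts <- regmatches(lang_files, regexpr("_", lang_files), invert = TRUE)
languages <- data.frame(
  code = vapply(parts, `[`, character(1), 1),
  name = vapply(parts, `[`, character(1), 2),
  stringsAsFactors = FALSE
)

# Render a Markdown table that could be written into the README.
readme_rows <- c(
  "| Code | Language |",
  "|------|----------|",
  sprintf("| %s | %s |", languages$code, languages$name)
)
cat(readme_rows, sep = "\n")
```

With something like this, adding a language would only require dropping a correctly named file into the folder.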
@MarcinKosinski Agree? Or are there any other possibilities? There would be one more: creating a file with the code-language combinations. But storing a file just for that is a bit stupid if you ask me.