-
While using the epitran library to generate IPA for Hindi, I found that it doesn't give the expected output.
For example: the input "ऑस्ट्रेलिया" to the transliterate function gives t…
-
I'm proposing a language identification helper module that can:
1. Be used to build language ID models using any of the available rule-based or learning algorithms.
2. Be used to id…
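
As a rough illustration of the rule-based end of this proposal, here is a minimal sketch that guesses a language from the dominant Unicode script of the input. The function name `identify_language`, the `SCRIPT_TO_LANG` table, and the script-to-language mapping are all hypothetical and not part of any existing module. The mapping is a deliberate oversimplification, since one script can serve many languages (Devanagari is also used for Marathi and Nepali, for instance).

```python
import unicodedata
from collections import Counter

# Hypothetical table: first word of a character's Unicode name -> language code.
# Assumption for illustration only; a script does not uniquely identify a language.
SCRIPT_TO_LANG = {
    "DEVANAGARI": "hi",
    "ARABIC": "ar",
    "LATIN": "en",
}


def identify_language(text, default="und"):
    """Guess a language code from the most frequent known script in `text`."""
    counts = Counter()
    for ch in text:
        # Skip digits, punctuation, whitespace, and combining marks.
        if not ch.isalpha():
            continue
        # unicodedata.name returns e.g. "DEVANAGARI LETTER KA"; "" if unnamed.
        script = unicodedata.name(ch, "").split(" ", 1)[0]
        lang = SCRIPT_TO_LANG.get(script)
        if lang is not None:
            counts[lang] += 1
    if not counts:
        return default  # "und" = undetermined
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    for sample in ["ऑस्ट्रेलिया", "hello world", "12345"]:
        print(repr(sample), "->", identify_language(sample))
```

A learned model could sit behind the same function signature, so callers would not need to change if the rule table were replaced.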
-
-
The README lists 109 **available locales**.
Counting the incomplete locales with `thor locales:incomplete`
(using #1010) returns 116 ***incomplete locales***.
See also #577
-
### Description:
It has been noted in the codebase that translations are missing for the following languages. I have grouped them by type:
1. **Arabic Languages**:
- [ ] Ara…
-
We have a set of grammatical categories/features in CLDR that are also used in ICU. It would be very useful to flesh out these categories so that we have a consistent set of identifiers for grammatic…
-
The goal here is to create a Word2vec (CBOW and SkipGram) Colab tutorial to learn word representations for African languages. We would start with English and then migrate to other languages like Yorub…
-
```
The Latin extended character subset in subset.py only includes characters from
Extended A, B, C, D, and Additional (without Vietnamese characters).
But some of those characters are only one of b…