-
### Description
This task involves preparing data for converting colloquial transcripts into grammatically accurate Tibetan transcriptions. The current data is in CSV format. I need to generate JSON f…
-
There should be a small test at init time to see if the ICU collator works for Tibetan, and then use that instead of the JS code if available. This code seems to indicate that it works in modern brows…
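A minimal sketch of such an init-time check, assuming the standard `Intl.Collator` API and the locale tag `bo` for Tibetan. The names `hasNativeTibetanCollation`, `fallbackCompare`, and `compareTibetan` are hypothetical, and `fallbackCompare` is only a placeholder for the existing JS comparison. Note that `supportedLocalesOf` only reports whether the runtime ships Tibetan locale data; it does not verify that the resulting order is correct.

```javascript
// Hypothetical helper: true when the runtime reports collation data for Tibetan ("bo").
function hasNativeTibetanCollation() {
  try {
    return (
      typeof Intl !== "undefined" &&
      typeof Intl.Collator === "function" &&
      Intl.Collator.supportedLocalesOf(["bo"]).length > 0
    );
  } catch (err) {
    // Treat any failure (e.g. a runtime without Intl) as "not available".
    return false;
  }
}

// Placeholder for the existing JS comparison (assumption: plain code-unit order).
function fallbackCompare(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Decide once at init time which comparator to use.
const compareTibetan = hasNativeTibetanCollation()
  ? new Intl.Collator("bo").compare
  : fallbackCompare;
```

The chosen comparator can then be passed straight to `Array.prototype.sort`, e.g. `words.sort(compareTibetan)`.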
-
Description:
- To create a usable Tibetan font, we require a set of glyphs. Until now, we have been extracting the glyphs from Pecha images using Google OCR or data in OPF.
- These methods could not e…
-
Nice work, but I found a problem that really confuses me.
As shown in the code `omniglot_train_few_shot.py`, both in the training and testing phase, the support set (i.e. sample_images) and evaluati…
-
We should document how vertical text works in Tibetan, e.g., in the spine of a book or embedded in vertical Chinese/Japanese/Korean/Mongolian text.
-
I changed the language names to glottocodes. I could not find glottocodes for some languages. Overall, this means skipping 23 languages. We end up with 253 languages in total.
Kui(Huffman1979), Kui…
-
The Tibetan character code range is 0x0F00 - 0x0FFF. Even though some characters are bad, why did you limit
`('\u0F40'
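To make the range in question concrete, here is a small sketch that tests membership in the whole Tibetan Unicode block (U+0F00 to U+0FFF) rather than a sub-range starting at U+0F40 (TIBETAN LETTER KA). The constant and function names are hypothetical and are not taken from the code under discussion.

```javascript
// Bounds of the Tibetan Unicode block.
const TIBETAN_BLOCK_START = 0x0f00;
const TIBETAN_BLOCK_END = 0x0fff;

// Hypothetical helper: true when the first code point of `ch` lies in the Tibetan block.
function isInTibetanBlock(ch) {
  const cp = ch.codePointAt(0);
  return cp !== undefined && cp >= TIBETAN_BLOCK_START && cp <= TIBETAN_BLOCK_END;
}
```

A check limited to U+0F40 and above would reject the marks and punctuation at the start of the block, such as the tsheg (U+0F0B).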
-
This is a collection of requests for newly supported languages. To keep the issue tracker maintainable, new requests should be added here.
See at least:
#1493 ancient church latin / italianate …
-
It would be useful to have an option (and a separate function?) to add a final shad to a converted string, taking into account the usual caveats. Tibetans apparently find it really ugly when a name, …
-
### Description
Create a pipeline to download STT data from the release page of the Tibetan news audio GitHub repo and convert the audio into the format required for training data. Then split…