Closed andrewtavis closed 6 months ago
Hello @andrewtavis, I am interested in working on this.
Sounds good, @ikeadeoyin! Let us merge in another process that's for English and then you can use that as a basis. Should be merged by Wednesday 😊
Alright, that is okay.
Hey @ikeadeoyin 👋 The process has been set up and we're ready to implement here :) It's actually quite streamlined now. If you make a version of scribe_data/extract_transform/languages/English/translations/translate_words.py that replaces SRC_LANG with Italian we should be good to go here 😊
Hey @ikeadeoyin 👋 I went ahead and sent along the change in 2b72e64 as I had a few other things that I needed to get done, and this needed to get finished up :) Hope all's well!
Description
The goal of this issue is to create a process whereby a single file is used to translate all words within Italian/translations/words_to_translate.json to all other Scribe languages. To achieve this we'll be using m2m100_418M, with the output being a JSON file that has a string and keyed values for each language. This can then be transferred to an SQLite database table with each string in an index corresponding to a column value for each language.
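As a rough illustration of the last step, here is a minimal sketch of moving the translated JSON into an SQLite table with one row per source word and one column per language. The sample data, the language codes, the table name `translations` and the helper `json_to_sqlite` are all hypothetical and only there for the example. The actual output format of the translation process may differ.

```python
import sqlite3

# Hypothetical sample of the translation output: each Italian source word
# maps to translations keyed by (assumed) language codes.
translations = {
    "libro": {"en": "book", "de": "Buch", "fr": "livre"},
    "casa": {"en": "house", "de": "Haus", "fr": "maison"},
}


def json_to_sqlite(data, conn, table="translations"):
    """Write {word: {lang: translation}} into a table with one column per language."""
    # Collect every language that appears so each one gets its own column.
    languages = sorted({lang for values in data.values() for lang in values})
    columns = ", ".join(f'"{lang}" TEXT' for lang in languages)
    conn.execute(f'CREATE TABLE "{table}" (word TEXT PRIMARY KEY, {columns})')

    placeholders = ", ".join("?" for _ in range(len(languages) + 1))
    rows = [
        # Missing translations are stored as NULL.
        (word, *(values.get(lang) for lang in languages))
        for word, values in data.items()
    ]
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return languages


conn = sqlite3.connect(":memory:")  # in-memory database, just for the sketch
languages = json_to_sqlite(translations, conn)
row = conn.execute(
    'SELECT "en", "de" FROM translations WHERE word = ?', ("libro",)
).fetchone()
print(row)
```

Keying the table on the source word keeps lookups simple: a Scribe app would only need the word and the target language column to fetch a translation.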
Of specific importance is trying to get a metric of the accuracy of the translation and doing a cutoff such that we're no longer including low quality translations in Scribe applications :)
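The cutoff idea could look something like the sketch below. It assumes each translation comes with a quality score between 0 and 1, which is purely an assumption here: the issue doesn't say which metric would be used or how it would be computed. The function name `filter_translations`, the sample scores and the default cutoff of 0.5 are all hypothetical.

```python
def filter_translations(scored, cutoff=0.5):
    """Keep only translations whose (assumed) quality score meets the cutoff.

    `scored` maps word -> {lang: (translation, score)}.
    """
    kept = {}
    for word, by_lang in scored.items():
        good = {
            lang: text
            for lang, (text, score) in by_lang.items()
            if score >= cutoff
        }
        if good:  # drop words with no translation above the cutoff
            kept[word] = good
    return kept


# Hypothetical scored output; the scores are made up for illustration.
scored = {
    "libro": {"en": ("book", 0.93), "de": ("Buch", 0.88)},
    "casa": {"en": ("house", 0.91), "de": ("Hause", 0.31)},
}

filtered = filter_translations(scored, cutoff=0.5)
print(filtered)
```

Filtering before the JSON is written means low quality entries never reach the SQLite table, so the apps don't need to know about the scores at all.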
Contribution
Happy to work on this or support someone with interest in working on it!