Closed andrewtavis closed 8 months ago
@andrewtavis, can you assign me here?
Hey @Jk40git 👋 Before assigning, could you let me know if you've had some experience with some of the technologies beforehand? We're talking about doing a bit of machine translation here. We can work up to that, but if you've never done anything like it before, then maybe we'd need to switch over to a different issue for now to build your skills a bit. We can save this until one of the other translation issues is done - maybe by me - and then you can follow that one as an example.
Okay, sounds great. I don't have any experience in machine translation though, so I'll have to switch to a different issue. 👍 Or, if possible, can you assign me one that will help build my skills?
You can maybe work on this once another one like it is done and you can follow it as an example, @Jk40git :)
I'd suggest:
Okay, will go for #67 first.
Feel free to write in there so we can assign :)
Hey @Jk40git 👋 The process has been set up and we're ready to implement here :) It's actually quite streamlined now. If you make a version of scribe_data/extract_transform/languages/English/translations/translate_words.py that replaces SRC_LANG with French we should be good to go here 😊
Thanks for this, @Jk40git! Closed via #108 with minor edits in 3140c02 :)
Terms
Description
The goal of this issue is to create a process whereby a single file is used to translate all words within French/translations/words_to_translate.json to all other Scribe languages. To achieve this we'll be using m2m100_418M, with the output being a JSON file that maps each source string to its translations, keyed by language. This can then be transferred to an SQLite database table where each source string indexes a row and each language has its own column.
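To make the flow concrete, here is a minimal sketch, not the actual Scribe-Data implementation. The names `m2m100_translator`, `build_translations`, `to_json` and `write_sqlite`, the table name `translations`, and the language codes are all hypothetical. The `translate` callable is a stand-in, so any translator can be plugged in; `m2m100_translator` shows one way it might wrap facebook/m2m100_418M through Hugging Face `transformers`, assuming that package, `sentencepiece` and `torch` are installed.

```python
import json
import sqlite3
from typing import Callable, Dict, List

# Hypothetical signature: (word, src_lang, tgt_lang) -> translated word.
Translator = Callable[[str, str, str], str]


def m2m100_translator() -> Translator:
    """Build a translator backed by facebook/m2m100_418M (downloads the model)."""
    # Imported lazily so the rest of the sketch runs without transformers.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    name = "facebook/m2m100_418M"
    model = M2M100ForConditionalGeneration.from_pretrained(name)
    tokenizer = M2M100Tokenizer.from_pretrained(name)

    def translate(word: str, src: str, tgt: str) -> str:
        tokenizer.src_lang = src
        encoded = tokenizer(word, return_tensors="pt")
        generated = model.generate(
            **encoded, forced_bos_token_id=tokenizer.get_lang_id(tgt)
        )
        return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

    return translate


def build_translations(
    words: List[str], src: str, targets: List[str], translate: Translator
) -> Dict[str, Dict[str, str]]:
    """Map each source word to {target language code: translation}."""
    return {w: {t: translate(w, src, t) for t in targets} for w in words}


def to_json(translations: Dict[str, Dict[str, str]]) -> str:
    """Serialize the mapping, keeping non-ASCII characters readable."""
    return json.dumps(translations, ensure_ascii=False, indent=2)


def write_sqlite(
    conn: sqlite3.Connection,
    translations: Dict[str, Dict[str, str]],
    targets: List[str],
) -> None:
    """Store one row per source word and one column per target language."""
    columns = ", ".join(f'"{t}" TEXT' for t in targets)
    placeholders = ", ".join("?" for _ in range(len(targets) + 1))
    with conn:  # commits on success, rolls back on error
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS translations "
            f"(word TEXT PRIMARY KEY, {columns})"
        )
        conn.executemany(
            f"INSERT OR REPLACE INTO translations VALUES ({placeholders})",
            [(w, *(row[t] for t in targets)) for w, row in translations.items()],
        )
```

With the real model this might be called as `build_translations(words, "fr", ["de", "es"], m2m100_translator())`, where `words` is loaded from words_to_translate.json.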
Of specific importance is trying to get a metric of the accuracy of the translation and doing a cutoff such that we're no longer including low quality translations in Scribe applications :)
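The issue doesn't say which accuracy metric to use, so the following is only one possible heuristic: translate each word to the target language and back, then keep it only if the back-translation is close to the original. The function name `filter_translations`, the `translate` callable, and the `cutoff` value of 0.8 are all assumptions for illustration. Round-trip similarity is a rough proxy for quality, not a true accuracy measure.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List

# Hypothetical signature: (word, src_lang, tgt_lang) -> translated word.
Translator = Callable[[str, str, str], str]


def filter_translations(
    words: List[str],
    src: str,
    tgt: str,
    translate: Translator,
    cutoff: float = 0.8,  # arbitrary threshold, to be tuned on real data
) -> Dict[str, str]:
    """Keep translations whose round trip back to `src` resembles the input."""
    kept: Dict[str, str] = {}
    for word in words:
        forward = translate(word, src, tgt)
        back = translate(forward, tgt, src)
        # Character-level similarity in [0, 1] between the word and its round trip.
        score = SequenceMatcher(None, word.lower(), back.lower()).ratio()
        if score >= cutoff:
            kept[word] = forward
    return kept
```

A stricter or looser cutoff trades coverage against quality, so it would need checking against a sample of manually verified translations.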
Contribution
Happy to work on this or support someone with interest in working on it!