scribe-org / Scribe-Data

Wikidata and Wikipedia language data extraction
GNU General Public License v3.0

Generate all translations for the currently supported languages [was Colab testing] #70

Open · andrewtavis opened this issue 6 months ago

andrewtavis commented 6 months ago

Issue

As part of the process of working towards multilingual translation, we need to test running the translation processes in a Google Colab notebook.

byt3h3ad commented 6 months ago

hey! I would like to help with this issue.

andrewtavis commented 6 months ago

Hey @byt3h3ad 👋 Thanks so much for your offer to help! I'll assign you, and once we have one of the new ones finished we can get to this issue. You'd also be welcome to work on one of the translation issues as well! 😊

andrewtavis commented 5 months ago

Hey @byt3h3ad! We finally have some of the new translation processes up and running. If you wanted to give it a shot using the scribe_data/extract_transform/languages/English/translations/translate_words.py file and document how to get it up and running, then that'd be great!
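A rough sketch of what getting this running in a fresh environment (such as a Colab notebook) might look like, assuming a standard pip-installable setup; the requirements file name is an assumption on my part, and the script path is the one mentioned above, so adjust to the actual repo layout:

```python
# Minimal sketch of setting up and running translate_words.py in a fresh
# environment such as a Colab notebook. The repository URL matches the
# project above; the requirements.txt file name and the exact script path
# are assumptions and may need adjusting.
import subprocess

# Clone the repository and install its dependencies.
subprocess.run(
    ["git", "clone", "https://github.com/scribe-org/Scribe-Data.git"], check=True
)
subprocess.run(
    ["pip", "install", "-r", "Scribe-Data/requirements.txt"], check=True
)

# Run the English word-translation script referenced above.
subprocess.run(
    [
        "python",
        "Scribe-Data/scribe_data/extract_transform/languages/English/translations/translate_words.py",
    ],
    check=True,
)
```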

axif0 commented 1 month ago

Hello @andrewtavis, I ran the repo in Google Colab. As expected, it shows the same error described in #96.

When I update the translation_utils file and add translations = [], as in #96, it works in Google Colab. Could you please check it? A minimal sketch of the kind of change this describes is below.
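The sketch initializes the results list before the loop that appends to it; the function name, arguments, and surrounding code are placeholders rather than the actual translation_utils contents:

```python
# Minimal sketch of the fix described above: make sure `translations`
# exists as an empty list before anything appends to it (see #96).
# The function name and arguments are placeholders, not Scribe-Data's API.
def translate_terms(terms, translate_fn):
    translations = []  # the fix: initialize the results list up front
    for term in terms:
        translations.append(translate_fn(term))
    return translations
```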

[screenshot attached]

andrewtavis commented 1 month ago

Nice, @axif0! Give me a moment to do the check here, but this is great!

andrewtavis commented 1 month ago

Assigning you as well to show credit for the work here :)

andrewtavis commented 3 weeks ago

Switching the context of this issue from checking out Google Colab to generating the translations, as @axif0 found that the processes we've written here can't be finished even using Colab GPUs. I'm going to try to run these things locally over a few nights, and then we can call this issue good, as the plan is not to keep this process running on machine translations in the long term. Scribe-Data will eventually run on Wiktionary-based data, so let's close this with the current rendition and then start shifting towards the new methods :)

andrewtavis commented 2 days ago

@axif0, you were the one who'd said that the translation process didn't finish on Colab, right? Did you use GPUs for it, or just CPUs? To my memory they don't have GPUs available by default.