scribe-org / Scribe-Data

Wikidata, Wiktionary and Wikipedia language data extraction
GNU General Public License v3.0
30 stars 69 forks

fixes #72: A script translates English to other languages #81

Closed Linfye closed 8 months ago

Linfye commented 8 months ago

Contributor checklist


Description

The script runs on Google Colab, which is where I wrote it. Because running the full script would take too much time, translated_words contains only a subset of all the words, but it demonstrates the feasibility of the program.
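A minimal sketch of the per-word approach this PR takes, where `translate_word` is a hypothetical stand-in for whatever translation backend the script actually calls (the function name, word list, and language code below are all illustrative, not code from the PR):

```python
import json


def translate_word(word, target_lang):
    # Hypothetical stand-in for the real translation backend used in the PR.
    return f"{word}-{target_lang}"


def translate_words(words, target_lang):
    # Iterate over each word individually, one backend call per word.
    # This is slow for large word lists, which is why the PR's
    # translated_words covers only a subset of all words.
    return {word: translate_word(word, target_lang) for word in words}


translated_words = translate_words(["hello", "world"], "de")
print(json.dumps(translated_words))
```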

Looking forward to your code review

Related issue

github-actions[bot] commented 8 months ago

Thank you for the pull request!

The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)

If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. It'd be great to have you!

Maintainer checklist

andrewtavis commented 8 months ago

Thanks for this, @Linfye! I'll get back to you with a review soon! Once this is merged you'd be welcome to work on the other languages :)

Linfye commented 8 months ago

I fixed the problems you mentioned except the last one. @wkyoshida should I work on it now, or will someone else? Looking forward to your reply. cc @andrewtavis

shashank-iitbhu commented 8 months ago

> I fixed the problems you mentioned except the last one. @wkyoshida should I work on it now, or will someone else? Looking forward to your reply. cc @andrewtavis

Can you refer to #88 and #89? I have implemented a different approach, i.e. batch processing of words for translation. This way it is relatively faster.

We can decide on a single approach; if the requirement is to iterate over each word rather than batch processing, then we can go ahead with this PR. cc @andrewtavis @wkyoshida
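A rough sketch of the batch idea, assuming the backend can accept a list of words per call (the `translate_batch` stub and the batch size are illustrative assumptions, not the actual #88/#89 implementation):

```python
def translate_batch(words, target_lang):
    # Stub for a backend call that translates many words in one request.
    return [f"{w}-{target_lang}" for w in words]


def translate_in_batches(words, target_lang, batch_size=100):
    # One request per batch instead of one per word cuts the number of
    # round trips roughly by a factor of batch_size.
    results = {}
    for i in range(0, len(words), batch_size):
        batch = words[i:i + batch_size]
        results.update(zip(batch, translate_batch(batch, target_lang)))
    return results
```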

Linfye commented 8 months ago

> > I fixed the problems you mentioned except the last one. @wkyoshida should I work on it now, or will someone else? Looking forward to your reply. cc @andrewtavis
>
> Can you refer to #88 and #89? I have implemented a different approach, i.e. batch processing of words for translation. This way it is relatively faster.
>
> We can decide on a single approach; if the requirement is to iterate over each word rather than batch processing, then we can go ahead with this PR. cc @andrewtavis @wkyoshida

I checked the code and wonder if it can continue downloading from the last progress, since there are so many words. If it works better, we can adopt yours.

andrewtavis commented 8 months ago

Sorry for the delay on all of this, all :) I was on vacation and then sick right after... Checked and sent along some formatting in 2460584. I'll bring this in shortly, as well as the work that @shashank-iitbhu mentioned. I'll give it all a test to see how things are working. I'd say batch processing and having the process in the utils makes sense to me 😊

andrewtavis commented 8 months ago

Ah, and a quick note on this: let's be sure to remove as much whitespace from JSON outputs as possible in the future, as that does bring the file size down slightly 😊
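In Python's standard `json` module this is done by passing compact `separators`, which drops the default space after each `:` and `,`:

```python
import json

data = {"hello": "hallo", "world": "Welt"}

# Default separators insert a space after ':' and ',':
print(json.dumps(data))  # {"hello": "hallo", "world": "Welt"}

# Compact separators remove that whitespace and shrink the output:
print(json.dumps(data, separators=(",", ":")))  # {"hello":"hallo","world":"Welt"}
```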