Closed · Linfye closed this 8 months ago
The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)
If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. It'd be great to have you!
[x] The commit messages for the remote branch should be checked to make sure the contributor's email is set up correctly so that they receive credit for their contribution, which they can verify by running
git config user.email
in their local Scribe-Data repo
[ ] The CHANGELOG has been updated with a description of the changes for the upcoming release and the corresponding issue (if necessary)
Thanks for this, @Linfye! I'll get back to you with a review soon! Once this is merged you'd be welcome to work on the other languages :)
I've fixed the problems you mentioned except the last one. @wkyoshida should I work on it now, or will someone else take it over? Looking forward to your reply. cc @andrewtavis
Can you refer to #88 and #89? I have implemented a different approach, i.e. batch processing of words for translation, which is relatively faster.
We can decide on a single approach; if the requirement is to iterate over each word rather than batch processing, then we can go ahead with this PR. cc @andrewtavis @wkyoshida
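For reference, a minimal sketch of the batch-processing idea being discussed (the translate_batch helper and the batch size are hypothetical illustrations, not the actual code in #88/#89):

```python
from typing import Callable


def translate_in_batches(
    words: list[str],
    translate_batch: Callable[[list[str]], list[str]],
    batch_size: int = 100,
) -> dict[str, str]:
    """Translate words in fixed-size batches instead of one request per word."""
    translations: dict[str, str] = {}
    for start in range(0, len(words), batch_size):
        batch = words[start : start + batch_size]
        results = translate_batch(batch)  # one call covers the whole batch
        translations.update(dict(zip(batch, results)))
    return translations
```

The speedup comes from amortizing the per-request overhead over many words, which is why fewer, larger requests tend to be faster than one request per word.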
I checked the code and wonder whether it can continue downloading from the last saved progress, since there are so many words. If it works better, we can adopt yours.
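To make the resume idea concrete, here is a rough sketch of continuing from the last saved progress (the checkpoint file name, JSON layout, and translate callable are assumptions for illustration, not the PR's actual implementation):

```python
import json
from pathlib import Path

# Assumed checkpoint file; any persistent location would work.
CHECKPOINT = Path("translated_words.json")


def load_progress() -> dict[str, str]:
    """Return previously translated words, or an empty dict on a fresh run."""
    if CHECKPOINT.exists():
        return json.loads(CHECKPOINT.read_text(encoding="utf-8"))
    return {}


def save_progress(translations: dict[str, str]) -> None:
    CHECKPOINT.write_text(
        json.dumps(translations, ensure_ascii=False), encoding="utf-8"
    )


def translate_remaining(words: list[str], translate) -> dict[str, str]:
    """Skip words that were already translated in an earlier run."""
    translations = load_progress()
    for word in words:
        if word in translations:
            continue
        translations[word] = translate(word)
        save_progress(translations)  # persist after each word so a crash loses little
    return translations
```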
Sorry for the delay on all of this, all :) I was on vacation and then sick right after... Checked and sent along some formatting in 2460584. I'll bring this in shortly, as well as the work that @shashank-iitbhu mentioned. I'll give it all a test to see how things are working. I'd say batch processing and having the process in the utils makes sense to me 😊
Ah, and a quick note on this: let's be sure to remove as much whitespace from JSON outputs as possible in the future as that does bring the file size down slightly 😊
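As an illustration of the whitespace point, Python's json module can write a compact file by dropping the indent and tightening the separators (a general technique, not a quote of Scribe-Data's export code; the data and file names are made up):

```python
import json

translated_words = {"hello": "hallo", "thank you": "danke"}  # example data

# Pretty-printed output: readable, but larger on disk.
with open("translated_words_pretty.json", "w", encoding="utf-8") as f:
    json.dump(translated_words, f, ensure_ascii=False, indent=2)

# Compact output: no indentation and no spaces after separators.
with open("translated_words.json", "w", encoding="utf-8") as f:
    json.dump(translated_words, f, ensure_ascii=False, separators=(",", ":"))
```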
Contributor checklist
Description
The script can run on Google Colab, which is where I wrote it. Because running the full script takes too much time, translated_words contains only a subset of all the words, but it demonstrates the feasibility of the approach.
Looking forward to your code review
Related issue
#72