IDEMSInternational / idems_translation

Script to find-and-replace in Google Drive folder #25

Closed esmeetewinkel closed 2 years ago

esmeetewinkel commented 2 years ago

What? A script/workflow that takes a .po file (or converted .json equivalent), looks for every source string in all files in a Google Drive folder, and replaces every match with the translated string.

Why? There is a need to proofread the English strings in ParentApp. These all come from spreadsheets in this Drive folder; however, proofreading directly in the spreadsheets is not feasible for the editors.

How? I've set up a language pair English --> English, United Kingdom in Crowdin. The output will be a .po file whose source strings are the original strings and whose "translated" strings are the proofread strings. These should then be fed back into the Google Drive folder, which seems to be possible; see e.g. https://www.labnol.org/code/19926-universal-find-replace-in-google-drive.
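To make the proposed workflow concrete, here is a minimal sketch of the first half: reading source/"translated" pairs out of a .po file and using them as a find-and-replace table. It is not the project's script. It assumes a simplified PO subset (no plural forms, no `msgctxt`), and the names `parse_po`, `apply_replacements` and `SAMPLE_PO` are hypothetical. It also assumes a cell is replaced only when its whole content equals a source string, which is a more conservative choice than substring replacement.

```python
import ast
import re

# Matches a double-quoted PO string literal such as "Hello \"world\"".
_QUOTED = re.compile(r'"(?:[^"\\]|\\.)*"')


def _unquote(line: str) -> str:
    """Return the unescaped text of the first quoted string on a PO line."""
    match = _QUOTED.search(line)
    # Common PO escapes (\n, \", \\) coincide with Python's, so literal_eval
    # is good enough for this sketch.
    return ast.literal_eval(match.group(0)) if match else ""


def parse_po(text: str) -> dict[str, str]:
    """Map msgid -> msgstr, skipping the header, untranslated and unchanged entries."""
    entries: list[dict[str, str]] = []
    current = None  # entry being built
    field = None    # which key a continuation line should extend
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            current = {"msgid": _unquote(line), "msgstr": ""}
            entries.append(current)
            field = "msgid"
        elif line.startswith("msgstr ") and current is not None:
            current["msgstr"] = _unquote(line)
            field = "msgstr"
        elif line.startswith('"') and current is not None and field:
            current[field] += _unquote(line)  # multi-line string continuation
        else:
            field = None  # blank line, comment, or unsupported keyword
    return {
        e["msgid"]: e["msgstr"]
        for e in entries
        if e["msgid"] and e["msgstr"] and e["msgid"] != e["msgstr"]
    }


def apply_replacements(cell: str, pairs: dict[str, str]) -> str:
    """Replace a cell only when its entire content is a known source string."""
    return pairs.get(cell, cell)


# Hypothetical sample data, not real Crowdin output.
SAMPLE_PO = r'''
msgid ""
msgstr ""
"Language: en_GB\n"

msgid "Pick your favorite color"
msgstr "Pick your favourite colour"

msgid "Welcome"
msgstr "Welcome"

msgid "Not yet reviewed"
msgstr ""
'''

pairs = parse_po(SAMPLE_PO)
print(apply_replacements("Pick your favorite color", pairs))
```

Matching whole cells rather than substrings avoids accidentally rewriting variable names or fragments of longer strings, at the cost of missing strings that are embedded in larger cells.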

istride commented 2 years ago

I'm assuming that it's not feasible to proofread directly in the spreadsheets because the relevant message strings are spread throughout the files in the drive folder and it would be very inconvenient and time-consuming to search for them.

Is Crowdin a necessary part of the whole solution? Alternatively, for example, a script could be written to extract all the message strings (and where they came from) and collect them in a single spreadsheet for review. Such a script could be extended to do the reverse operation of putting proofread strings back where they belong (with the benefit of knowing where they came from).
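The extract-then-put-back idea can be sketched as follows. This is an illustration only: the spreadsheets are modelled as plain in-memory nested lists rather than real Drive or XLSX files, and the rule for deciding which cells are message strings (here, a column whose header is `text`) is a hypothetical stand-in, since the thread does not say how message strings are identified. The names `Located`, `extract_strings` and `put_back` are also hypothetical.

```python
from typing import Callable, NamedTuple

Sheet = list[list[str]]                  # rows of cells; row 0 is the header
Workbooks = dict[str, dict[str, Sheet]]  # file name -> sheet name -> sheet


class Located(NamedTuple):
    """A message string together with where it was found."""
    file: str
    sheet: str
    row: int
    col: int
    text: str


def extract_strings(
    workbooks: Workbooks, is_message: Callable[[str, str], bool]
) -> list[Located]:
    """Collect every cell that the caller-supplied predicate accepts."""
    found = []
    for file, sheets in workbooks.items():
        for sheet, rows in sheets.items():
            if not rows:
                continue
            header = rows[0]
            for r, row in enumerate(rows[1:], start=1):
                for c, cell in enumerate(row):
                    column = header[c] if c < len(header) else ""
                    if cell and is_message(column, cell):
                        found.append(Located(file, sheet, r, c, cell))
    return found


def put_back(workbooks: Workbooks, corrections: list[tuple[Located, str]]) -> int:
    """Write corrected strings to their recorded locations; return the number changed."""
    changed = 0
    for loc, corrected in corrections:
        rows = workbooks[loc.file][loc.sheet]
        # Skip blanks and no-ops, and skip cells edited since extraction.
        if corrected and corrected != loc.text and rows[loc.row][loc.col] == loc.text:
            rows[loc.row][loc.col] = corrected
            changed += 1
    return changed


# Hypothetical data standing in for one spreadsheet in the folder.
books: Workbooks = {
    "intro.xlsx": {
        "main": [
            ["name", "text"],
            ["greeting", "Pick your favorite color"],
            ["farewell", "Goodbye"],
        ]
    }
}
located = extract_strings(books, lambda column, cell: column == "text")
changed = put_back(books, [(located[0], "Pick your favourite colour"), (located[1], "")])
```

Recording the location at extraction time means the write-back step never has to search, and checking that the cell still holds the original text guards against overwriting edits made in the meantime.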

In any case, rather than using Google's API to manipulate those spreadsheets in Drive, it may be possible to download all the spreadsheets to a local file system, manipulate them as XLSX files (which they appear to be), then upload back to Drive. The advantages would be:

The disadvantages:

esmeetewinkel commented 2 years ago

Your assumption is correct. There are hundreds of spreadsheets, so it would be time-consuming for the proofreader to go through them. Also, the spreadsheets control the app functionality, so if the proofreading is done in these spreadsheets directly, the proofreader could potentially break the app (e.g. if they change variables).

Crowdin is not necessarily part of the solution. I just thought it would be convenient to use the existing script that takes the spreadsheets as an input and spits out the translatable strings, but I'm happy with any solution. Proofreading the strings in a single spreadsheet sounds good (I'm imagining two columns: original string and corrected string, possibly leaving the cell in the corrected string column blank if there are no changes to the original).
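The two-column review sheet described above could look like this in CSV form. This is a sketch under assumptions: the column headers `original` and `corrected`, the CSV format, and the function names `write_review_sheet` and `read_corrections` are all hypothetical. A blank cell in the corrected column is treated as "no change".

```python
import csv
import io
from typing import Iterable, TextIO


def write_review_sheet(strings: Iterable[str], out: TextIO) -> None:
    """Write one row per distinct string, with an empty 'corrected' column."""
    writer = csv.writer(out)
    writer.writerow(["original", "corrected"])
    for text in dict.fromkeys(strings):  # de-duplicate while keeping order
        writer.writerow([text, ""])


def read_corrections(src: TextIO) -> dict[str, str]:
    """Return original -> corrected, ignoring blank or unchanged corrections."""
    corrections = {}
    for row in csv.DictReader(src):
        original = row["original"]
        corrected = row.get("corrected") or ""
        if corrected.strip() and corrected != original:
            corrections[original] = corrected
    return corrections


# Produce an empty review sheet for the proofreader.
blank_sheet = io.StringIO(newline="")
write_review_sheet(["Pick your favorite color", "Welcome", "Welcome"], blank_sheet)

# Hypothetical sheet after proofreading: only the first row was changed.
reviewed = io.StringIO(
    "original,corrected\r\n"
    "Pick your favorite color,Pick your favourite colour\r\n"
    "Welcome,\r\n"
)
corrections = read_corrections(reviewed)
```

De-duplicating the strings means the proofreader sees each one once, even if it appears in many spreadsheets.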

I would like to avoid losing the edit history of the spreadsheets, if possible. (I use this quite regularly to look at previous versions. Although we do technically store older versions of the .json files extracted from the spreadsheets on GitHub, we have no pathway to get these back into sheets, so it's not good for reverting.) I don't really care about the processing time; I'm not expecting to need to do this often.

esmeetewinkel commented 2 years ago

A downside of using sheets is that it doesn't provide a nice way to check spelling / grammar (or at least not that I'm aware of).

istride commented 2 years ago

> the existing script that takes the spreadsheets as an input and spits out the translatable strings

Where is this script?

esmeetewinkel commented 2 years ago

> Where is this script?

https://github.com/IDEMSInternational/idems_translation/blob/master/app/scripts/extract_texts_script.py

To correct myself, it doesn't take spreadsheets as an input directly, but rather JSON files generated from the spreadsheets.

istride commented 2 years ago

Oh... I thought there might have been a secret script in Crowdin.

How are the spreadsheets converted to JSON?

I'm just trying to see if there is a way to simplify this process, but, in the short term, I can concentrate on updating the original spreadsheets from the PO files that Crowdin produces. Could you provide a sample PO file from Crowdin?

esmeetewinkel commented 2 years ago

Just discussed this with David briefly, who decided that the better way to go about this is to prioritise our development work on in-app content review mechanisms. Thanks though @istride for helping to think this through. Closing the issue.