IDEMSInternational / parenttext-mexico


\r showing up in strings for translation #10

Closed fagiothree closed 8 months ago

fagiothree commented 9 months ago

In the .pot file, some English source strings have \r in front. This might be due to the use of the sidekick (the content was reviewed as a Google Doc and then synced to a Google Sheet). Is the problem at the toolkit, sidekick, or translation level?

fagiothree commented 8 months ago

Maybe it's a display issue. Here is how the same file looks in Notepad++ with UTF-8 encoding [image] and in GitHub Desktop [image].

istride commented 8 months ago

I think this is a problem with the toolkit, specifically when it reads spreadsheets directly from Google Sheets. Google Sheets preserves line endings, whatever they are (\r\n or just \n), and the toolkit does not convert them. What goes into the spreadsheet is what comes out.
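To illustrate the kind of conversion that is missing, here is a minimal sketch of normalizing line endings in cell values after they are fetched. The function names `normalize_newlines` and `normalize_rows` are hypothetical and are not taken from the toolkit; the rows are assumed to arrive as lists of cell values.

```python
def normalize_newlines(text: str) -> str:
    """Convert \\r\\n and bare \\r line endings to \\n."""
    # Replace \r\n first so that it does not turn into two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def normalize_rows(rows):
    """Apply normalize_newlines to every string cell; leave other values alone."""
    return [
        [normalize_newlines(cell) if isinstance(cell, str) else cell for cell in row]
        for row in rows
    ]


# Example: a cell that was written with Windows line endings.
rows = [["greeting", "Hello\r\nworld"], ["count", 3]]
cleaned = normalize_rows(rows)
```

Doing this at the point where the sheet is read would mean that every later step (including the .pot export) only ever sees \n.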

When reading from a file in its default text mode, Python converts the line endings to just \n. Therefore, a workaround would be to use the archive feature of the pipeline to save the Google Sheets as CSV, then run the pipeline on the archived CSVs.
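The difference can be seen with a small experiment. This is only a sketch using a temporary file with made-up content, not the pipeline's own code. It also shows that the conversion depends on how the file is opened: passing `newline=""` (as the `csv` module documentation recommends) turns the translation off, so whether the workaround helps depends on how the toolkit opens its CSV files, which is not confirmed here.

```python
import os
import tempfile

# Write raw bytes so the \r\n endings are stored exactly as given.
raw = b"first line\r\nsecond line\r\n"
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # Default text mode: universal newlines, \r\n is translated to \n.
    with open(path, encoding="utf-8") as f:
        translated = f.read()

    # newline="" disables translation, so \r\n is preserved.
    with open(path, encoding="utf-8", newline="") as f:
        untranslated = f.read()
finally:
    os.remove(path)
```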

I have created an issue in the toolkit because I think that is where the problem should be solved.

fagiothree commented 8 months ago

There are still some occurrences that are not covered, but there is no easy way to extend the code at the moment, so the best strategy is to remove all the remaining \r manually.
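To make the manual cleanup easier, a short script can list the lines that still contain an escaped \r. This is a rough sketch: `find_carriage_returns` is a hypothetical helper, the sample text is made up, and the check is a plain substring search, so it would also flag an escaped backslash followed by the letter r.

```python
def find_carriage_returns(pot_text: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs for lines containing a literal backslash-r."""
    hits = []
    # Split on \n only, so the line numbers match what an editor shows.
    for number, line in enumerate(pot_text.split("\n"), start=1):
        if "\\r" in line:
            hits.append((number, line))
    return hits


# Made-up sample in .pot style; the first msgid starts with an escaped \r.
sample = 'msgid "\\rHello"\nmsgstr ""\n\nmsgid "Bye"\nmsgstr ""\n'
remaining = find_carriage_returns(sample)
```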

istride commented 8 months ago

The fix in the toolkit has been merged into the main branch.