WeblateOrg / weblate

Web based localization tool with tight version control integration.
https://weblate.org/
GNU General Public License v3.0

Correct source translations are required for CSV imports even in monolingual setup #8463

Open marvinruder opened 1 year ago

marvinruder commented 1 year ago

Describe the issue

GIVEN

WHEN

THEN

I already tried

Steps to reproduce the behavior

  1. Set up a Weblate project and component as described above
  2. Import the following CSV file for en_US:
"location","source","target","id","fuzzy","context","translator_comments","developer_comments"
"","Old Translation","Old Translation","","","key","",""
  3. Download a CSV file for fr, which should look like this:
"location","source","target","id","fuzzy","context","translator_comments","developer_comments"
"","Old Translation","","","False","key","",""
  4. Update the en_US string to something else, e.g. "New Translation", using another upload or the GUI
  5. Add a translation to the CSV file downloaded in step 3 and upload it:
"location","source","target","id","fuzzy","context","translator_comments","developer_comments"
"","Old Translation","Traduction en français","","False","key","",""
  6. Observe that the upload of the new fr string fails:
Processed 1 string from the uploaded files (skipped: 0, not found: 1, updated: 0).
  7. Remove the source translation from the CSV file in step 5 and upload it again:
"location","source","target","id","fuzzy","context","translator_comments","developer_comments"
"","","Traduction en français","","False","key","",""
  8. Observe that once again the upload of the new fr string fails:
Processed 1 string from the uploaded files (skipped: 0, not found: 1, updated: 0).
  9. Remove the entire source column from the CSV file in step 7 and upload it again:
"location","target","id","fuzzy","context","translator_comments","developer_comments"
"","Traduction en français","","False","key","",""
  10. Observe that yet again the upload of the new fr string fails:
Processed 1 string from the uploaded files (skipped: 0, not found: 1, updated: 0).

Expected behavior

Since we are working with a monolingual base language file, whose functionality is described in the documentation using the words

monolingual formats identify the string by ID, and each language file contains only the mapping of those to any given language

I expect the import to succeed in both steps 6 and 8, since the correct key was specified in both files and the content of the source column should not matter (the source column should not even be required).

For a monolingual setup as described here, I expect the key, rather than the string's translation in the base language, to be used for identifying a string. That’s the whole point of using a monolingual format.
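
The difference between the two lookup strategies can be sketched as follows. This is only an illustration of the expectation above, not Weblate's actual import code: `CURRENT_UNITS`, `match_by_source_and_context` and `match_by_context` are hypothetical names, and the dictionary stands in for the component state after the en_US string has been changed to "New Translation".

```python
import csv
import io

# Hypothetical view of the component after the en_US update:
# key (CSV "context" column) -> current en_US string.
CURRENT_UNITS = {"key": "New Translation"}

# The fr upload that still carries the outdated en_US text as source.
UPLOAD = (
    '"location","source","target","id","fuzzy","context",'
    '"translator_comments","developer_comments"\n'
    '"","Old Translation","Traduction en français","","False","key","",""\n'
)


def match_by_source_and_context(row, units):
    """Bilingual-style lookup: key and source text must both match."""
    return units.get(row["context"]) == row["source"]


def match_by_context(row, units):
    """Monolingual-style lookup: the key alone identifies the string."""
    return row["context"] in units


rows = list(csv.DictReader(io.StringIO(UPLOAD)))
```

With the outdated source in the upload, only the key-based lookup finds the string, which is the behavior I would expect from a monolingual component.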

Screenshots

No response

Exception traceback

No response

How do you run Weblate?

Docker container

Weblate versions

Weblate deploy checks

(MWE setup to reproduce the issue, so configuration is not used in production. Results shown below should not be relevant to the issue.)

SystemCheckError: System check identified some issues:

CRITICALS:
?: (weblate.E003) Cannot send e-mail ([Errno 111] Connection refused), please check EMAIL_* settings.
        HINT: https://docs.weblate.org/en/weblate-4.14.2/admin/install.html#out-mail
?: (weblate.E012) The server e-mail address should be changed from its default value
        HINT: https://docs.weblate.org/en/weblate-4.14.2/admin/install.html#production-email
?: (weblate.E013) The "From" e-mail address should be changed from its default value
        HINT: https://docs.weblate.org/en/weblate-4.14.2/admin/install.html#production-email

WARNINGS:
?: (security.W004) You have not set a value for the SECURE_HSTS_SECONDS setting. If your entire site is served only over SSL, you may want to consider setting a value and enabling HTTP Strict Transport Security. Be sure to read the documentation first; enabling HSTS carelessly can cause serious, irreversible problems.
?: (security.W008) Your SECURE_SSL_REDIRECT setting is not set to True. Unless your site should be available over both SSL and non-SSL connections, you may want to either set this setting True or configure a load balancer or reverse-proxy server to redirect all connections to HTTPS.
?: (security.W012) SESSION_COOKIE_SECURE is not set to True. Using a secure-only session cookie makes it more difficult for network traffic sniffers to hijack user sessions.
?: (security.W018) You should not have DEBUG set to True in deployment.

INFOS:
?: (weblate.I021) Error collection is not set up, it is highly recommended for production use
        HINT: https://docs.weblate.org/en/weblate-4.14.2/admin/install.html#collecting-errors
?: (weblate.I028) Backups are not configured, it is highly recommended for production use
        HINT: https://docs.weblate.org/en/weblate-4.14.2/admin/backup.html

System check identified 9 issues (1 silenced).

Additional context

This problem occurred in our project’s workflow in which

  1. a developer introduces a large number of strings and imports both the keys and a preliminary en_US translation for them
  2. a CSV file is downloaded for several languages, including the en_US base language, and sent to language experts to (for en_US) improve and update the preliminary translations or (for all other languages) to provide translations for those strings
  3. we receive the file with updated base language translations from the en_US expert and import them
  4. we receive translation files for other languages containing the now outdated base language translations as source, leading to the described not found import failure
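
One possible workaround for this workflow, sketched under assumptions: before uploading, copy the translators' targets into a freshly downloaded CSV for the same language, matched by the `context` column, so that the uploaded file carries the current source strings. The function name `merge_targets` is hypothetical, and the sketch assumes every row has a unique, non-empty `context` and that both files use the column names shown in the steps above. Whether Weblate then accepts the upload depends on its matching behavior, which this sketch does not touch.

```python
import csv
import io


def merge_targets(fresh_csv: str, translated_csv: str) -> str:
    """Return fresh_csv with targets taken from translated_csv, matched by context."""
    # Collect non-empty targets from the translators' file, keyed by context.
    translated = {
        row["context"]: row["target"]
        for row in csv.DictReader(io.StringIO(translated_csv))
        if row["target"]
    }

    reader = csv.DictReader(io.StringIO(fresh_csv))
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=reader.fieldnames,
        quoting=csv.QUOTE_ALL,  # keep every field quoted, as in the downloaded files
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        # Keep the up-to-date source; only the target is replaced.
        if row["context"] in translated:
            row["target"] = translated[row["context"]]
        writer.writerow(row)
    return out.getvalue()
```

The result keeps the header and the current source from the fresh download and only fills in the target column.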

nijel commented 1 year ago

I don't think Weblate should update non-matching strings just based on key/context: the translation is then for a different string. Weblate only looks at the actual file and doesn't make assumptions about translators doing something more than translating the strings in the file. Therefore, the behavior in step 6 is correct.

The behavior in step 8 is more questionable, as the file still has the source field, but it was wiped.

Still, the main confusion comes from the fact that you are translating a bilingual CSV file (as the file contains both source and target) and Weblate behaves accordingly.

marvinruder commented 1 year ago

I would love to use a monolingual CSV file (this documentation states that CSV supports that), but how can I do that? In my understanding, that would be a file that does not contain source information, so a file like the one I described in step 7, or alternatively the same file just without the source column header. I tried that and received the same error as in step 8.

marvinruder commented 1 year ago

I added steps 9 and 10 to the issue to reflect that specific behavior as well.

nijel commented 1 year ago

Okay, it doesn't work as expected in the translate-toolkit: it requires the source column to be present while parsing the CSV file. If it's not there, it assumes the first row is not a header but the first translation, and uses hard-coded column names, which obviously can't work well.
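
A rough sketch of that failure mode, for illustration only. This is not the translate-toolkit code: `read_rows` is a hypothetical function and the fallback names in `DEFAULT_FIELDS` are an assumption. It only mimics the described logic of treating the first row as data whenever no source column is found.

```python
import csv
import io

# Assumed hard-coded fallback column names (illustrative, not taken from the toolkit).
DEFAULT_FIELDS = ["location", "source", "target"]

# The upload from step 9: the source column has been removed entirely.
STEP9_UPLOAD = (
    '"location","target","id","fuzzy","context",'
    '"translator_comments","developer_comments"\n'
    '"","Traduction en français","","False","key","",""\n'
)


def read_rows(text):
    """Parse CSV text, treating the first row as a header only if it names a source column."""
    lines = list(csv.reader(io.StringIO(text)))
    first = [cell.strip().lower() for cell in lines[0]]
    if "source" in first:
        fields, data = first, lines[1:]
    else:
        # No source column: the header row is read as a translation
        # and the fallback names are applied by position.
        fields, data = DEFAULT_FIELDS, lines
    return [dict(zip(fields, row)) for row in data]


rows = read_rows(STEP9_UPLOAD)
```

Under these assumptions, the header row becomes a bogus unit and the French text ends up in the source position with an empty target, so nothing can match.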

github-actions[bot] commented 1 year ago

The issue you've reported needs to be addressed in the translate-toolkit. Please file the issue there, and include links to any relevant specifications about the formats (if applicable).