Closed ChiragMoradiya closed 11 months ago
Splitting the text needs to be evaluated as well, as it might negatively impact translation if Google treats the segments as independent. For example, in Czech this should be translated as "Ahoj\nsvěte", while the independent words would be translated as "Ahoj" and "svět".
Meanwhile, I've added highlighting of whitespace in machine translation results in https://github.com/WeblateOrg/weblate/pull/10147, so that it is clearly visible.
This issue has been put aside. It is currently unclear if it will ever be implemented as it seems to cover too narrow of a use case or doesn't seem to fit into Weblate.
Please try to clarify the use case or consider proposing something more generic to make it useful to more users.
So, it looks like this should be reported as a bug to the Google Translate API service instead of here.
I don't think it's realistic to expect machine translation to always keep newlines at the right place.
I agree it won't always keep them. But the issue here is that it always removes them; they are never preserved.
We have added many Markdown texts as Weblate strings, and because newline characters are not preserved, their formatting gets broken by auto-translation.
How we did it in another project https://github.com/arvin-pantas/django-autotranslate/commit/85f0d8d6567070411b8abefa0c32391f2f52fb47
@pickfire Thanks, that is useful!
Describe the issue
When a string contains a newline character (\n), "Google Translate V3" auto-translation replaces it with a single space, whereas AWS translation preserves newline characters properly.
See the screenshots for reference.
I already tried
Steps to reproduce the behavior
Hello\nWorld
हैलो वर्ल्ड
(ISSUE)नमस्ते\nवर्ल्ड
Expected behavior
Auto Translation in Hindi, using "Google Translate V3" should be
हैलो\nवर्ल्ड
Screenshots
Google Translate V3 API:
AWS API:
Exception traceback
No response
How do you run Weblate?
Docker container
Weblate versions
Weblate deploy checks
Additional context
There seems to be an issue in the REST API invocation in https://github.com/WeblateOrg/weblate/blob/c7915cc0954da39169621ce3bbfa19ba189583fe/weblate/machinery/googlev3.py#L63C16-L63C16
It sends the whole string as a single element of the array, e.g. ["Hello\nWorld"]. If the request were instead sent as multiple array elements, split on newline characters, e.g. ["Hello","World"], and the response elements were then joined back with a newline character, this issue might be resolved.
NOTE: There is an additional caveat: if the request array contains any blank string, Google treats it as an invalid request, so such elements should be removed from the request array.
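A minimal sketch of the split-and-join idea described above, not Weblate's actual code. The function name `translate_preserving_newlines` and the `translate_batch` callable are hypothetical; `translate_batch` stands in for whatever sends a list of strings to the translation service and returns the translations in the same order. Note that each line is translated independently here, which may reduce translation quality when lines depend on each other for context.

```python
from typing import Callable, List


def translate_preserving_newlines(
    text: str,
    translate_batch: Callable[[List[str]], List[str]],
) -> str:
    """Translate text line by line and restore the original newlines.

    translate_batch is a hypothetical stand-in for the real API call:
    it takes a list of strings and returns their translations in order.
    """
    lines = text.split("\n")

    # Blank segments are not sent, since the issue reports that the
    # service rejects requests containing blank strings.
    to_send = [line for line in lines if line.strip()]
    translated = translate_batch(to_send) if to_send else []
    if len(translated) != len(to_send):
        raise ValueError("backend returned an unexpected number of segments")

    # Put translations back in place; blank lines are kept unchanged.
    results = iter(translated)
    return "\n".join(next(results) if line.strip() else line for line in lines)
```

With a backend that maps "Hello" and "World" to their translations, an input such as "Hello\n\nWorld" would keep both newline characters, and the empty middle line would never reach the backend.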