WeblateOrg / weblate

Web based localization tool with tight version control integration.
https://weblate.org/
GNU General Public License v3.0

My google cloud translation basic is broken #12014

Open SpecPep opened 2 days ago

SpecPep commented 2 days ago

Describe the issue

I ran into a problem after tweaking machinery/google.py. I wanted to exclude certain strings from translation in the automatic suggestions produced by google.py. Since I didn't fully understand how it works, I reverted my changes, but now it shows unexpected results. How do I fix it?

I already tried

Steps to reproduce the behavior

  1. I copied the google.py file and modified it as follows:
    
    from requests.exceptions import RequestException
    import re

    from .base import DownloadTranslations, MachineTranslation, MachineTranslationError
    from .forms import KeyMachineryForm

    GOOGLE_API_ROOT = "https://translation.googleapis.com/language/translate/v2/"


    class GoogleBaseTranslation(MachineTranslation):
        # Map codes used by Google to the ones used by Weblate
        language_map = {
            "nb": "no",
            "nb_NO": "no",
            "fil": "tl",
            "zh_Hant": "zh-TW",
            "zh_Hans": "zh-CN",
        }
        language_aliases = ({"zh-CN", "zh"},)

        def map_language_code(self, code):
            """Convert language to service specific code."""
            return super().map_language_code(code).replace("_", "-").split("@")[0]

        def is_supported(self, source, language):
            # Avoid translation between aliases
            return super().is_supported(source, language) and not any(
                {source, language} == item for item in self.language_aliases
            )


    class GoogleTranslation(GoogleBaseTranslation):
        """Google Translate API v2 machine translation support."""

        name = "Google Cloud Translation Basic Yono"
        max_score = 90
        settings_form = KeyMachineryForm

        # List of terms to exclude from translation
        exclude_terms = ["Airpaz", "BIG ID"]

        @classmethod
        def get_identifier(cls):
            return "google-translate"

        def download_languages(self):
            """List of supported languages."""
            response = self.request(
                "get", GOOGLE_API_ROOT + "languages", params={"key": self.settings["key"]}
            )
            payload = response.json()

            if "error" in payload:
                raise MachineTranslationError(payload["error"]["message"])

            return [d["language"] for d in payload["data"]["languages"]]

        def replace_terms_with_placeholders(self, text):
            """Replace terms to exclude with placeholders."""
            placeholder_map = {}
            for i, term in enumerate(self.exclude_terms):
                placeholder = f"__PLACEHOLDER_{i}__"
                text = re.sub(rf'\b{re.escape(term)}\b', placeholder, text)
                placeholder_map[placeholder] = term
            return text, placeholder_map

        def replace_placeholders_with_terms(self, text, placeholder_map):
            """Replace placeholders with original terms."""
            for placeholder, term in placeholder_map.items():
                text = text.replace(placeholder, term)
            return text

        def download_translations(
            self,
            source,
            language,
            text: str,
            unit,
            user,
            threshold: int = 75,
        ) -> DownloadTranslations:
            """Download list of possible translations from a service."""
            # Replace terms with placeholders
            text_with_placeholders, placeholder_map = self.replace_terms_with_placeholders(text)

            response = self.request(
                "get",
                GOOGLE_API_ROOT,
                params={
                    "key": self.settings["key"],
                    "q": text_with_placeholders,
                    "source": "YONO AIRPAZ",
                    "target": language,
                    "format": "text",
                },
            )
            payload = response.json()

            if "error" in payload:
                raise MachineTranslationError(payload["error"]["message"])

            translation_with_placeholders = payload["data"]["translations"][0]["translatedText"]

            # Replace placeholders with original terms
            translation = self.replace_placeholders_with_terms(translation_with_placeholders, placeholder_map)

            yield {
                "text": "YONO TEXT",
                "quality": self.max_score,
                "service": self.name,
                "source": "YONO SRC",
            }

        def get_error_message(self, exc):
            if isinstance(exc, RequestException) and exc.response is not None:
                data = exc.response.json()
                try:
                    return data["error"]["message"]
                except KeyError:
                    pass

            return super().get_error_message(exc)

2. 

### Expected behavior

I was hoping to make some of the words non-translatable. For now, automatic suggestions keep translating words I don't want translated.

### Screenshots

![image](https://github.com/WeblateOrg/weblate/assets/107383411/b2c8557b-eeb9-4183-90fc-f1ff73231de2)

### Exception traceback

_No response_

### How do you run Weblate?

Docker container

### Weblate versions

5.4.3.1

### Weblate deploy checks

_No response_

### Additional context

_No response_

SpecPep commented 2 days ago

When I check, the word YONOO seems to be in PostgreSQL, under trans_suggestion. I wonder if there is any connection.

nijel commented 2 days ago

You're probably doing something wrong in your code. At least "source": "YONO AIRPAZ" is obviously wrong; it should contain the source language.

Anyway, if you don't want some terminology translated, use a recent Weblate with a service that supports glossaries (see https://docs.weblate.org/en/latest/user/glossary.html#glossaries-in-automatic-suggestion) and define the terms as non-translatable in the Weblate glossary. Google's basic translation doesn't support this, and for the advanced one this is not yet implemented in Weblate; see https://github.com/WeblateOrg/weblate/issues/10526
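To illustrate the point about the `source` parameter, here is a minimal sketch of how the request parameters and the yielded result could be built from the method's own arguments instead of hard-coded strings. The helper names `build_translate_params` and `build_suggestion` are hypothetical and not part of Weblate; the dictionary keys are the ones used in the code posted above.

```python
def build_translate_params(api_key: str, text: str, source: str, target: str) -> dict:
    """Query parameters for the translate call (hypothetical helper)."""
    return {
        "key": api_key,
        "q": text,
        # Must be a language code such as "en", not an arbitrary label.
        "source": source,
        "target": target,
        "format": "text",
    }


def build_suggestion(
    translation: str, original_text: str, service_name: str, max_score: int = 90
) -> dict:
    """Result entry carrying the real translation and the real source text."""
    return {
        "text": translation,
        "quality": max_score,
        "service": service_name,
        "source": original_text,
    }


params = build_translate_params("dummy-key", "Book with Airpaz", "en", "id")
suggestion = build_suggestion(
    "Pesan dengan Airpaz", "Book with Airpaz", "Google Cloud Translation Basic"
)
```

Hard-coded values such as "YONO TEXT" in the yielded dictionary would be returned as the suggestion itself, regardless of what the service answered.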

github-actions[bot] commented 2 days ago

This issue has been marked as a question by a Weblate team member. Why? Because it belongs more to the professional Weblate Care or community Discussions than here. We strive to answer these reasonably fast here, too, but purchasing the support subscription is more responsible and faster for your business. And it makes Weblate stronger as well. Thanks!

In case your question is already answered, making a donation is the right way to say thank you!

SpecPep commented 2 days ago

My users require Google Cloud Translation Basic. Is there no other way? I was trying to follow this documentation, https://docs.weblate.org/en/latest/admin/customize.html, hence the error.

nijel commented 2 days ago

  1. You never know what the service does with the __PLACEHOLDER__ string. We ended up using shorter replacement strings which cannot be confused with the words by the machine translation service (see format_replacement method).
  2. Your usage of Google Translate API is wrong, at least in one parameter (see my previous comment).
  3. Your screenshot doesn't match the changes you made to the backend (it doesn't show YONO SRC as source).

So you're obviously making many mistakes.

If these are terms which should not be changed at all, you can mark them as placeholders and Weblate will try to preserve them during machine translation.
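The idea from point 1, masking excluded terms with short tokens that do not look like words, can be sketched on its own as follows. This is a standalone illustration, not Weblate's implementation: the token shape `[X0X]` and the helper names `mask_terms` and `unmask_terms` are assumptions, and whether a given translation service leaves such tokens untouched still has to be checked against real output.

```python
import re

# Terms taken from the code posted in this issue.
EXCLUDE_TERMS = ["Airpaz", "BIG ID"]


def mask_terms(text: str, terms: list[str]) -> tuple[str, dict[str, str]]:
    """Replace each excluded term with a short, non-word token."""
    mapping: dict[str, str] = {}
    for index, term in enumerate(terms):
        token = f"[X{index}X]"  # assumed token shape, short and unlike a word
        pattern = rf"\b{re.escape(term)}\b"
        masked, count = re.subn(pattern, token, text)
        if count:
            # Only remember tokens that were actually inserted.
            mapping[token] = term
            text = masked
    return text, mapping


def unmask_terms(text: str, mapping: dict[str, str]) -> str:
    """Put the original terms back in place of the tokens."""
    for token, term in mapping.items():
        text = text.replace(token, term)
    return text


masked, mapping = mask_terms("Book flights with Airpaz using BIG ID", EXCLUDE_TERMS)
restored = unmask_terms(masked, mapping)
```

Recording only the tokens that were actually inserted keeps the restore step from touching text that merely resembles a token.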