DeepLcom / deepl-php

Official PHP library for the DeepL language translation API.
MIT License

Encode/decode the results for italian and other languages? #17

Closed lusareal closed 1 year ago

lusareal commented 1 year ago

Hi there!

I see that when I try to translate, the result is sometimes wrong.

I wonder if there is a built-in encode/decode function to handle texts with special characters?

E.g. áéú, etc.?

Or should I just use the usual PHP functions?

Thank you!

daniel-jones-deepl commented 1 year ago

Hi @lusareal, thanks for the question. The library already encodes the text correctly, so you can enter these characters directly in your text, for example:

$input = "DeepL fornisce traduzioni di alta qualità.";
$result = $translator->translateText($input, null, "en-US");
echo($result->text); // "DeepL provides high-quality translations."

Could you share an example where you get the wrong result with these characters?

daniel-jones-deepl commented 1 year ago

Oh sorry, I missed that you referred to the character á; my example only includes à. This character shouldn't cause a problem either, but I'll check if I can reproduce the issue you described.

daniel-jones-deepl commented 1 year ago

It seems to work with an é too: (note: I am not a native Italian speaker, so I am not sure if this sentence is reasonable).

$input = "La riproduzione di bug in lingue che non si conoscono è di solito pressoché totalmente indovinata.";
$result = $translator->translateText($input, null, "en-US");
echo($result->text); // "Reproducing bugs in unfamiliar languages is usually almost totally guessed."

lusareal commented 1 year ago

Thank you Daniel. I mean translation from EN to IT using the DeepL API; on the website there is no problem.

Anyway, can you point me to the code where the encoding/decoding happens? Or is it done on the server side of the DeepL API?

lusareal commented 1 year ago

Here is the example:

English: Apartment in Adeje, city Tijoco Bajo, 86 m2, terrace
Italian: Appartamento a Adeje, città Tijoco Bajo, 86 m2, terrazza

The problem: città became cittÃ  in Italian...

Here is my code (ajax function):

require_once('deepl-php/vendor/autoload.php');
    $from='en';
    $to = $lang;
    $authKey = ant_api_key_deepl();
    $translator = new \DeepL\Translator($authKey);

    try {
        $result = $translator->translateText($text, 'en', $to);
        return $result->text;
    } catch (\DeepL\DeepLException $error) {
        return 'Error occurred while translating document: ' . ($error->getMessage() ?? 'unknown error');
    }

    exit();

I hope it's clearer now.

daniel-jones-deepl commented 1 year ago

I tried your example code, but couldn't reproduce your problem; I get "Appartamento a Adeje, città Tijoco Bajo, 86 m2, terrazza". It seems like a character-encoding issue. Our API and PHP library use UTF-8 for text strings; maybe the text is being converted to a different character encoding somewhere in your application.

Sure, I can point you to the internals. The HTTP response body is received from cURL here. $result should be the response body, a UTF-8 encoded JSON string like {"translations":[{"detected_source_language":"EN","text":"Appartamento a Adeje, città Tijoco Bajo, 86 m2, terrazza"}]}. If the encoding is already wrong in the HTTP response body, possibly a cURL setting is missing (e.g. one indicating that the response should be UTF-8).

This section decodes the JSON response to TextResult objects, but no character-encoding should change there.
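To narrow down where the encoding goes wrong, it can help to look at the raw bytes right after JSON decoding. The following is only a minimal sketch: `describeEncoding` is a hypothetical helper and not part of deepl-php, and the sample body is a shortened stand-in for the response shape quoted above. The `\u{E0}` escape (PHP 7+) is used so that the example does not depend on the encoding of the source file itself.

```php
<?php
// Hypothetical helper (not part of deepl-php): reports whether a string
// is valid UTF-8 and shows its raw bytes as hex.
function describeEncoding(string $s): string
{
    // preg_match with the /u modifier returns 1 only if the subject is valid UTF-8.
    $valid = preg_match('//u', $s) === 1 ? 'valid UTF-8' : 'NOT valid UTF-8';
    return $valid . ', bytes: ' . bin2hex($s);
}

// Shortened sample body in the shape quoted above; "\u{E0}" is the character à.
$body = '{"translations":[{"detected_source_language":"EN","text":"citt' . "\u{E0}" . '"}]}';

// Decoding JSON does not change the character encoding of the strings.
$decoded = json_decode($body, true, 512, JSON_THROW_ON_ERROR);
$text = $decoded['translations'][0]['text'];

echo describeEncoding($text), PHP_EOL; // valid UTF-8, bytes: 63697474c3a0
```

If the bytes for à are `c3a0` at this point, the text is still correct UTF-8, and the garbling most likely happens later, for example when the result is sent back from the AJAX handler or stored.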

I hope these hints help.

lusareal commented 1 year ago

Here is an extra step to convert those incorrect symbols back to the correct ones:

$translation = "CopÃ©rnico was Italian";
$translation = iconv('utf-8', 'latin1', $translation);
echo $translation; // Copérnico was Italian
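The conversion above only helps when the text has been double-encoded, that is, when UTF-8 bytes were misread as Latin-1 and then encoded as UTF-8 again. The sketch below reproduces that situation and undoes it only when the result is still valid UTF-8. `repairDoubleEncoding` is a hypothetical helper, not part of deepl-php, and the check is a heuristic rather than a guaranteed fix.

```php
<?php
$correct = "Cop\u{E9}rnico"; // é is stored as the UTF-8 bytes c3 a9

// Simulate the bug: treat the UTF-8 bytes as Latin-1 and encode them as UTF-8 again.
$garbled = iconv('ISO-8859-1', 'UTF-8', $correct);
echo bin2hex($garbled), PHP_EOL; // 436f70c383c2a9726e69636f

// Hypothetical helper: undo one layer of double encoding, but only if the
// result is different and still valid UTF-8. Otherwise return the input unchanged.
function repairDoubleEncoding(string $s): string
{
    // iconv returns false (with a notice) if $s has characters outside Latin-1.
    $candidate = @iconv('UTF-8', 'ISO-8859-1', $s);
    if ($candidate !== false && $candidate !== $s && preg_match('//u', $candidate) === 1) {
        return $candidate;
    }
    return $s;
}

var_dump(repairDoubleEncoding($garbled) === $correct); // bool(true)
var_dump(repairDoubleEncoding($correct) === $correct); // bool(true)
```

Applying a plain `iconv('utf-8', 'latin1', ...)` to text that is already correct would break it, so a guard like this is safer than converting unconditionally. Finding and fixing the step that mis-decodes the text in the first place is still preferable.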