Animenosekai / translate

A module grouping multiple translation APIs
GNU Affero General Public License v3.0

Consider adding DeepL. #5

Closed SuperSonicHub1 closed 3 years ago

SuperSonicHub1 commented 3 years ago

This "issue" is a work-in-progress; feel free to contribute in the comments.

Strangely enough, I found this translation engine while watching a vTuber. This website is pretty complex, so strap in.

Wikipedia: https://en.wikipedia.org/wiki/DeepL_Translator
Translator: https://www.deepl.com/translator

Supported Languages

From

[
  ["Any language (detect)", "auto"], // It seems that language detection is done through a JSON-RPC request unrelated to translating, so this is kinda pointless, but I won't remove it.
  ["Chinese", "zh"],
  ["Dutch", "nl"],
  ["English", "en"],
  ["French", "fr"],
  ["German", "de"],
  ["Italian", "it"],
  ["Japanese", "ja"],
  ["Polish", "pl"],
  ["Portuguese", "pt"],
  ["Russian", "ru"],
  ["Spanish", "es"]
]

To

[
  ["English (American)", "en-US"],
  ["English (British)", "en-GB"],
  ["Chinese (simplified)", "zh-ZH"],
  ["Dutch", "nl-NL"], // Formality support
  ["French", "fr-FR"], // Formality support
  ["German", "de-DE"], // Formality support
  ["Italian", "it-IT"], // Formality support
  ["Japanese", "ja-JA"],
  ["Polish", "pl-PL"],  // Formality support
  ["Portuguese", "pt-PT"], // Formality support
  ["Portuguese (Brazilian)", "pt-BR"], // Formality support
  ["Russian", "ru-RU"], // Formality support
  ["Spanish", "es-ES"] // Formality support
]

Formality

This is where it starts to get interesting! Some target languages can be translated with differing levels of formality. Every language that supports it is labeled above.

[
  ["Formal tone", "formal"],
  ["Informal tone", "informal"],
  ["Automatic", "auto"]
]
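A minimal sketch of how a client might validate the formality option against the target list above. The names `FORMALITY_TARGETS` and `resolve_formality` are hypothetical, and falling back to "auto" for unsupported targets is an assumption; how the website actually sends the formality value in its request is not shown in this issue.

```python
# Target codes flagged "Formality support" in the "To" list above.
FORMALITY_TARGETS = {
    "nl-NL", "fr-FR", "de-DE", "it-IT", "pl-PL",
    "pt-PT", "pt-BR", "ru-RU", "es-ES",
}
FORMALITY_LEVELS = {"formal", "informal", "auto"}


def resolve_formality(target, formality="auto"):
    """Return the formality to use for `target` (hypothetical helper)."""
    if formality not in FORMALITY_LEVELS:
        raise ValueError("unknown formality: " + repr(formality))
    # Assumption: targets without formality support (e.g. "ja-JA")
    # silently fall back to "auto" instead of raising.
    return formality if target in FORMALITY_TARGETS else "auto"


print(resolve_formality("de-DE", "formal"))
print(resolve_formality("ja-JA", "formal"))
```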

Translating

This API seems to use JSON-RPC, which I'm not very familiar with, so assistance would be appreciated.

URL: https://www2.deepl.com/jsonrpc
HTTP Verb: POST

Example Request/Response

Here, we're translating "hello" from English to Japanese.

Request Payload (JSON)

{
  "jsonrpc": "2.0",
  "method": "LMT_handle_jobs",
  "params": {
    "jobs": [
      {
        "kind": "default",
        // Ignore use of en here, these same parameters are used for all languages
        "raw_en_sentence": "hello",
        "raw_en_context_before": [],
        "raw_en_context_after": [],
        "preferred_num_beams": 4
      }
    ],
    "lang": {
      "user_preferred_langs": [
        "NL",
        "DE",
        "IT",
        "PL",
        "PT",
        "RU",
        "ES",
        "ZH",
        "FR",
        "JA",
        "EN"
      ],
      // To and from
      "source_lang_computed": "EN",
      "target_lang": "JA"
    },
    "priority": 1,
    "commonJobParams": {},
    "timestamp": 1613084905408
  },
  // No idea what this means; random number?
  "id": 52850007
}
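For anyone who wants to reproduce this from Python, here is a rough sketch that builds the same body as a dict. `build_translate_payload` is a hypothetical helper, and shortening `user_preferred_langs` to just the two languages involved is an assumption that has not been tested against the server. On the `id` question: JSON-RPC 2.0 defines `id` as a client-chosen identifier that the server echoes back in its response, which matches the example response, so a random integer is used here.

```python
import json
import random
import time


def build_translate_payload(text, source, target, request_id=None):
    """Build the JSON-RPC body shown above (hypothetical helper)."""
    if request_id is None:
        # Assumption: any integer is accepted; JSON-RPC 2.0 only requires
        # that the server echo the value back in the response.
        request_id = random.randint(10000000, 99999999)
    return {
        "jsonrpc": "2.0",
        "method": "LMT_handle_jobs",
        "params": {
            "jobs": [{
                "kind": "default",
                # "en" appears in the key names for every language pair.
                "raw_en_sentence": text,
                "raw_en_context_before": [],
                "raw_en_context_after": [],
                "preferred_num_beams": 4,
            }],
            "lang": {
                # Assumption: a shortened list is enough here.
                "user_preferred_langs": [source.upper(), target.upper()],
                "source_lang_computed": source.upper(),
                "target_lang": target.upper(),
            },
            "priority": 1,
            "commonJobParams": {},
            # Milliseconds since the epoch, like the captured request.
            "timestamp": int(time.time() * 1000),
        },
        "id": request_id,
    }


payload = build_translate_payload("hello", "en", "ja", request_id=52850007)
body = json.dumps(payload)  # plain JSON, without the // comments used above
```

The `//` comments in the captured payload are annotations only; a real request body has to be plain JSON, which `json.dumps` guarantees.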

Response

{
  "jsonrpc": "2.0",
  "id": 52850007,
  "result": {
    "translations": [
      {
        "beams": [
          // Recommended translation
          {
            "postprocessed_sentence": "こんにちわ",
            "num_symbols": 6
          },
          // Alternatives
          {
            "postprocessed_sentence": "こんにちは",
            "num_symbols": 3
          },
          {
            "postprocessed_sentence": "ハロー",
            "num_symbols": 3
          },
          {
            "postprocessed_sentence": "もしもし",
            "num_symbols": 3
          }
        ],
        "quality": "normal"
      }
    ],
    "target_lang": "JA",
    "source_lang": "EN",
    "source_lang_is_confident": false,
    "detectedLanguages": {},
    "timestamp": 1613084907,
    "date": "20210211"
  }
}
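Reading the response could look roughly like the sketch below. `extract_translations` is a hypothetical helper, `RAW` is a trimmed copy of the response above with the comments removed so that it parses as JSON, and treating the first beam as the recommended translation follows the annotation in the example rather than any documented guarantee.

```python
import json

# Trimmed copy of the response above (comments removed so it is valid JSON).
RAW = """
{
  "jsonrpc": "2.0",
  "id": 52850007,
  "result": {
    "translations": [
      {
        "beams": [
          {"postprocessed_sentence": "こんにちわ", "num_symbols": 6},
          {"postprocessed_sentence": "こんにちは", "num_symbols": 3}
        ],
        "quality": "normal"
      }
    ],
    "target_lang": "JA",
    "source_lang": "EN"
  }
}
"""


def extract_translations(response, expected_id=None):
    """Return (best, alternatives) for the first job in a response dict."""
    if expected_id is not None and response.get("id") != expected_id:
        raise ValueError("response id does not match the request id")
    beams = response["result"]["translations"][0]["beams"]
    sentences = [beam["postprocessed_sentence"] for beam in beams]
    # First beam is the recommended translation, the rest are alternatives.
    return sentences[0], sentences[1:]


best, alternatives = extract_translations(json.loads(RAW), expected_id=52850007)
print(best, alternatives)
```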

Dictionary

Alongside translation, we also have access to a dictionary, whose results are returned as an HTML fragment.

Example Request/Response

URL: https://dict.deepl.com/english-japanese/search?ajax=1&source=english&onlyDictEntries=1&translator=dnsof7h3k2lgh3gda&delay=300&jsStatus=0&kind=full&eventkind=langChange&forleftside=true
HTTP Verb: POST

Request Body (Form Data)

query=hello
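As a sketch, the same request can be assembled with the standard library. The query parameters are copied from the captured URL; whether the `translator` value is tied to a session or can be reused is unknown, so treat it as an assumption. `build_dictionary_request` is a hypothetical helper that only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "https://dict.deepl.com/english-japanese/search"

# Query parameters copied from the captured URL above.
QUERY = {
    "ajax": 1,
    "source": "english",
    "onlyDictEntries": 1,
    "translator": "dnsof7h3k2lgh3gda",  # assumption: reusable as-is
    "delay": 300,
    "jsStatus": 0,
    "kind": "full",
    "eventkind": "langChange",
    "forleftside": "true",
}


def build_dictionary_request(word):
    """Build (but do not send) the POST request for a dictionary lookup."""
    url = BASE + "?" + urlencode(QUERY)
    # Form-encoded body, e.g. b"query=hello".
    data = urlencode({"query": word}).encode("utf-8")
    return Request(url, data=data, method="POST")


request = build_dictionary_request("hello")
print(request.get_method(), request.full_url)
print(request.data)
```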

Response

<div id='data' data-numberQueriesSoFar='0' data-lingueeEncoding='utf-8' data-sourceIsLang1='0' data-lang1='JA' data-lang2='EN' data-mainFlag='us' data-baseURL='/english-japanese' data-query='hello' data-correctSpellingOfQuery='hello' data-numberPhrases='0' data-numberSentences='0' data-sourceLang='EN' data-sourceFlag='us' data-targetLang='JA'></div>
<div class='innercontent'>
    <div id='dictionary'>
        <h1 class='dict_headline_for_0 bothsides wide_in_main'>
            <div class='openTriangle'> &#9662;</div>Dictionary English-Japanese</h1>
        <div class='isMainTerm' data-source-lang='EN'>
            <div class='exact'>
                <div class='lemma featured' wt='0'>
                    <div>
                        <h2 class='line lemma_desc' lid='EN:hello15448'><span class='tag_lemma'><a class='dictLink' rel='nofollow' href='/english-japanese/translation/hello.html'>hello</a> <a id='EN_US/5d/5d41402abc4b2a76b9719d911017c592-0' class='audio' onclick='playSound(this,"EN_US/5d/5d41402abc4b2a76b9719d911017c592-0","American English","EN_UK/5d/5d41402abc4b2a76b9719d911017c592-0","British English");'></a></span><span class='dash'>&mdash;</span></h2>
                        <div class='lemma_content'>
                            <div class='meaninggroup  sortablemg' gid='0'>
                                <div class='translation_lines'>
                                    <div class='translation sortablemg featured'>
                                        <h3 class='translation_desc'><span class='tag_trans' bid='10001294287' lid='JA:#########57716'><a id='dictEntry10001294287' href='/japanese-english/translation/%E3%83%8F%E3%83%AD%E3%83%BC.html' class='dictLink featured'>ハロー</a><a class='expand_i'></a></span>
                                            <!--tag_trans-->
                                        </h3>
                                        <!--translation_desc-->
                                        <!-- editorial isFeatured 1 inType 0 allowFeatured 1 available: 0 -->
                                    </div>
                                    <div class='translation sortablemg featured'>
                                        <h3 class='translation_desc'><span class='tag_trans' bid='10001347706' lid='JA:###############53649'><a id='dictEntry10001347706' href='/japanese-english/translation/%E3%81%93%E3%82%93%E3%81%AB%E3%81%A1%E3%81%AF.html' class='dictLink featured'>こんにちは</a><a class='expand_i'></a></span>
                                            <!--tag_trans-->
                                        </h3>
                                        <!--translation_desc-->
                                        <!-- editorial isFeatured 1 inType 0 allowFeatured 1 available: 0 -->
                                    </div>
                                    <div class='translation_group'>
                                        <div class='line group_line translation_group_line'><span class='notascommon'>less common:</span>
                                            <div class='translation sortablemg translation_first'>
                                                <div class='translation_desc'><span class='tag_trans' bid='10001209742' lid='JA:#########20937'><a id='dictEntry10001209742' href='/japanese-english/translation/%E4%BB%8A%E6%97%A5%E3%81%AF.html' class='dictLink'>今日は</a><a class='expand_i'></a></span>
                                                    <!--tag_trans-->
                                                </div>
                                                <!--translation_desc-->
                                                <!-- editorial isFeatured 0 inType 0 allowFeatured 1 available: 0 -->
                                                <span class='sep'>&middot;</span> </div>
                                            <div class='translation sortablemg'>
                                                <div class='translation_desc'><span class='tag_trans' bid='10001478146' lid='JA:#########18286'><a id='dictEntry10001478146' href='/japanese-english/translation/%E3%81%A9%E3%81%86%E3%82%82.html' class='dictLink'>どうも</a><a class='expand_i'></a></span>
                                                    <!--tag_trans-->
                                                </div>
                                                <!--translation_desc-->
                                                <!-- editorial isFeatured 0 inType 0 allowFeatured 1 available: 0 -->
                                                <span class='sep'>&middot;</span> </div>
                                            <div class='translation sortablemg'>
                                                <div class='translation_desc'><span class='tag_trans' bid='10001347751' lid='JA:###############58474'><a id='dictEntry10001347751' href='/japanese-english/translation/%E3%81%93%E3%82%93%E3%81%AB%E3%81%A1%E3%82%8F.html' class='dictLink'>こんにちわ</a><a class='expand_i'></a></span>
                                                    <!--tag_trans-->
                                                </div>
                                                <!--translation_desc-->
                                                <!-- editorial isFeatured 0 inType 0 allowFeatured 1 available: 0 -->
                                                <span class='sep'>&middot;</span> </div>
                                            <div class='translation sortablemg'>
                                                <div class='translation_desc'><span class='tag_trans' bid='10001241561' lid='JA:############4113'><a id='dictEntry10001241561' href='/japanese-english/translation/%E3%81%93%E3%81%AB%E3%81%A1%E3%81%AF.html' class='dictLink'>こにちは</a></span>
                                                    <!--tag_trans-->
                                                </div>
                                                <!--translation_desc-->
                                                <!-- editorial isFeatured 0 inType 0 allowFeatured 1 available: 0 -->
                                                <span class='sep'>&middot;</span> </div>
                                            <div class='translation sortablemg'>
                                                <div class='translation_desc'><span class='tag_trans' bid='10001292425' lid='JA:############49954'><a id='dictEntry10001292425' href='/japanese-english/translation/%E3%81%BB%E3%81%84%E3%81%BB%E3%81%84.html' class='dictLink'>ほいほい</a></span>
                                                    <!--tag_trans-->
                                                </div>
                                                <!--translation_desc-->
                                                <!-- editorial isFeatured 0 inType 0 allowFeatured 1 available: 0 -->
                                                <span class='sep'>&middot;</span> </div>
                                            <div class='translation sortablemg'>
                                                <div class='translation_desc'><span class='tag_trans' bid='10001528104' lid='JA:#########31519'><a id='dictEntry10001528104' href='/japanese-english/translation/%E3%81%8A%E3%83%BC%E3%81%84.html' class='dictLink'>おーい</a></span>
                                                    <!--tag_trans-->
                                                </div>
                                                <!--translation_desc-->
                                                <!-- editorial isFeatured 0 inType 0 allowFeatured 1 available: 0 -->
                                                <span class='sep'>&middot;</span> </div>
                                            <div class='translation sortablemg'>
                                                <div class='translation_desc'><span class='tag_trans' bid='10001281333' lid='JA:########################9421'><a id='dictEntry10001281333' href='/japanese-english/translation/%E3%82%A2%E3%83%B3%E3%83%8B%E3%83%A7%E3%83%B3%E3%83%8F%E3%82%BB%E3%83%A8.html' class='dictLink'>アンニョンハセヨ</a></span>
                                                    <!--tag_trans-->
                                                </div>
                                                <!--translation_desc-->
                                                <!-- editorial isFeatured 0 inType 0 allowFeatured 1 available: 0 -->
                                                <span class='sep'>&middot;</span> </div>
                                            <div class='translation sortablemg'>
                                                <div class='translation_desc'><span class='tag_trans' bid='10001371063' lid='JA:##################25995'><a id='dictEntry10001371063' href='/japanese-english/translation/%E3%82%A2%E3%83%8B%E3%83%A7%E3%83%8F%E3%82%BB%E3%83%A8.html' class='dictLink'>アニョハセヨ</a></span>
                                                    <!--tag_trans-->
                                                </div>
                                                <!--translation_desc-->
                                                <!-- editorial isFeatured 0 inType 0 allowFeatured 1 available: 0 -->
                                                <span class='sep'>&middot;</span> </div>
                                            <div class='translation sortablemg'>
                                                <div class='translation_desc'><span class='tag_trans' bid='10001346566' lid='JA:############55814'><a id='dictEntry10001346566' href='/japanese-english/translation/%E3%81%93%E3%82%93%E3%81%A1%E3%81%AF.html' class='dictLink'>こんちは</a></span>
                                                    <!--tag_trans-->
                                                </div>
                                                <!--translation_desc-->
                                                <!-- editorial isFeatured 0 inType 0 allowFeatured 1 available: 0 -->
                                                <span class='sep'>&middot;</span> </div>
                                            <div class='translation sortablemg'>
                                                <div class='translation_desc'><span class='tag_trans' bid='10001320067' lid='JA:############36110'><a id='dictEntry10001320067' href='/japanese-english/translation/%E3%83%8F%E3%82%A4%E3%82%B5%E3%82%A4.html' class='dictLink'>ハイサイ</a></span>
                                                    <!--tag_trans-->
                                                </div>
                                                <!--translation_desc-->
                                                <!-- editorial isFeatured 0 inType 0 allowFeatured 1 available: 0 -->
                                                <span class='sep'>&middot;</span> </div>
                                            <div class='translation sortablemg'>
                                                <div class='translation_desc'><span class='tag_trans' bid='10001241611' lid='JA:############34605'><a id='dictEntry10001241611' href='/japanese-english/translation/%E3%81%93%E3%81%AB%E3%81%A1%E3%82%8F.html' class='dictLink'>こにちわ</a></span>
                                                    <!--tag_trans-->
                                                </div>
                                                <!--translation_desc-->
                                                <!-- editorial isFeatured 0 inType 0 allowFeatured 1 available: 0 -->
                                                <span class='sep'>&middot;</span> </div>
                                            <div class='translation sortablemg'>
                                                <div class='translation_desc'><span class='tag_trans' bid='10001336747' lid='JA:############24625'><a id='dictEntry10001336747' href='/japanese-english/translation/%E3%83%8B%E3%83%BC%E3%83%8F%E3%82%AA.html' class='dictLink'>ニーハオ</a></span>
                                                    <!--tag_trans-->
                                                </div>
                                                <!--translation_desc-->
                                                <!-- editorial isFeatured 0 inType 0 allowFeatured 1 available: 0 -->
                                                <span class='sep'>&middot;</span> </div>
                                            <div class='translation sortablemg'>
                                                <div class='translation_desc'><span class='tag_trans' bid='10001295187' lid='JA:######56948'><a id='dictEntry10001295187' href='/japanese-english/translation/%E3%83%8F%E3%83%AD.html' class='dictLink'>ハロ</a></span>
                                                    <!--tag_trans-->
                                                </div>
                                                <!--translation_desc-->
                                                <!-- editorial isFeatured 0 inType 0 allowFeatured 1 available: 0 -->
                                                <span class='sep'>&middot;</span> </div>
                                            <div class='translation sortablemg'>
                                                <div class='translation_desc'><span class='tag_trans' bid='10001214671' lid='JA:############61778'><a id='dictEntry10001214671' href='/japanese-english/translation/%E3%83%9B%E3%82%A4%E3%83%9B%E3%82%A4.html' class='dictLink'>ホイホイ</a></span>
                                                    <!--tag_trans-->
                                                </div>
                                                <!--translation_desc-->
                                                <!-- editorial isFeatured 0 inType 0 allowFeatured 1 available: 0 -->
                                            </div>
                                        </div>
                                    </div>

                                </div>
                                <!--translation_lines-->
                            </div>
                            <!--meaninggroup-->
                        </div>
                        <!--lemma_content-->
                    </div>
                </div>
                <!--lemma-->
            </div>
            <!--exact-->
            <div class='copyrightLineOuter'>
                <div class='copyrightLine'>&copy; Linguee Dictionary, 2020</div>
            </div>
        </div>
    </div>
    <!--dictionary-->
</div>
<!--innercontent-->
<!-- 2021-Feb-11 23:21:03 -->
Animenosekai commented 3 years ago

Hello and thank you very much for this really detailed enhancement request!

I of course know about DeepL (it sometimes seems to give better translations than Google Translate), and I will do my best to implement it in a future update!

I just have some questions:

Do you want to keep the HTML format for Dictionary?

Is there any ID / token system?

The Formality feature seems very interesting (and I didn't know about it at all)!

Animenosekai

SuperSonicHub1 commented 3 years ago

@Animenosekai Sorry I took so long to respond!

Do you want to keep the HTML format for Dictionary?

I think it would make sense to translate the HTML to a Python dict, and have something like an _origin key for people who want to parse the HTML themselves.
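Something along these lines, as a rough sketch using only the standard library. The `featured` and `less_common` keys and the `DictLinkParser` class are hypothetical names, and the sample HTML is trimmed from the fragment above with the `href` attributes dropped; the real markup may change at any time.

```python
from html.parser import HTMLParser


class DictLinkParser(HTMLParser):
    """Collect translation links found inside <span class='tag_trans'>."""

    def __init__(self):
        super().__init__()
        self.featured = []
        self.less_common = []
        self._in_translation = False
        self._target = None  # list that receives the current link text

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span":
            # The headword sits in 'tag_lemma'; translations in 'tag_trans'.
            self._in_translation = "tag_trans" in classes
        elif tag == "a" and self._in_translation and "dictLink" in classes:
            is_featured = "featured" in classes
            self._target = self.featured if is_featured else self.less_common

    def handle_data(self, data):
        if self._target is not None and data.strip():
            self._target.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "a":
            self._target = None
        elif tag == "span":
            self._in_translation = False


def parse_dictionary(html):
    """Turn the HTML fragment into a dict, keeping the raw HTML in _origin."""
    parser = DictLinkParser()
    parser.feed(html)
    parser.close()
    return {
        "featured": parser.featured,
        "less_common": parser.less_common,
        "_origin": html,
    }


SAMPLE = (
    "<h2><span class='tag_lemma'><a class='dictLink'>hello</a></span></h2>"
    "<h3><span class='tag_trans'><a class='dictLink featured'>ハロー</a>"
    "<a class='expand_i'></a></span></h3>"
    "<div><span class='tag_trans'><a class='dictLink'>今日は</a></span></div>"
)
result = parse_dictionary(SAMPLE)
print(result["featured"], result["less_common"])
```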

Is there any ID / token system?

After some testing in Insomnia, it seems that as long as you keep using the same ID, which can seemingly be any number, you'll be able to dodge 429s (HTTP "Too Many Requests" responses).
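In code, that could mean picking the ID once and reusing it for every call. A minimal sketch, where `JsonRpcSession` is a hypothetical name and the claim that a fixed ID avoids 429s rests only on the manual testing described above:

```python
import random


class JsonRpcSession:
    """Hypothetical helper that reuses one JSON-RPC id for every request."""

    def __init__(self, request_id=None):
        # Chosen once per session; seemingly any number is accepted.
        if request_id is None:
            request_id = random.randint(1, 10 ** 8)
        self.request_id = request_id

    def envelope(self, method, params):
        """Wrap `params` in a JSON-RPC 2.0 request using the fixed id."""
        return {
            "jsonrpc": "2.0",
            "method": method,
            "params": params,
            "id": self.request_id,
        }


session = JsonRpcSession()
first = session.envelope("LMT_handle_jobs", {})
second = session.envelope("LMT_handle_jobs", {})
print(first["id"] == second["id"])
```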

Animenosekai commented 3 years ago

Hey @SuperSonicHub1 sorry for the late reply!

What do you think of d08de80 ?

I added DeepL (but it is not released yet on PyPI)

SuperSonicHub1 commented 3 years ago

Looking good! See my comments on the commit for more details.

Kyle Williams


On Wed, Feb 24, 2021 at 4:56 PM Animenosekai notifications@github.com wrote:

Hey @SuperSonicHub1 sorry for the late reply!

What do you think of d08de80 ?

I added DeepL (but it is not released yet on PyPI)


Animenosekai commented 3 years ago

Hello (maybe soon good night because it is almost 4AM here) @SuperSonicHub1 !

Take a look at: 89b92cf !

I took into account your comments and I myself improved the code quality using Pylint.

Animenosekai commented 3 years ago

OK, maybe I'll need to remove the type aliases/type hints because of compatibility issues (the ones I wrote are only supported on Python 3.9+).

What should I keep @SuperSonicHub1 ?

SuperSonicHub1 commented 3 years ago

Remove 3.9-only typings. I really wish new types were back-ported or available as a PyPI library.

On Wed, Feb 24, 2021, 10:03 PM Animenosekai notifications@github.com wrote:

Ok maybe I'll need to remove the type aliases/type hints because of the compatibility issues (the ones I wrote are only supported on Python 3.9 +)

What should I keep @SuperSonicHub1 https://github.com/SuperSonicHub1 ?


Animenosekai commented 3 years ago

@SuperSonicHub1 what do you think of 5d00c6a843e531bf6aa564bee13770a6a4107a13 ?

I read PEP 585 and tried to fix things accordingly.

Vermin now tells me that it should be compatible with Python >=3.2 !
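For context, a small illustration of the kind of change PEP 585 implies, not necessarily what the commit above does. Subscripting built-ins such as `list[str]` at runtime needs Python 3.9+, while the aliases in `typing` (added in Python 3.5) cover older versions. `LanguageTable` and `find_code` are hypothetical names.

```python
from typing import List, Optional

# 3.9+ only:   LanguageTable = list[list[str]]
# Older-compatible spelling using the typing aliases:
LanguageTable = List[List[str]]


def find_code(table: LanguageTable, name: str) -> Optional[str]:
    """Return the language code for `name`, or None if it is not listed."""
    for label, code in table:
        if label == name:
            return code
    return None


SOURCES = [["Japanese", "ja"], ["Polish", "pl"]]
print(find_code(SOURCES, "Japanese"))
```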

SuperSonicHub1 commented 3 years ago

Great work! Should be able to test DeepL today.

On Thu, Feb 25, 2021, 11:51 AM Animenosekai notifications@github.com wrote:

@SuperSonicHub1 https://github.com/SuperSonicHub1 what do you think of 5d00c6a https://github.com/Animenosekai/translate/commit/5d00c6a843e531bf6aa564bee13770a6a4107a13 ?

I read PEP 585 https://www.python.org/dev/peps/pep-0585/ and tried to fix accordingly.

Vermin now tells me that it should be compatible with Python >=3.2 !


Animenosekai commented 3 years ago

The only problem with DeepL is that they don't use any key/token system, yet their rate limit is very strict. I don't know why, but they keep blocking me even though I've only used it a few times.

SuperSonicHub1 commented 3 years ago

I know you're already using browser user-agents, so I wonder what the issue could be... I'll see if I can do some more interrogation of the API later today.

On Thu, Feb 25, 2021, 8:13 PM Animenosekai notifications@github.com wrote:

The only problem with DeepL is that they don't use any ID system but their Rate Limit is very strict and I don't know why but they keep blocking me even though I've used it a few times


Animenosekai commented 3 years ago

@SuperSonicHub1 Do you think that I can publish the current version?

SuperSonicHub1 commented 3 years ago

I don't see why we can't. If we run into any issues post-publication, we can just make a patch.

Kyle Williams

On Fri, Feb 26, 2021 at 9:21 AM Animenosekai notifications@github.com wrote:

@SuperSonicHub1 https://github.com/SuperSonicHub1 Do you think that I can publish the current version?


Animenosekai commented 3 years ago

Should I close this issue?

SuperSonicHub1 commented 3 years ago

Sure; in fact I'll do it for you.

On Sat, Feb 27, 2021, 5:51 PM Animenosekai notifications@github.com wrote:

Should I close this issue?
