DeepLcom / deepl-python

Official Python library for the DeepL language translation API.
MIT License

deepl.Formality.MORE results in untranslated text #92

Open pbtsrc opened 5 months ago

pbtsrc commented 5 months ago
import deepl
auth_key = '_____'
client = deepl.Translator(auth_key, send_platform_info=False)
in_text = '''
<?xml version='1.0' encoding='utf-8'?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta content="application/xhtml+xml; charset=utf-8" http-equiv="content-type"/>
<title>title</title>
</head>
<body>
<p>during the day, writing letters, conversing and praying and singing. Between three, and four o'clock at the Prophet's request, Apostle Taylor sang this sweet and comforting poem: </p>
<blockquote>
<p class="poetry"> A poor wayfaring man of grief, <br/>  Hath often cross'd me on my way, <br/>  Who sued so humbly for relief<br/>  That I could never answer  <em>Nay</em>. <br/> </p>
</blockquote>
</body>
</html>
'''
# Two identical requests; the only difference is the formality parameter.
result_1 = client.translate_text(in_text, source_lang='en', target_lang='de', tag_handling='xml', formality=deepl.Formality.MORE)
result_2 = client.translate_text(in_text, source_lang='en', target_lang='de', tag_handling='xml')
print(result_1.text)
print('=================')
print(result_2.text)

The code above prints the following (the Formality.MORE result first, then the result without a formality setting):

<?xml version='1.0' encoding='utf-8'?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
 <head>
 <meta content="application/xhtml+xml; charset=utf-8" http-equiv="content-type"/>
 <title>Titel</title>
 </head>
 <body>
 <p>während des Tages, schrieben Briefe, unterhielten sich, beteten und sangen. Zwischen drei und vier Uhr sang Apostel Taylor auf Wunsch des Propheten dieses süße und tröstliche Gedicht: </p>
 <blockquote>
 <p class="poetry"> A poor wayfaring man of grief, <br/> Hath often cross'd me on my way, <br/> Who suesfully humble for relief<br/> That I could never answer <em>Nay</em>. <br/> </p>
 </blockquote>
 </body>
 </html>
=================
<?xml version='1.0' encoding='utf-8'?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
 <head>
 <meta content="application/xhtml+xml; charset=utf-8" http-equiv="content-type"/>
 <title>Titel</title>
 </head>
 <body>
 <p>während des Tages, schrieben Briefe, unterhielten sich, beteten und sangen. Zwischen drei und vier Uhr sang Apostel Taylor auf Wunsch des Propheten dieses süße und tröstliche Gedicht: </p>
 <blockquote>
 <p class="poetry"> Ein armer, trauriger Wanderer, <br/> Hat mich oft auf meinem Weg gekreuzt, <br/> Der so demütig um Erleichterung bat<br/> Dass ich niemals <em>Nein</em> sagen konnte. <br/> </p>
 </blockquote>
 </body>
 </html>

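As you can see, the translate_text() call with formality=deepl.Formality.MORE left the poem ("A poor wayfaring man...") untranslated, while the otherwise identical call without the formality parameter translated it.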

JanEbbing commented 5 months ago

Thanks for reporting, I can replicate this. This looks like an issue related to our translation models, so we cannot give an ETA for when this will be fixed, sorry. I reported it internally to the team responsible for the models and they will take it into account.

One thing I noticed is that you set tag_handling to 'xml' but pass in HTML; tag_handling='html' is probably a better fit (unfortunately it does not resolve this issue, I checked).
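
Until the model issue is fixed, a caller could at least detect this case on their side. The following is a rough sketch, not part of the deepl library: untranslated_segments is a hypothetical helper, and the min_words threshold of 4 is an arbitrary assumption. It splits the source and the translated markup on tags and reports text runs that came back unchanged.

```python
import re


def untranslated_segments(source: str, translated: str, min_words: int = 4) -> list[str]:
    """Return text runs from `source` that appear unchanged in `translated`.

    Hypothetical helper for illustration only; not part of the deepl package.
    """

    def segments(markup: str) -> list[str]:
        # Split on anything that looks like a tag and collapse whitespace,
        # since the API output may differ from the input in spacing.
        parts = re.split(r"<[^>]+>", markup)
        return [" ".join(part.split()) for part in parts]

    translated_runs = set(segments(translated))
    return [
        run
        for run in segments(source)
        # Short runs (names, single words) are often legitimately identical
        # in both languages, so only longer runs are reported.
        if len(run.split()) >= min_words and run in translated_runs
    ]
```

With the script from the report, untranslated_segments(in_text, result_1.text) should flag the poem lines that came back in English. A caller could then retry that request without the formality parameter, which is the variant that produced a full translation above.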