DeepLcom / deepl-python

Official Python library for the DeepL language translation API.
MIT License

Tag handling not working? #61

Closed (scmanjarrez closed this issue 1 year ago)

scmanjarrez commented 1 year ago

Hi, I'm trying to translate a string containing the HTML tags <p> and <em>. I've tried every option related to sentence splitting, but the library still splits the text.

text: <p><em>Wait, wait, wait! Didn’t we explicitly tell her </em>not<em> to get involved with royalty?!</em></p>

result: 3 translate objects:

expected:

Code lines tested:

>> translator.translate_text(text, split_sentences=deepl.SplitSentences.OFF, target_lang='ES')
>> translator.translate_text(text, split_sentences=deepl.SplitSentences.OFF, target_lang='ES', preserve_formatting=True)
>> translator.translate_text(text, split_sentences=deepl.SplitSentences.OFF, target_lang='ES', preserve_formatting=True, tag_handling='html')
>> translator.translate_text(text, split_sentences=deepl.SplitSentences.OFF, target_lang='ES', preserve_formatting=True, tag_handling='html', non_splitting_tags=['em', 'p'])
>> translator.translate_text(text, split_sentences=deepl.SplitSentences.OFF, target_lang='ES', preserve_formatting=True, tag_handling='html', non_splitting_tags=['em', 'p'], outline_detection=False)

I have also tested with tag_handling='xml', but that doesn't work either.

scmanjarrez commented 1 year ago

The library is correct. I thought I was passing the above text as a plain string; however, I was actually passing a bs4.element.Tag, and who knows what was being sent to the API.
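
For anyone landing here with the same symptom, a likely explanation (not confirmed in this thread) is that translate_text accepts either a single string or an iterable of strings, and a bs4 Tag is iterable over its children. The <p> above has three children (<em>, the text "not", <em>), which would match the three results. Below is a minimal sketch of a workaround that converts the Tag to its markup with str() before calling the API. The names as_markup and translate_markup are hypothetical helpers, not part of deepl or bs4.

```python
from typing import Any


def as_markup(value: Any) -> str:
    """Return value as a plain str.

    str() on a bs4 Tag yields the tag's full markup, and str() on a
    plain string (or a bs4 NavigableString) yields an ordinary str.
    """
    return str(value)


def translate_markup(translator: Any, value: Any, **options: Any) -> Any:
    """Hypothetical wrapper: always send one string, never an iterable.

    `translator` is assumed to expose translate_text(text, **options),
    as deepl.Translator does. Options are forwarded unchanged.
    """
    return translator.translate_text(as_markup(value), **options)
```

With a real client this would be called as translate_markup(translator, soup.p, target_lang='ES', tag_handling='html'), where soup is a BeautifulSoup document and translator is a deepl.Translator. The result should then be a single object rather than a list.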