hhhwwwuuu / BackTranslation

back translation for NLP
https://pypi.org/project/BackTranslation/
MIT License

BackTranslation

BackTranslation is a Python library for back-translating text between any two languages. It uses the googletrans library and the Baidu Translation API to perform the translations.

Because of a bug in the current version of googletrans, create only one BackTranslation instance and reuse it for all of your back-translation work. Creating several instances can easily trigger errors caused by multiple requests. Support for other translator libraries is planned.

If you run into a bug, please open an issue on GitHub.

Installation

You can install it from PyPI:

$ pip install BackTranslation

Usage

Backtranslation with googletrans

Translate the original text into another language and then back again, to increase the diversity of data in NLP research.

Parameters:

Returns: a Translated object.

Attributes:

from BackTranslation import BackTranslation

# Create the translator once and reuse it for every call.
trans = BackTranslation(
    url=[
        'translate.google.com',
        'translate.google.co.kr',
    ],
    proxies={'http': '127.0.0.1:1234', 'http://host.name': '127.0.0.1:4012'},
)
# src is the original language, tmp is the intermediate language.
result = trans.translate('hello', src='en', tmp='zh-cn')
print(result.result_text)
# 'Hello there'

Note: Create only one instance of BackTranslation to avoid the issue in the current version of googletrans.
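One way to follow the single-instance advice is to build the translator lazily and hand the same object to every caller. The sketch below is only an illustration: `SharedTranslator`, `make_backtranslation`, and `augment` are hypothetical helper names, not part of the library. It assumes only the `translate(text, src=..., tmp=...)` call and the `result_text` attribute shown in the example above.

```python
class SharedTranslator:
    """Build the translator once, then return the same object every time."""

    def __init__(self, factory):
        self._factory = factory  # zero-argument callable that builds the translator
        self._instance = None

    def get(self):
        if self._instance is None:
            self._instance = self._factory()
        return self._instance


def make_backtranslation():
    # Imported lazily so this module still loads where the package is absent.
    from BackTranslation import BackTranslation
    return BackTranslation()


def augment(texts, translator, src="en", tmp="zh-cn"):
    """Back-translate each text using the one translator that was passed in."""
    return [translator.translate(t, src=src, tmp=tmp).result_text for t in texts]


shared = SharedTranslator(make_backtranslation)
# Example call (needs network access):
# print(augment(["hello", "good morning"], shared.get()))
```

Passing the translator into `augment` rather than constructing it there keeps instance creation in one place, which is the property the note asks for.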

Search the language code

You can look up a language code from its full language name with this method.

Parameters:

from BackTranslation import BackTranslation
trans = BackTranslation()
trans.searchLanguage('Chinese')
# {'chinese (simplified)': 'zh-cn', 'chinese (traditional)': 'zh-tw'}
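The output above suggests a case-insensitive substring match over a table of language names. The following is a minimal sketch of that idea, not the library's actual implementation: `search_language` and the small `LANGUAGES` table are hypothetical, with two entries copied from the example output and one added for contrast.

```python
# Toy table: two entries taken from the example output, plus "english" for contrast.
LANGUAGES = {
    "chinese (simplified)": "zh-cn",
    "chinese (traditional)": "zh-tw",
    "english": "en",
}


def search_language(name, table=LANGUAGES):
    """Return every {language name: code} pair whose name contains `name`, ignoring case."""
    needle = name.strip().lower()
    return {lang: code for lang, code in table.items() if needle in lang}


print(search_language("Chinese"))
# {'chinese (simplified)': 'zh-cn', 'chinese (traditional)': 'zh-tw'}
```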

Backtranslation_Baidu with Baidu Translation API

To use this more stable translator, you must register with the Baidu Translation API to obtain your own appID. It supports 2 million characters per day for free. Note: Registration currently requires a Chinese phone number.

from BackTranslation import BackTranslation_Baidu
trans = BackTranslation_Baidu(appid='YOUR APPID', secretKey='YOUR SECRETKEY')
result = trans.translate('hello', src='auto', tmp='zh')
print(result.result_text)
# 'hello'
trans.closeHTTP()
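Because the example ends with an explicit `closeHTTP()` call, it may be convenient to wrap the translator so that the connection is closed even when a translation raises. The sketch below is one possible approach under stated assumptions: `baidu_session` and `make_baidu_translator` are hypothetical helpers, and the environment variable names `BAIDU_APPID` and `BAIDU_SECRETKEY` are placeholders chosen for illustration.

```python
import os
from contextlib import contextmanager


@contextmanager
def baidu_session(translator):
    """Yield the translator and always call closeHTTP() afterwards."""
    try:
        yield translator
    finally:
        translator.closeHTTP()


def make_baidu_translator():
    # Imported lazily; credentials come from hypothetical environment variables
    # so that they are not hard-coded in the script.
    from BackTranslation import BackTranslation_Baidu
    return BackTranslation_Baidu(
        appid=os.environ["BAIDU_APPID"],
        secretKey=os.environ["BAIDU_SECRETKEY"],
    )


# Example call (needs valid credentials and network access):
# with baidu_session(make_baidu_translator()) as trans:
#     print(trans.translate('hello', src='auto', tmp='zh').result_text)
```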

Search the language code

Baidu uses different language codes, so this section will be updated soon.

Version Information

Version 0.3.1: fixed some bugs in the Baidu translator.

Version 0.2.2: fixed the service URLs for Google Translate.

Version 0.2.1: fixed a small bug. From this version on, the library uses googletrans 4.0.0rc1.

Version 0.2.0: added back-translation with the Baidu API and fixed bugs.

Version 0.1.0: added back-translation with the googletrans library.

Contribution

Contributions to the BackTranslation library are welcome!
