nidhaloff / deep-translator

A flexible free and unlimited python tool to translate between different languages in a simple way using multiple translators.
https://deep-translator.readthedocs.io/en/latest/?badge=latest
Apache License 2.0

Santali - No support for the provided language #213

Closed Prasanta-Hembram closed 1 year ago

Prasanta-Hembram commented 1 year ago

Description

I was trying to translate a sentence from English to Santali, but instead of the translated text I get the error "No support for the provided language." The sentence should have been translated into Santali. MyMemory supports the Santali language, as this API request shows: https://api.mymemory.translated.net/get?q=What+is+your+name+?&langpair=en|sat
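For anyone who wants to check the API independently of deep-translator, here is a minimal sketch using only the standard library. The helper names `build_mymemory_url` and `fetch_translation` are made up for illustration, and the `responseData` / `translatedText` keys are an assumption about the JSON returned by the endpoint, so treat this as a starting point rather than a reference client.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the URL quoted in the issue description.
MYMEMORY_ENDPOINT = "https://api.mymemory.translated.net/get"


def build_mymemory_url(text: str, source: str, target: str) -> str:
    """Build the GET URL for one request (hypothetical helper)."""
    # urlencode percent-escapes '?' and '|' and turns spaces into '+'.
    query = urllib.parse.urlencode({"q": text, "langpair": f"{source}|{target}"})
    return f"{MYMEMORY_ENDPOINT}?{query}"


def fetch_translation(text: str, source: str, target: str, timeout: float = 10.0) -> str:
    """Call the API and return the translated text (needs network access)."""
    url = build_mymemory_url(text, source, target)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    # Assumed response layout: {"responseData": {"translatedText": "..."}}
    return payload["responseData"]["translatedText"]
```

Calling `fetch_translation('What is your name ?', 'en', 'sat')` would send the same request as the URL above, without going through the library's language check.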

What I Did

== Contents of deep.py ==

from deep_translator import MyMemoryTranslator

text = 'What is your name ?'
translated = MyMemoryTranslator(source='en', target='sat').translate(text)

print(translated)

== Output ==

I ran "python deep.py" and got the following output:

Traceback (most recent call last):
  File "deep.py", line 4, in <module>
    translated = MyMemoryTranslator(source='sat', target='en').translate(text)
  File "C:\Users\Prasanta\AppData\Roaming\Python\Python37\site-packages\deep_translator\mymemory.py", line 43, in __init__
    payload_key="q",
  File "C:\Users\Prasanta\AppData\Roaming\Python\Python37\site-packages\deep_translator\base.py", line 44, in __init__
    self._source, self._target = self._map_language_to_code(source, target)
  File "C:\Users\Prasanta\AppData\Roaming\Python\Python37\site-packages\deep_translator\base.py", line 86, in _map_language_to_code
    message=f"No support for the provided language.\n"
deep_translator.exceptions.LanguageNotSupportedException: sat --> No support for the provided language.
Please select on of the supported languages:
{'afrikaans': 'af', 'albanian': 'sq', 'amharic': 'am', 'arabic': 'ar', 'armenian': 'hy', 'assamese': 'as', 'aymara': 'ay', 'azerbaijani': 'az', 'bambara': 'bm', 'basque': 'eu', 'belarusian': 'be', 'bengali': 'bn', 'bhojpuri': 'bho', 'bosnian': 'bs', 'bulgarian': 'bg', 'catalan': 'ca', 'cebuano': 'ceb', 'chichewa': 'ny', 'chinese (simplified)': 'zh-CN', 'chinese (traditional)': 'zh-TW', 'corsican': 'co', 'croatian': 'hr', 'czech': 'cs', 'danish': 'da', 'dhivehi': 'dv', 'dogri': 'doi', 'dutch': 'nl', 'english': 'en', 'esperanto': 'eo', 'estonian': 'et', 'ewe': 'ee', 'filipino': 'tl', 'finnish': 'fi', 'french': 'fr', 'frisian': 'fy', 'galician': 'gl', 'georgian': 'ka', 'german': 'de', 'greek': 'el', 'guarani': 'gn', 'gujarati': 'gu', 'haitian creole': 'ht', 'hausa': 'ha', 'hawaiian': 'haw', 'hebrew': 'iw', 'hindi': 'hi', 'hmong': 'hmn', 'hungarian': 'hu', 'icelandic': 'is', 'igbo': 'ig', 'ilocano': 'ilo', 'indonesian': 'id', 'irish': 'ga', 'italian': 'it', 'japanese': 'ja', 'javanese': 'jw', 'kannada': 'kn', 'kazakh': 'kk', 'khmer': 'km', 'kinyarwanda': 'rw', 'konkani': 'gom', 'korean': 'ko', 'krio': 'kri', 'kurdish (kurmanji)': 'ku', 'kurdish (sorani)': 'ckb', 'kyrgyz': 'ky', 'lao': 'lo', 'latin': 'la', 'latvian': 'lv', 'lingala': 'ln', 'lithuanian': 'lt', 'luganda': 'lg', 'luxembourgish': 'lb', 'macedonian': 'mk', 'maithili': 'mai', 'malagasy': 'mg', 'malay': 'ms', 'malayalam': 'ml', 'maltese': 'mt', 'maori': 'mi', 'marathi': 'mr', 'meiteilon (manipuri)': 'mni-Mtei', 'mizo': 'lus', 'mongolian': 'mn', 'myanmar': 'my', 'nepali': 'ne', 'norwegian': 'no', 'odia (oriya)': 'or', 'oromo': 'om', 'pashto': 'ps', 'persian': 'fa', 'polish': 'pl', 'portuguese': 'pt', 'punjabi': 'pa', 'quechua': 'qu', 'romanian': 'ro', 'russian': 'ru', 'samoan': 'sm', 'sanskrit': 'sa', 'scots gaelic': 'gd', 'sepedi': 'nso', 'serbian': 'sr', 'sesotho': 'st', 'shona': 'sn', 'sindhi': 'sd', 'sinhala': 'si', 'slovak': 'sk', 'slovenian': 'sl', 'somali': 'so', 'spanish': 'es', 'sundanese': 'su', 
'swahili': 'sw', 'swedish': 'sv', 'tajik': 'tg', 'tamil': 'ta', 'tatar': 'tt', 'telugu': 'te', 'thai': 'th', 'tigrinya': 'ti', 'tsonga': 'ts', 'turkish': 'tr', 'turkmen': 'tk', 'twi': 'ak', 'ukrainian': 'uk', 'urdu': 'ur', 'uyghur': 'ug', 'uzbek': 'uz', 'vietnamese': 'vi', 'welsh': 'cy', 'xhosa': 'xh', 'yiddish': 'yi', 'yoruba': 'yo', 'zulu': 'zu'}

Vincent-Stragier commented 1 year ago

Hi @Prasanta-Hembram,

Deep Translator relies on a static dictionary that does not contain the target language you are trying to use:

https://github.com/nidhaloff/deep-translator/blob/4328b37c0a03bc29ba6ed61ff5b1e8082373c0a8/deep_translator/constants.py#L28-L162

It is not obvious from the calling code, but here MyMemoryTranslator defaults to the languages available for the Google translator: https://github.com/nidhaloff/deep-translator/blob/4328b37c0a03bc29ba6ed61ff5b1e8082373c0a8/deep_translator/mymemory.py#L39-L44

You can add a new dictionary with the language and its corresponding code:

languages = MY_MEMORY_LANGUAGES_TO_CODES

Creating a new dictionary like this extends the language support, but you must know which languages are available (you can probably contact MyMemory to obtain the supported languages and their codes).
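As a rough illustration of the idea, the sketch below defines a small dictionary and a lookup that accepts either a language name or a code. The three entries are only examples (the `sat` code is the one used in the issue), and `map_language_to_code` here is a standalone, hypothetical helper, not the library's own method.

```python
# Illustrative excerpt only: a real dictionary would list every language
# that MyMemory supports, with the codes it expects.
MY_MEMORY_LANGUAGES_TO_CODES = {
    "english": "en",
    "hindi": "hi",
    "santali": "sat",
}


def map_language_to_code(language: str, languages: dict = MY_MEMORY_LANGUAGES_TO_CODES) -> str:
    """Return the code for a language name, or the code itself if already valid."""
    if language in languages.values():
        return language  # already a known code, e.g. 'sat'
    name = language.lower()
    if name in languages:
        return languages[name]  # a known name, e.g. 'santali'
    raise ValueError(f"{language} --> No support for the provided language.")


print(map_language_to_code("santali"))
print(map_language_to_code("sat"))
```

With a dictionary like this in place of the Google one, `sat` passes the check instead of raising the exception shown in the traceback.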

Prasanta-Hembram commented 1 year ago

Thank you, @Vincent-Stragier, for guiding me. I will soon open a pull request.

Prasanta-Hembram commented 1 year ago

Reference to a list of all the languages: https://www.matecat.com/api/docs#languages

nidhaloff commented 1 year ago

Closed after PR merge