ThioJoe / Auto-Synced-Translated-Dubs

Automatically translates the text of a video based on a subtitle file, and then uses AI voice services to create a new dubbed & translated audio track where the speech is synced using the subtitle's timings.
GNU General Public License v3.0

Do-not-translate list does not work for Hindi and other Indian languages #52

Closed Thiru-Malai closed 1 year ago

Thiru-Malai commented 1 year ago

Hello. When I put some English text in dont_translate_phrases.txt, it is not translated into any language. But when I translate to Hindi and other Indian languages, the translation is not skipped. I am putting the Hindi text in the dont_translate_phrases.txt file. Can you help with this?

avighnac commented 1 year ago

If you want to translate English to Hindi, then why would you put the HINDI phrases in the text file instead of the English phrases?

ThioJoe commented 1 year ago

You actually need to put the text in the original language into the list.

For example, if you are translating from English to Hindi, all the words in the dont_translate file should be in English. If you are going from German to Spanish, they would be in German.
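To illustrate why the list has to be in the original language, here is a minimal sketch. It assumes the phrases are matched against the source subtitle text before translation; the function name and sample data are hypothetical and not taken from the project's code.

```python
def phrases_found_in_source(source_lines, phrases):
    """Return the phrases that literally occur in the source-language text.

    Hypothetical helper: only phrases written in the source language can
    ever match, because matching happens before translation.
    """
    return [p for p in phrases if any(p in line for line in source_lines)]


# German -> Spanish example: the subtitle text is still German at match time.
source_lines = ["Willkommen auf meinem Kanal", "Bitte abonnieren"]

german_phrases = ["Willkommen"]   # written in the original language
spanish_phrases = ["Bienvenido"]  # written in the target language

print(phrases_found_in_source(source_lines, german_phrases))
print(phrases_found_in_source(source_lines, spanish_phrases))
```

The German phrase is found and can be protected; the Spanish one never appears in the source text, so there is nothing to protect.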

Thiru-Malai commented 1 year ago

Yeah, in my example I am translating from Hindi to English, and I don't want some Hindi words to be translated. But when I put the Hindi phrases in donot_translate.csv, they still get translated to English. I guess this feature only applies to certain languages. Is that so?

avighnac commented 1 year ago

Could you provide us with a reproducible example?

Thiru-Malai commented 1 year ago

Yeah, sure…

Case – 1: (Working fine as expected) English -> Hindi

When I translate from English to Hindi, this is my original SRT file:

image

Do_not_translate.csv file:

image

My translated SRT file (for Hindi, working fine):

image

Case – 2: (Not working) Hindi -> English

Original SRT file:

image

Do_not_translate.txt file:

image

I changed the target language to English and the original language to Hindi:

image

image

But in the English output, the Hindi phrases somehow still get translated, as shown below:

image

ThioJoe commented 1 year ago

OK, it turns out the regex wasn't working with Unicode characters. It should be fixed in the latest release, 0.11.2.
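For readers wondering how a regex can work for English but fail for Hindi, here is one common way it happens, sketched in Python. This is an assumption about the general class of bug, not a description of the project's actual code or of the 0.11.2 fix. In Python's `re` module, `\w` does not match combining marks such as Devanagari vowel signs, so a `\b` word boundary can fail at the end of a Hindi word. Both function names below are hypothetical.

```python
import re


def matches_with_word_boundary(phrase, text):
    # \b depends on \w, and \w in Python's re excludes combining marks
    # (for example the Devanagari vowel sign U+0947 at the end of "नमस्ते").
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return re.search(pattern, text) is not None


def matches_with_whitespace_boundary(phrase, text):
    # Alternative sketch: require whitespace or a string edge on both sides,
    # which does not depend on how \w classifies the script's characters.
    pattern = r"(?<!\S)" + re.escape(phrase) + r"(?!\S)"
    return re.search(pattern, text) is not None


english_text = "Please subscribe to the channel"
hindi_text = "सभी को नमस्ते दोस्तों"

print(matches_with_word_boundary("subscribe", english_text))
print(matches_with_word_boundary("नमस्ते", hindi_text))
print(matches_with_whitespace_boundary("नमस्ते", hindi_text))
```

The whitespace-based version is only a sketch: it will not match a phrase that is directly followed by punctuation, so a real implementation would need a broader definition of a boundary.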