R3gm / SoniTranslate

Synchronized Translation for Videos. Video dubbing
Apache License 2.0
467 stars, 105 forks

any way to add srt file in the source? #34

Open ZeppDK opened 5 months ago

ZeppDK commented 5 months ago

Hi. Translating directly from Danish to English has never worked correctly with anything I have tried, but I can get AI-translated subtitles that are 70-80% correct and then edit them to be understandable in English. So is there any way to either modify what SoniTranslate translates, or have it take an SRT file, with its timing, into account when generating the new audio?
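For readers unfamiliar with what "an SRT file with timing" contains, here is a minimal sketch of reading cue timings from SRT text. It is not SoniTranslate's loader; the names `Cue` and `parse_srt` are made up for illustration, and the parser assumes well-formed cues (index line, timing line, then text).

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Cue:
    index: int
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _to_seconds(stamp: str) -> float:
    """Convert 'HH:MM:SS,mmm' into seconds."""
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad SRT timestamp: {stamp!r}")
    h, m, s, ms = (int(g) for g in match.groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(content: str) -> list[Cue]:
    """Parse SRT text into cues; blocks without a timing line are skipped."""
    cues = []
    cleaned = content.lstrip("\ufeff").replace("\r\n", "\n").strip()
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", cleaned):
        lines = block.split("\n")
        if len(lines) < 2 or "-->" not in lines[1]:
            continue
        start, end = lines[1].split("-->")
        # Ignore optional position data after the end timestamp.
        end = end.split()[0]
        cues.append(
            Cue(int(lines[0]), _to_seconds(start), _to_seconds(end), "\n".join(lines[2:]))
        )
    return cues
```

Each cue's `end - start` is the time slot the generated speech for that line would have to fit into.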

R3gm commented 5 months ago

Hello, I am improving the quality of the translation, and in the future, I will add other translation options.

ZeppDK commented 5 months ago

This is absolutely awesome... I spun it up yesterday on an Ubuntu live disc, and the only thing I can't get to work is the voice cloning. Also, if I use my own subtitles, I can't load and edit them, but I think I can work around this by being more precise when I edit in YouTube's editor: changing the time codes to allow longer time slots when deleting unused subs, and making existing subs longer (shorter time frames cause speed-ups in the generated audio).
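The speed-up effect described here can be illustrated with a little arithmetic. This is only a sketch of the general idea (speech longer than its subtitle slot must be played faster to fit), not SoniTranslate's actual logic; the function name and the `max_speedup` cap are assumptions.

```python
def required_speedup(slot_seconds: float, speech_seconds: float, max_speedup: float = 2.0) -> float:
    """Return the playback-rate multiplier needed to fit speech into its slot."""
    if slot_seconds <= 0:
        raise ValueError("slot must have a positive duration")
    factor = speech_seconds / slot_seconds
    # Never slow speech down, and cap the speed-up (the cap value is an assumption).
    return min(max(factor, 1.0), max_speedup)
```

For example, 3 seconds of generated speech in a 2-second slot needs a factor of 1.5, while the same speech in a 4-second slot needs no speed-up at all, which is why lengthening the subtitle slots helps.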

On a side note, it would be awesome to have an easy way to run this on Windows alongside all my video editing, but for now it's just going to be dual boot :tada:

coinnbit commented 3 months ago

Hi. I upload my SRT or ASS file in UTF-8 encoding. I have tried different subtitle languages (English, Russian, French) and it makes no difference; I always get this error: "Error, To use an SRT file, you need to specify its original language (Source language)"; "Warning Make sure to select a 'TTS Speaker' suitable for the translation language to avoid errors with the TTS."

R3gm commented 3 months ago

> Hi. I upload my SRT or ASS file in UTF-8 encoding. I have tried different subtitle languages (English, Russian, French) and it makes no difference; I always get this error: "Error, To use an SRT file, you need to specify its original language (Source language)"; "Warning Make sure to select a 'TTS Speaker' suitable for the translation language to avoid errors with the TTS."

Hi. Maybe I need to improve the message. When you use a subtitle file, you have to specify its language here:

[screenshot: imagen_2024-05-20_155125276]

because automatic language detection is not used with subtitles.

coinnbit commented 3 months ago

> Maybe I need to improve the message. When you use a subtitle file, you have to specify its language here:

Yes, but there is a nuance that is not logical: "This is the original language of the video" and the language of the subtitles are different categories with different meanings.

coinnbit commented 3 months ago

Another important point is that, for uploaded ASS subtitles, the build does not read the role (speaker) assignments. That is, in the editor window, you have to reassign the voices for the voice-over all over again. For a long file this is a serious problem, forcing you to spend considerable time repeating tedious manual work.
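For context on the role assignments mentioned above: in the ASS format, each `Dialogue:` line in the `[Events]` section carries a `Name` field, in the order given by that section's `Format:` line. The sketch below shows how those names could be read with plain Python. It is not part of SoniTranslate, and `read_ass_roles` is a hypothetical helper.

```python
from __future__ import annotations


def read_ass_roles(content: str) -> list[tuple[str, str]]:
    """Return (name, text) pairs for every Dialogue line in the [Events] section."""
    fields = None
    in_events = False
    roles = []
    for raw in content.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Section headers look like [Script Info], [V4+ Styles], [Events].
            in_events = line.lower() == "[events]"
            continue
        if not in_events:
            continue
        if line.startswith("Format:"):
            fields = [f.strip().lower() for f in line[len("Format:"):].split(",")]
        elif line.startswith("Dialogue:") and fields:
            # Text is the last field and may contain commas, so limit the split.
            values = line[len("Dialogue:"):].split(",", len(fields) - 1)
            if len(values) != len(fields):
                continue  # skip malformed lines
            row = dict(zip(fields, (v.strip() for v in values)))
            roles.append((row.get("name", ""), row.get("text", "")))
    return roles
```

The returned names could then be mapped to TTS voices, provided the subtitle author filled in the `Name` field; it is often left empty.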

R3gm commented 3 months ago

> > Maybe I need to improve the message. When you use a subtitle file, you have to specify its language here:
>
> Yes, but there is a nuance that is not logical: "This is the original language of the video" and the language of the subtitles are different categories with different meanings.

In the initial version, only videos were supported, which is why the description was necessary.

> Another important point is that, for uploaded ASS subtitles, the build does not read the role (speaker) assignments. That is, in the editor window, you have to reassign the voices for the voice-over all over again. For a long file this is a serious problem, forcing you to spend considerable time repeating tedious manual work.

Role recognition relies on the audio: Pyannote is used to assign speakers. Therefore, if there is no audio, speakers cannot be identified.

coinnbit commented 3 months ago

> In the initial version, only videos were supported, which is why the description was necessary.

We are talking about the latest release. Why should the user need to know what happened in earlier versions?

> Role recognition relies on audio using Pyannote, assigning speakers. Therefore, if there's no audio, the process of identifying speakers can't be carried out.

I pointed out a problem from a user's perspective, and you answer by explaining how the processing algorithms work. Who are you building this for? If it's just your toy, then I have no more questions.