photkey closed this issue 2 years ago
Interesting idea, I'll give it a shot. Either I'll make a separate program that uses this as a library or add it to the already bundled utility. No idea when I'll manage to do this though
Very much looking forward to it! I've spent days scouring GitHub without finding an ideal project, and so far yours is the best open source project I could find. Waiting for the good news!
Good news, I've added a subrip to mp3 generator in the examples directory.
This is how to use it:
$ python3 ./examples/02_subrip_to_mp3.py test.srt en-US-SaraNeural test.mp3
$ ls -lh test.mp3
-rw------- 1 user user 1.6M Mar 10 12:37 test.mp3
If it works fine for you, let me know so I can close this issue.
I realized it is currently slow when dealing with multi-hour SRT files. I'll make it a bit faster so it's more usable in that scenario.
You're fantastic, the development speed is amazing. I was expecting a long wait... I'll try it out right now and get back to you after testing.
There is an encoding issue (the SRT file is UTF-8 encoded) and it doesn't work.
Error message: UnicodeDecodeError: 'gbk' codec can't decode byte 0xaf in position 49: illegal multibyte sequence
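For context, this error usually means the file was opened with the Windows locale's default codec (GBK here) instead of UTF-8. A minimal sketch of reading the SRT with an explicit encoding follows; `read_srt_text` is a hypothetical helper for illustration, not the actual code in the example script:

```python
from pathlib import Path


def read_srt_text(path):
    """Read an SRT file as UTF-8 regardless of the platform's default codec.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    # "utf-8-sig" decodes plain UTF-8 and also strips a leading byte order mark,
    # which some Windows editors add when saving.
    return Path(path).read_text(encoding="utf-8-sig")
```

Passing the encoding explicitly keeps the behaviour the same on Windows, Linux, and macOS, whatever the system locale is.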
Another suggestion, would it be a little better to read the SRT file and use either of the two SRT libraries below? https://github.com/cdown/srt https://github.com/byroot/pysrt
Could you try now?
I've very rarely used Windows so this seems weird especially because you're admin. Do you have any pointers?
Maybe running this without admin could help?
What does the Chinese say?
PermissionError: [WinError 32] Another program is using this file and the process cannot access it. : 'C:\Users\tuike\AppData\Local\Temp\tmpfo2xxsew.mp3'
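One common cause of WinError 32 with temporary files, which may or may not be what happened here, is that Windows refuses to let a second program open a file while the first still holds it open. A minimal sketch of one way around that, assuming the temp file has to be handed to another process such as ffmpeg; `make_temp_mp3_path` is a hypothetical helper:

```python
import os
import tempfile


def make_temp_mp3_path():
    """Create an empty temp .mp3 file and return its path with no handle left open."""
    fd, path = tempfile.mkstemp(suffix=".mp3")
    os.close(fd)  # release the handle so other processes can open the file
    return path


path = make_temp_mp3_path()
try:
    # Write the audio, then let the file close before anything else touches it.
    with open(path, "wb") as f:
        f.write(b"placeholder audio bytes")
finally:
    os.remove(path)  # mkstemp does not clean up for us
```

The trade-off is that cleanup becomes the caller's job, hence the `try`/`finally`.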
Does it work properly now?
Now no errors are reported and it is able to run through, but it does not generate the MP3 file test.mp3.
I ran it again and it reported an error near the end. In Chinese, it means: the signal timeout has been
Do ffmpeg and ffprobe commands work for you?
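A quick way to check, sketched with the standard library only; `find_tools` is a hypothetical helper and simply reports whether each command resolves on PATH:

```python
import shutil


def find_tools(names=("ffmpeg", "ffprobe")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, location in find_tools().items():
    print(f"{name}: {location or 'not found on PATH'}")
```

This only confirms the executables can be found; it does not confirm that they run correctly.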
Anyway could you try it again? Maybe on a smaller SRT so it doesn't take too much of your time..
Yes, FFmpeg I installed via scoop.
The latest one works fine and successfully generates a playable MP3 file, though FFmpeg seems to print some error messages.
There is one other issue that has a greater impact on the result. When the time slot in the SRT is shorter than the time Microsoft Edge's text-to-speech needs, the speech is automatically sped up, and this is correct. When the time slot in the SRT is longer than the text-to-speech time, the speed should not be adjusted; a default speech rate should be used, with automatic speed-up only in the first situation.
Essentially you don't want the TTS to be sped up to match the SRT?
Yes. This type of dubbing is mainly used for recorded videos, where you don't need to lip-sync; the words just need to finish within the corresponding time slot. It would sound weird if the speech kept switching between fast and slow, and it's normal for a subtitle to stay on screen longer than it takes to speak it. Speeding up is only for the cases where it is unavoidable: if a line overruns its slot, the timing of everything after it is thrown off, so it has to be sped up.
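The rule described here can be sketched as follows. This is only an illustration of the requested behaviour, not necessarily how the script implements it; `speed_factor` and `rate_string` are hypothetical names, and the "+33%" style is an assumption based on the percentage offsets that the edge-tts rate option takes:

```python
def speed_factor(slot_seconds, spoken_seconds):
    """Return how much to speed up speech so it fits its SRT slot.

    Never returns less than 1.0: speech that already fits keeps the default speed.
    """
    if slot_seconds <= 0:
        raise ValueError("slot_seconds must be positive")
    return max(1.0, spoken_seconds / slot_seconds)


def rate_string(factor):
    """Format a speed factor as a signed percentage offset, e.g. 1.5 -> '+50%'."""
    return f"{round((factor - 1.0) * 100):+d}%"


# Speech needs 8 s but the slot is 6 s: speed up.
print(rate_string(speed_factor(6.0, 8.0)))
# Speech needs 4 s and the slot is 6 s: leave the default speed alone.
print(rate_string(speed_factor(6.0, 4.0)))
```

Re-synthesizing at a higher rate may not shorten the audio by exactly that ratio, so a real implementation would probably need to measure the result again.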
To summarize.
Is this what you meant? It now uses argparse as well
Yes, everything is fine and perfect now.
That's very good! I'll bundle it as some kind of extra utility like edge-playback in the future
It's a bit of a shame that this is only used as an example for edge-tts. Similar paid apps are all the rage in China; even tutorials on having the Microsoft Edge browser read text aloud and then capturing it with audio recording software are pretty popular.
Are you suggesting this "example" be a web app or just a CLI utility like edge-tts and edge-playback?
Either as a web application or as a CLI utility, I think it should stand alone and be promoted properly. It could easily take off because short video platforms are so popular nowadays: self-publishers need it, and many software authors need it too.
Obviously, as a web application, there will be more users (including programmers and ordinary self-publishers) because of the low threshold of getting started; as a CLI utility, it will naturally be more popular with the programmer community.
Now standalone, will add a web interface and README to it: https://github.com/rany2/edge-srt-to-speech
You can install it with pip install edge-srt-to-speech
Similar applications. https://voicenotebook.com/srtspeaker.php (Google's text-to-speech is terrible) https://github.com/bdleavitt/azure-text-to-speech-for-dubbing (might have worked well, but didn't run, no reply from the author)
It would make it a lot easier for programmers who aren't comfortable speaking to make screen-recording videos. I once recorded a 13-minute video and the voiceover took four whole days; it was a nightmare, because I would always say the wrong thing and have to start again... I know this project is already great and supports SSML, but editing SSML is also more time consuming, while editing SRT can be done very quickly with the help of subtitle software.
Possible problem: Let's say a sentence in SRT has a timeline of 6 seconds, but using Microsoft Edge's text-to-speech service it takes 8 seconds to actually play the sentence; in this case, you need to automatically adjust the speech speed of the sentence.
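To make the 6-second example concrete, here is a minimal sketch of measuring a cue's time slot from its SRT timing line. `slot_seconds` is a hypothetical helper written for illustration; a dedicated SRT library would also handle this:

```python
import re
from datetime import timedelta

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:07,000".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def slot_seconds(timing_line):
    """Return the length in seconds of the slot described by an SRT timing line."""
    match = TIMING.search(timing_line)
    if match is None:
        raise ValueError(f"not an SRT timing line: {timing_line!r}")
    h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
    start = timedelta(hours=h1, minutes=m1, seconds=s1, milliseconds=ms1)
    end = timedelta(hours=h2, minutes=m2, seconds=s2, milliseconds=ms2)
    return (end - start).total_seconds()


slot = slot_seconds("00:00:01,000 --> 00:00:07,000")
spoken = 8.0  # assumed duration of the synthesized sentence, in seconds
print(f"slot={slot}s spoken={spoken}s overrun={max(0.0, spoken - slot)}s")
```

Comparing the slot length with the measured speech duration is what tells you whether a given sentence needs adjusting at all.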