rany2 / edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
https://pypi.org/project/edge-tts/
GNU General Public License v3.0
5.38k stars 549 forks

Srt file is not in right format and it generates extra lines #67

Closed l3est closed 1 year ago

l3est commented 1 year ago

this is 10.txt's content: سلام دوستان، خوش آمدید به کانال آموزش مهاجرت

and this is the command I'm running: edge-tts --voice fa-IR-DilaraNeural --file 10.txt --write-media 10.mp3 --write-subtitle 10.srt --proxy "http://127.0.0.1:5664"

the audio file is fine but for the subtitle I'm getting this (which is invalid when I try to add it in Premiere Pro):


00:00:00.188 --> 00:00:00.475

سلام

00:00:00.575 --> 00:00:01.100

دوستان

00:00:01.163 --> 00:00:01.413

خوش

00:00:01.450 --> 00:00:01.863

آمدید

00:00:01.863 --> 00:00:01.950

به

00:00:01.962 --> 00:00:02.375

کانال

00:00:02.388 --> 00:00:02.750

آموزش

00:00:02.763 --> 00:00:03.450

مهاجرت

I enabled "show all characters" in Notepad++ so you can see exactly what extra characters are in the file: [image]

the other issue is SRT formatting. a standard SRT file looks like this:

1
00:05:00,400 --> 00:05:15,300
This is an example of
a subtitle.

2
00:05:16,400 --> 00:05:25,300
This is an example of
a subtitle - 2nd subtitle.

edge-tts generates files like this (dots instead of commas for milliseconds and no index before each timestamp):

00:05:00.400 --> 00:05:15.300
This is an example of
a subtitle.

00:05:16.400 --> 00:05:25.300
This is an example of
a subtitle - 2nd subtitle.

rany2 commented 1 year ago

That's because it's a WebVTT file, not an SRT file.

l3est commented 1 year ago

my bad, I thought it was a custom SRT format 😁 I'll just convert it to SRT
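
For anyone landing here with the same problem, the conversion mentioned above can be sketched in a few lines of Python. This is only a minimal sketch, not part of edge-tts: the function name `vtt_to_srt` and the helper `_srt_time` are made up for illustration, and it assumes simple WebVTT input like the output shown in this issue (no cue identifiers, no `NOTE` or `STYLE` blocks). It numbers each cue, swaps the millisecond dot for a comma, and drops the extra blank lines between a timestamp and its text.

```python
import re

# Matches a WebVTT cue timing line such as "00:00:00.188 --> 00:00:00.475".
# WebVTT allows the hour field to be omitted, so it is optional here.
_TIMING = re.compile(
    r"^\s*((?:\d{2,}:)?\d{2}:\d{2})\.(\d{3})\s+-->\s+"
    r"((?:\d{2,}:)?\d{2}:\d{2})\.(\d{3})"
)


def _srt_time(clock: str, millis: str) -> str:
    """Build an SRT timestamp (HH:MM:SS,mmm) from WebVTT parts."""
    if clock.count(":") == 1:  # MM:SS form: SRT needs an hour field
        clock = "00:" + clock
    return f"{clock},{millis}"


def vtt_to_srt(vtt_text: str) -> str:
    """Convert simple WebVTT cues to SRT text (hypothetical helper)."""
    cues = []       # each entry: (timing line, list of text lines)
    current = None  # cue currently collecting text, if any
    for line in vtt_text.splitlines():
        match = _TIMING.match(line)
        if match:
            start = _srt_time(match.group(1), match.group(2))
            end = _srt_time(match.group(3), match.group(4))
            current = (f"{start} --> {end}", [])
            cues.append(current)
        elif current is not None and line.strip():
            # Non-blank lines after a timing line are cue text.
            # Blank lines and anything before the first cue (for
            # example the "WEBVTT" header) are skipped.
            current[1].append(line.strip())
    blocks = [
        "\n".join([str(index), timing, *text])
        for index, (timing, text) in enumerate(cues, start=1)
    ]
    return "\n\n".join(blocks) + "\n" if blocks else ""
```

To use it, read the file written by `--write-subtitle` as UTF-8, pass its text to `vtt_to_srt`, and write the result to a new `.srt` file. Cue settings after the end timestamp are ignored, which seems acceptable for SRT since it has no equivalent.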