rany2 / edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
https://pypi.org/project/edge-tts/
GNU Lesser General Public License v3.0
6.36k stars, 628 forks

The generated VTT file contains unnecessary blank lines. #330

Open ichat006 opened 9 hours ago

ichat006 commented 9 hours ago

I want mpv to display the subtitle file that has the same name as the MP3 file it is playing. However, the VTT subtitle file generated by the edge-tts command contains extra blank lines, so the subtitles fail to display. Details are here: https://github.com/mpv-player/mpv/issues/15352

Additionally, the temporary .mp3 and .vtt file names generated by edge-playback are different and randomly assigned. This makes it impossible to play subtitles directly. Although parameters can be added to control this, it is not very convenient. I suggest generating files with the same name by default.
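
For reference, matching file names can be requested explicitly from the edge-tts command line. The sketch below only builds the argument list; it assumes the `--write-media` and `--write-subtitles` options of the edge-tts CLI, and `build_edge_tts_command` is a hypothetical helper name, not part of the package.

```python
from typing import List


def build_edge_tts_command(text: str, voice: str, stem: str) -> List[str]:
    """Build an edge-tts argv that writes <stem>.mp3 and <stem>.vtt side by side.

    Hypothetical helper for illustration; it only assembles the arguments.
    """
    return [
        "edge-tts",
        "--voice", voice,
        "--text", text,
        "--write-media", f"{stem}.mp3",       # audio output
        "--write-subtitles", f"{stem}.vtt",   # subtitle output, same base name
    ]


# Usage (needs edge-tts installed and network access):
#   import subprocess
#   subprocess.run(build_edge_tts_command("Hello", "zh-CN-XiaoxiaoNeural", "hello"), check=True)
```

Because both files share the base name `hello`, a player that auto-loads subtitles by file name should be able to pick up the subtitle file when opening `hello.mp3`.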

rany2 commented 6 hours ago

Can you try the master branch?

ichat006 commented 5 hours ago

> Can you try the master branch?

I just started using GitHub and I'm not very familiar with it yet. Could you tell me how to use it? Thank you!

rany2 commented 5 hours ago

pip install https://github.com/rany2/edge-tts/archive/refs/heads/master.zip

ichat006 commented 4 hours ago

> pip install https://github.com/rany2/edge-tts/archive/refs/heads/master.zip

Subtitles can now be displayed, but they are shown vertically instead of horizontally.

rany2 commented 4 hours ago

@ichat006 The new version should be working, try master again please.

ichat006 commented 4 hours ago

@rany2 It's still the same issue, no changes.

Try:

edge-playback --text "我记得我老爸手画的我家祖宅的四合院,好像是两进。不过他画的一塌糊涂,我愣是看不懂我祖父和他们兄弟四个是怎么住的。后来回祖宅去看了一眼,那早就被改的一塌糊涂了,完全看不明白原来的格局。" --voice zh-CN-XiaoxiaoNeural

rany2 commented 3 hours ago

@ichat006 Very odd, it's fixed for me. Maybe GitHub was serving a cached master zip? Try again in a bit.

ichat006 commented 3 hours ago

@rany2 Subtitles can now be displayed, but each cue shows only a single character, which is not usable. Before the fix, the generated subtitle file contained unnecessary blank lines. After removing those blank lines by hand, I got the following correct file, which is what I need: hello.vtt-ok.zip

This is the subtitle file that is generated now, which is not usable: hello.vtt-Now.zip
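
The manual cleanup described above could be scripted roughly as follows. This is a sketch under an assumption: the attached files are not reproduced in this thread, so it guesses that the unwanted blank lines are either repeated blank lines or a blank line directly after a timing line. `tidy_vtt` is a hypothetical name.

```python
def tidy_vtt(text: str) -> str:
    """Drop blank lines that follow a timing line and collapse repeated blank lines.

    Assumption: these are the "unnecessary blank lines"; the real files may differ.
    """
    out = []
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:
            # Skip a blank line at the top, after another blank, or after "start --> end".
            if not out or out[-1] == "" or "-->" in out[-1]:
                continue
        out.append(line)
    # Remove trailing blank lines and end with a single newline.
    while out and out[-1] == "":
        out.pop()
    return "\n".join(out) + "\n"
```

A single blank line between cues is kept, since that blank line is what separates cues in WebVTT.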

rany2 commented 3 hours ago

I'm aware. I'm trying to figure out how to create proper subtitles, but unfortunately it's not an easy task and I need help with it.

Would it work to just keep each character on screen for a few seconds so it doesn't disappear immediately?
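
A minimal sketch of that idea, assuming cues are already available as `(start, end, text)` tuples in seconds (a hypothetical representation, not edge-tts's internal one): each cue is held for at least `min_duration`, but never past the start of the next cue.

```python
from typing import List, Tuple

Cue = Tuple[float, float, str]  # (start_seconds, end_seconds, text), hypothetical layout


def hold_cues(cues: List[Cue], min_duration: float = 2.0) -> List[Cue]:
    """Extend each cue to at least min_duration without overlapping the next cue."""
    held = []
    for i, (start, end, text) in enumerate(cues):
        new_end = max(end, start + min_duration)
        if i + 1 < len(cues):
            next_start = cues[i + 1][0]
            # Cap at the next cue's start, but never shorten the original cue.
            new_end = min(new_end, max(end, next_start))
        held.append((start, new_end, text))
    return held
```

Note that in continuous speech the next cue starts right away, so the cap dominates and the hold only helps before pauses or at the end. Single-character cues would still flash by one at a time.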

rany2 commented 29 minutes ago

The biggest issue is that the WordBoundary data Microsoft returns doesn't necessarily match the input text. Microsoft internally transforms the input text to expand acronyms, numbers, etc., so it's not as simple as matching the input text against the WordBoundary events.
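
One way to sidestep the matching problem is to build cues only from the text the service returns, grouping several WordBoundary events into one cue. The sketch below is not the project's implementation. It assumes each event is a dict with `offset`, `duration` and `text` keys, and that offsets are in 100-nanosecond units; `group_boundaries` is a hypothetical name.

```python
from typing import Dict, List, Tuple

TICKS_PER_SECOND = 10_000_000  # assumption: offset/duration are in 100 ns units


def group_boundaries(
    events: List[Dict], words_per_cue: int = 10, joiner: str = " "
) -> List[Tuple[float, float, str]]:
    """Merge consecutive WordBoundary-style events into (start, end, text) cues."""
    if words_per_cue < 1:
        raise ValueError("words_per_cue must be at least 1")
    cues = []
    for i in range(0, len(events), words_per_cue):
        chunk = events[i:i + words_per_cue]
        start = chunk[0]["offset"] / TICKS_PER_SECOND
        last = chunk[-1]
        end = (last["offset"] + last["duration"]) / TICKS_PER_SECOND
        # Use the returned text only; the original input text is never consulted.
        cues.append((start, end, joiner.join(e["text"] for e in chunk)))
    return cues
```

For Chinese text an empty `joiner` is probably the better choice. The trade-off is that punctuation and the original spelling of expanded acronyms or numbers are lost, because only the returned words appear in the cues.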