rany2 / edge-srt-to-speech

Convert SubRip to speech using Microsoft Edge's TTS service
https://pypi.org/project/edge-srt-to-speech/
GNU General Public License v3.0

[Feature Request] Export rate information in JSON format #19

Open photkey opened 9 months ago

photkey commented 9 months ago

In actual use, some segments are read too quickly to be heard clearly. I would therefore like the tool to export the original srt file in JSON format alongside the audio file, with the actual reading speed recorded for each text segment. With this JSON file, we could get better results by re-editing the video or the srt text.

srt:

1
00:05:00,400 --> 00:05:15,300
If you want to use the edge-tts command, you can simply run it with the following command:

2
00:05:16,400 --> 00:05:25,300
Note the above requires the installation of the mpv command line player.

json: In the following example, rate represents the actual reading speed.

{
  "subtitles": [
    {
      "id": "1",
      "text": "If you want to use the edge-tts command, you can simply run it with the following command:",
      "start_time": "00:05:00.400",
      "end_time": "00:05:15.300",
      "rate": 1.8
    },
    {
      "id": "2",
      "text": "Note the above requires the installation of the mpv command line player.",
      "start_time": "00:05:16.400",
      "end_time": "00:05:25.300",
      "rate": 1.2
    }
  ]
}
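
As a rough illustration of the requested output shape, here is a minimal sketch that turns SRT text into the JSON structure above. It is not code from edge-srt-to-speech: the function name `srt_to_report` and the `rates` mapping (cue id to measured rate) are hypothetical, and the sketch assumes the rate values are supplied by whatever step measures them.

```python
import json
import re

# Matches one SRT cue: index line, timing line, then one or more text lines.
CUE_RE = re.compile(
    r"(\d+)\s*\n"
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})\s*\n"
    r"(.+?)(?:\n\s*\n|\Z)",
    re.DOTALL,
)


def srt_to_report(srt_text, rates):
    """Build the proposed JSON structure from SRT text.

    `rates` is a hypothetical mapping from cue id (as a string) to the
    measured reading speed; cues without an entry get a rate of None.
    """
    subtitles = []
    for match in CUE_RE.finditer(srt_text.strip() + "\n"):
        cue_id, start, start_ms, end, end_ms, text = match.groups()
        subtitles.append({
            "id": cue_id,
            # Collapse line breaks inside a cue into single spaces.
            "text": " ".join(text.split()),
            # SRT uses a comma before the milliseconds; the example uses a dot.
            "start_time": f"{start}.{start_ms}",
            "end_time": f"{end}.{end_ms}",
            "rate": rates.get(cue_id),
        })
    return {"subtitles": subtitles}


def report_to_json(report):
    """Serialize the report with the same indentation as the example."""
    return json.dumps(report, indent=2, ensure_ascii=False)
```

Calling `report_to_json(srt_to_report(srt_text, {"1": 1.8, "2": 1.2}))` on the srt sample above would produce a document with the same fields as the example.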

photkey commented 9 months ago

I thought about it again, and perhaps a better approach would be to add a parameter that exports only the JSON. In that case, the text would only need to be read once: segments that have to be read at an accelerated pace would not need to be read again, and there would be no need to merge all the audio files in the final step. Only a JSON file containing the rate information would be exported, which would make the process more efficient for this particular use case.
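
One way the rate in such a JSON-only mode could be derived is sketched below. This is an assumption, not how edge-srt-to-speech defines it: the sketch treats the rate as the duration of the audio synthesized once at normal speed, divided by the length of the subtitle's time slot. The helper names `to_ms` and `required_rate` are hypothetical.

```python
def to_ms(timestamp):
    """Convert 'HH:MM:SS.mmm' (or the SRT form 'HH:MM:SS,mmm') to milliseconds."""
    hours, minutes, rest = timestamp.replace(",", ".").split(":")
    seconds, millis = rest.split(".")
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def required_rate(natural_seconds, start_time, end_time):
    """Speed factor needed to fit audio of `natural_seconds` into the slot.

    Assumption: rate = natural audio duration / slot duration, so a value
    above 1.0 means the segment has to be read faster than normal.
    """
    slot_ms = to_ms(end_time) - to_ms(start_time)
    if slot_ms <= 0:
        raise ValueError("end_time must be later than start_time")
    return round(natural_seconds * 1000 / slot_ms, 2)
```

Under this assumption, each segment is synthesized a single time to measure its natural duration, and only the resulting rates are written to the JSON file; no accelerated re-read or audio merge is involved.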