alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0

Enhanced subtitle generation with CLI options #1319

Open damascene opened 1 year ago

damascene commented 1 year ago

I'm impressed by the quality of the generated subtitles in English. I would like to see enhanced functionality in the form of CLI options that give more control over the generated subtitles.

Suggested parameters

Issue

When working on auto-generated subtitles, I see that many sentences end at non-optimal positions. I would like to be able to easily increase "word by line" and "silence to new line", so that I can later slice the lines as needed instead of going back and forth manually between lines just to move one word to the previous or the next line.

Example

Vosk generated:

11
00:00:50,970 --> 00:00:54,270
millions of Americans motorcycles represent freedom rugged

12
00:00:54,270 --> 00:00:57,120
individualism the pleasures of roaring along the

13
00:00:57,120 --> 00:01:00,390
open road while the wind streams streams

14
00:01:00,390 --> 00:01:01,200
through your hair

Clearly, I have to move a few words between lines and fix the timing.

If I could modify the vosk-transcriber parameters, it would become:

11
00:00:50,707 --> 00:00:57,659
for millions of Americans motorcycles represent freedom rugged individualism the pleasures of roaring along the open road

12
00:00:57,660 --> 00:01:01,200
while the wind streams streams through your hair

Then I can easily slice them at the points I want using tools like subtitld, which lets me select a position in a subtitle and split the line there.

damascene commented 1 year ago

Hey there, any way to solve this?

nshmyrev commented 1 year ago

One has to write some code...
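
A minimal sketch of what such code could look like, for anyone landing here. It is not part of vosk-transcriber and does not call Vosk at all: it assumes you already have the word-level results as a list of dicts with "word", "start" and "end" keys (times in seconds), which is the shape Vosk reports per word when word output is enabled. The function names and the `max_words` / `max_gap` parameters are hypothetical stand-ins for the "word by line" and "silence to new line" settings requested above, and the default values are arbitrary.

```python
from typing import Dict, List

Word = Dict[str, object]  # assumed keys: "word" (str), "start" (float), "end" (float)


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def group_words(words: List[Word], max_words: int = 7,
                max_gap: float = 0.5) -> List[List[Word]]:
    """Split words into subtitle lines.

    A new line starts when the current line already holds max_words words,
    or when the silence before the next word is longer than max_gap seconds.
    Both parameters are hypothetical names, not existing Vosk options.
    """
    lines: List[List[Word]] = []
    current: List[Word] = []
    for word in words:
        if current:
            gap = word["start"] - current[-1]["end"]
            if len(current) >= max_words or gap > max_gap:
                lines.append(current)
                current = []
        current.append(word)
    if current:
        lines.append(current)
    return lines


def to_srt(words: List[Word], max_words: int = 7, max_gap: float = 0.5) -> str:
    """Render the grouped words as SRT text, one numbered block per line."""
    blocks = []
    grouped = group_words(words, max_words, max_gap)
    for index, line in enumerate(grouped, start=1):
        start = format_timestamp(line[0]["start"])
        end = format_timestamp(line[-1]["end"])
        text = " ".join(str(w["word"]) for w in line)
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)
```

Raising `max_words` and `max_gap` gives longer lines that break only at longer pauses, which can then be sliced by hand in a subtitle editor. Each line's timing is taken from its first and last word, so no manual timing fix is needed after regrouping.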