05:14:41.490 --> 05:14:42.820
any possible chance
05:14:42.820 --> 05:46:50.320
this text remains for 32 mins in subs
05:46:50.320 --> 05:14:50.420
which is correct
05:14:50.420 --> 05:14:55.710
and transcription is correct
05:14:55.710 --> 05:14:57.460
but it runs over
05:14:57.460 --> 05:15:03.590
so gotta come up with a sed or awk script
05:15:03.590 --> 05:15:05.300
to detect if say subtitle duration
05:15:05.300 --> 05:15:11.220
exceeds 2 mins let's say
I had this happen yesterday too, I think in a 48-hour audiobook I was transcribing. It just happened again today with a 10-hour audiobook. What happens is that the cue
this text remains for 32 mins in subs
stays on screen at the bottom for 32 minutes, and the new subtitles, as many as can fit, are shown just above it.
To correct this by hand, just change
05:14:42.820 --> 05:46:50.320
to
05:14:42.820 --> 05:14:50.320
05:46:50.320 --> 05:14:50.420
to
05:14:50.320 --> 05:14:50.420
In both cases the fix is just changing xx:46:xx.xxx to xx:14:xx.xxx.
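For this one file, the manual edit above could be done with sed. This is only a sketch: fix_glitch is a made-up helper name, and the two timestamps are hard-coded from this particular glitch. It only touches lines that also contain 05:14:, so that a genuine cue starting at 05:46:50.320 later in the file should be left alone.

```shell
# Hypothetical helper: repair the one glitched timestamp in a single VTT file.
# Assumes the bad value is exactly 05:46:50.320 and the right one is 05:14:50.320.
fix_glitch() {
  # Restrict the substitution to lines that also mention 05:14:, i.e. the two
  # broken timing lines. -i.bak (no space) keeps a backup and is accepted by
  # both GNU and BSD sed.
  sed -i.bak '/05:14:/s/05:46:50\.320/05:14:50.320/' "$1"
}
```

Usage would be something like `fix_glitch book.vtt`, where book.vtt stands in for the affected file; the original is kept as book.vtt.bak.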
My current command, which pipes WAV from ffmpeg into whisper.cpp with a max segment length of 78 characters (-ml 78) and splitting on words (-sow):
for f in *.opus ; do ffmpeg -i "$f" -f wav -ar 16000 -ac 1 - | ~/whisper/whisper.cpp/./main -m ~/whisper/whisper.cpp/models/ggml-medium.en.bin - -ovtt -of "$f" -l en -ml 78 -sow -t 8 ; for f in *.vtt ; do sed -r -i .bak -e 's|Yellow|yellow|g' -e 's|blue|Blue|g' -e 's|Pink|pink|g' "$f" ; done && for i in *opus.vtt ; do mv -i -- "$i" "$(printf '%s\n' "$i" | sed '1s/.opus.vtt/.vtt/')" ; mkdir vttsubs/ ; mv *.vtt vttsubs/ ; done && rm *.bak ; done
I'll try to figure out an awk script that automatically flags any subtitle cue whose duration exceeds, say, 2 minutes.
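A minimal sketch of such a check, assuming the timing lines look like `HH:MM:SS.mmm --> HH:MM:SS.mmm` as in the excerpt above. The function name find_bad_cues and the MAX_SECS variable are names made up for illustration, not anything from whisper.cpp. Besides cues longer than the limit, it also flags cues whose end time is before the start time, since the second broken cue here has a negative duration.

```shell
# Hypothetical helper: list WebVTT cues that run too long or run backwards.
# MAX_SECS (assumed name) sets the limit in seconds; the default is 120.
find_bad_cues() {
  awk -v max="${MAX_SECS:-120}" '
    # Convert HH:MM:SS.mmm (or MM:SS.mmm) to seconds.
    function secs(t,   p, n) {
      n = split(t, p, ":")
      return (n == 3) ? p[1] * 3600 + p[2] * 60 + p[3] : p[1] * 60 + p[2]
    }
    # Timing lines have "-->" as their second field.
    $2 == "-->" {
      d = secs($3) - secs($1)
      if (d < 0 || d > max)
        printf "%s:%d: %s (%.3f s)\n", FILENAME, FNR, $0, d
    }
  ' "$@"
}
```

It could be run as `find_bad_cues vttsubs/*.vtt`, or with `MAX_SECS=60 find_bad_cues vttsubs/*.vtt` for a stricter limit. Each hit is printed with the file name and line number so the timestamp can be fixed by hand.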