Open teleshoes opened 1 year ago
- `vosk-timing-data` - run `vosk-words-json` on WAV files, get statistics on each word
- `audio-word-timing` - process `vosk-timing-data` into a CSV with three columns: AUDIO_WORD,START_TIME_SECONDS,AUDIO_FILE
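
As a rough illustration of the `audio-word-timing` step, the sketch below flattens per-word recognizer output into the three CSV columns. It assumes `vosk-words-json` emits Vosk-style result objects (a `result` list whose entries carry `word` and `start`); the function names, the header row, and the sample file name `ch01.wav` are hypothetical and not taken from the script.

```python
import csv
import io


def vosk_results_to_rows(results, audio_file):
    """Flatten Vosk-style result dicts into (word, start_seconds, audio_file) rows."""
    rows = []
    for result in results:
        # Assumption: each result has a "result" list of per-word dicts.
        for entry in result.get("result", []):
            rows.append((entry["word"], entry["start"], audio_file))
    return rows


def rows_to_csv(rows):
    """Render rows as CSV text; the header row is an assumption for readability."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["AUDIO_WORD", "START_TIME_SECONDS", "AUDIO_FILE"])
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample in the shape Vosk produces when word output is enabled.
sample = [
    {
        "result": [
            {"conf": 1.0, "start": 0.12, "end": 0.45, "word": "call"},
            {"conf": 0.98, "start": 0.45, "end": 0.6, "word": "me"},
        ],
        "text": "call me",
    }
]
rows = vosk_results_to_rows(sample, "ch01.wav")
csv_text = rows_to_csv(rows)
```

Dropping the middle column from these rows would give the `audio-word-list` described in the next step.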
- `audio-word-list` - make a copy of `audio-word-timing` and remove the START_TIME_SECONDS column
- `ebook-word-list` - process the EPUB/FB2/TXT file into a list of words (one word per line) with pandoc
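
A minimal sketch of the `ebook-word-list` idea, assuming pandoc is on the PATH and that words should be lowercased to resemble recognizer output. The regex, the lowercasing, and both function names are assumptions for illustration; the actual script may tokenize differently.

```python
import re
import subprocess

# Assumption: a "word" is a run of ASCII letters, digits, or apostrophes.
WORD_RE = re.compile(r"[a-z0-9']+")


def text_to_words(text):
    """Split plain text into lowercase words, one list entry per word."""
    return WORD_RE.findall(text.lower())


def ebook_to_words(path):
    """Convert an ebook to plain text with pandoc, then split it into words.

    pandoc guesses the input format from the file extension.
    """
    plain = subprocess.run(
        ["pandoc", "--to", "plain", path],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return text_to_words(plain)


words = text_to_words("Call me Ishmael. Some years ago--never mind!")
# Writing "\n".join(words) to a file gives one word per line.
```

Using the same `text_to_words` helper on both the ebook text and any later sentence text keeps the two word lists comparable.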
- `ebook-audio-diff` - align `audio-word-list` and `ebook-word-list` using `diff -y`
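
To show what the alignment produces, here is a sketch that uses Python's `difflib.SequenceMatcher` as a stand-in for `diff -y`. This is a swapped-in technique, not what the script does: `SequenceMatcher` finds matching blocks heuristically rather than computing a strict longest common subsequence. The function name `align_words` and the sample words are hypothetical.

```python
from difflib import SequenceMatcher


def align_words(ebook_words, audio_words):
    """Map ebook word indexes to audio word indexes for words matched in order."""
    matcher = SequenceMatcher(a=ebook_words, b=audio_words, autojunk=False)
    mapping = {}
    for block in matcher.get_matching_blocks():
        # Each block says: ebook[a:a+size] == audio[b:b+size].
        for offset in range(block.size):
            mapping[block.a + offset] = block.b + offset
    return mapping


ebook = ["chapter", "one", "call", "me", "ishmael", "some", "years", "ago"]
audio = ["call", "me", "ishmael", "sum", "years", "ago"]
mapping = align_words(ebook, audio)
```

Ebook words that have no entry in `mapping` (here the heading words and the misrecognized word) are the ones that need a start time borrowed from a neighbouring matched word.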
- `ebook-word-timing` - combine `ebook-word-list`, `ebook-audio-diff`, and `audio-word-timing` to get ebook timing
  - start with `ebook-word-list` and add two columns, START_TIME_SECONDS and AUDIO_FILE
  - using `ebook-audio-diff`, find the highest index of the word in `audio-word-list` that is not part of the longest-common-subsequence of a later word
  - look up that index in `audio-word-timing`, and fill in the START_TIME_SECONDS/AUDIO_FILE columns with it
  - the result is written as `*.wordtiming`, and is the final output of the perl script
- `sentence-info` - when audiobook-tts starts, navigate to each sentence and get info
  - the info is `sentence-text` and the `dom-start-pos`
  - `dom-start-pos` is `ldomXPointerEx->toString()` (it never has any commas)
  - saved as `*.sentenceinfo`, if coolreader has write perms where the ebook is
- `sentence-words` - parse each sentence into a list of words, as close to step P4) as possible
- `sentence-start-times` - compare `sentence-words` to the `ebook-word-timing` file
  - read `*.wordtiming`, parse into a list of word/start-time pairs
  - match each word in `sentence-words` to a word in wordtiming
- `start-playback` - play audiobook instead of TTS
  - uses `sentence-start-times`
- `continue-playback` - select the next sentence as audiobook position continues
  - using `sentence-start-times`, select the next sentence as if user clicked Next >>
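
The `sentence-start-times` and `continue-playback` steps could look roughly like the sketch below. It is written in Python for brevity rather than as CoolReader code, and it rests on several assumptions: word/start-time pairs are already parsed, times never decrease (a single audio file, so AUDIO_FILE is ignored), and words are matched greedily within a small look-ahead `window`. All names and the window size are hypothetical.

```python
from bisect import bisect_right


def sentence_start_times(sentences, word_timing, window=20):
    """Return one start time per sentence (None if no word could be matched).

    sentences:   list of word lists, in reading order
    word_timing: list of (word, start_seconds) pairs, in ebook order
    """
    starts = []
    cursor = 0  # next unconsumed position in word_timing
    for words in sentences:
        start = None
        for word in words:
            limit = min(len(word_timing), cursor + window)
            for i in range(cursor, limit):
                if word_timing[i][0] == word:
                    if start is None:
                        start = word_timing[i][1]
                    cursor = i + 1
                    break
        starts.append(start)
    return starts


def sentence_at(starts, position_seconds):
    """Index of the sentence playing at position_seconds, skipping unknown starts."""
    known = [(t, i) for i, t in enumerate(starts) if t is not None]
    if not known:
        return None
    times = [t for t, _ in known]
    pos = bisect_right(times, position_seconds) - 1
    return known[max(pos, 0)][1]


word_timing = [("call", 0.1), ("me", 0.4), ("ishmael", 0.7),
               ("some", 1.5), ("years", 1.8), ("ago", 2.1)]
sentences = [["call", "me", "ishmael"], ["some", "years", "ago"]]
starts = sentence_start_times(sentences, word_timing)
```

Polling the audio position and calling `sentence_at` would then tell the reader when to advance the selection, which corresponds to selecting the next sentence as playback continues.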
moved ebook-audiobook-wordtiming to its own repo: https://github.com/teleshoes/ebook-audiobook-wordtiming
NOTE: this is a working audiobook impl, but it is FAR from polished, and it is NOT plug-and-play. i am already using it all the time, though, so i figured i'd stick it here in case someone else is interested.
FEATURES
USAGE
- generate a `.wordtiming` text file for each e-book OUTSIDE OF COOLREADER
  - `ebook-audiobook-wordtiming` is a script for generating this using vosk, pandoc, gnu-diff, python, + perl

EXAMPLE