KoljaB / RealtimeSTT

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
MIT License
2.09k stars, 190 forks

Extract Phonemes from script. #92

Open Toolfolks opened 3 months ago

Toolfolks commented 3 months ago

Is it possible to get phoneme timing data? I am trying to drive a video that has phonemes at known positions, e.g. "oo" at 3.4 sec to 4.1 sec.

The idea is to drive the video by skipping to the position of each phoneme, to simulate live lip sync.

For example, with a phoneme list (P1, P2, etc.), the sequence might be P4 P5 P3 P4 P2 P3 P3 P4 for the text "This is my text from realtime TTS".

That way, as the audio plays, the video stays in sync.
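To make the idea concrete, here is a minimal sketch of the lookup this would need. It assumes the phoneme timings are available as (label, start, end) tuples. That format, the sample values, and the names `AUDIO_PHONEMES`, `VIDEO_POSITIONS` and `video_time_for` are all hypothetical and not part of RealtimeSTT or RealtimeTTS.

```python
from bisect import bisect_right

# Hypothetical input: (phoneme, start_sec, end_sec) for the spoken audio.
# Illustrative values only; no library is assumed to emit this format.
AUDIO_PHONEMES = [
    ("P4", 0.0, 0.3),
    ("P5", 0.3, 0.7),
    ("P3", 0.7, 1.2),
]

# Hypothetical map: where each phoneme's mouth shape sits in the source video.
VIDEO_POSITIONS = {
    "P3": (5.0, 5.5),
    "P4": (3.4, 4.1),
    "P5": (1.0, 1.4),
}

# Sorted start times, used to find the active phoneme by binary search.
_STARTS = [start for _, start, _ in AUDIO_PHONEMES]


def video_time_for(audio_time):
    """Return the video timestamp to seek to for an audio playback time.

    Returns None when no phoneme is active at that time.
    """
    i = bisect_right(_STARTS, audio_time) - 1
    if i < 0:
        return None  # before the first phoneme
    phoneme, start, end = AUDIO_PHONEMES[i]
    if audio_time >= end:
        return None  # past the end of the last phoneme, or in a gap
    clip_start, clip_end = VIDEO_POSITIONS[phoneme]
    # Map progress through the spoken phoneme onto the video clip's span.
    progress = (audio_time - start) / (end - start)
    return clip_start + progress * (clip_end - clip_start)
```

A player loop would call `video_time_for(current_audio_time)` on each tick and seek the video whenever the result is not `None`.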

Also, if possible, I would like the list of phonemes used by RealtimeTTS.