Open Toolfolks opened 3 months ago
Is it possible to get this data? I am trying to drive a video that has phonemes at known positions, e.g. "oo" at 3.4 s to 4.1 s.
The idea is to seek (skip) within the video to simulate live lip sync.
For example, my text from RealtimeTTS would give a phoneme list such as: P1 P2 etc. P4 P5 P3 P4 P2 P3 P3 P4.
So as the audio plays, the video stays in sync.
Also, if possible, I would like the list of phonemes used by RealtimeTTS.
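
To make the request concrete, here is a rough sketch of what I would do with the data if it were available. Everything in it is an assumption for illustration: the phoneme labels, the video positions in `VIDEO_SEGMENTS`, the `(phoneme, audio_start, audio_end)` tuple shape, and the helper names `build_seek_schedule` and `seek_at`. I do not know whether RealtimeTTS exposes phoneme timings in this or any other form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: float  # seconds into the reference video
    end: float


# Hypothetical map: where each phoneme's mouth shape sits in my video.
VIDEO_SEGMENTS = {
    "oo": Segment(3.4, 4.1),
    "ah": Segment(0.5, 1.0),
    "m": Segment(1.6, 1.9),
}


def build_seek_schedule(timed_phonemes, segments):
    """Turn (phoneme, audio_start, audio_end) tuples into seek instructions.

    Returns a list of (audio_start, video_start, play_for) tuples.
    Phonemes with no matching video segment are skipped, so the
    previous mouth shape stays on screen.
    """
    schedule = []
    for phoneme, audio_start, audio_end in timed_phonemes:
        seg = segments.get(phoneme)
        if seg is None:
            continue
        # Never play past the end of the segment recorded for this phoneme.
        play_for = min(audio_end - audio_start, seg.end - seg.start)
        schedule.append((audio_start, seg.start, play_for))
    return schedule


def seek_at(schedule, audio_time):
    """Return the video position to show at audio_time, or None before the first entry."""
    current = None
    for audio_start, video_start, play_for in schedule:
        if audio_start > audio_time:
            break
        offset = min(audio_time - audio_start, play_for)
        current = video_start + offset
    return current
```

The player would call `seek_at` with the current audio playback time and seek the video to the returned position. What I am missing is the timed phoneme input on the TTS side.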