Closed YiYinYinguu closed 1 year ago
You can extract speech marks via the https://talkify.net/api/speech/v1/marks endpoint, as described at https://manage.talkify.net/docs#api-reference-speech-speech-marks
Example response:
[{"Word":"Proofread","Position":100,"CharPosition":0,"CharPositionOffset":0},{"Word":"your","Position":565,"CharPosition":10,"CharPositionOffset":0}]
Here, "Position" is "The starting position (in ms) in the audio stream for the spoken word".
There is no way to get the end time of a spoken word directly, but you can estimate it from the start position of the next word.
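A minimal sketch of that estimate in Python, using the example response above. The function name `estimate_word_spans` and the optional `audio_duration_ms` parameter are my own, not part of the Talkify API. Note that the estimated end includes any pause before the next word, so treat it as an upper bound. The last word has no successor, so its end stays `None` unless you pass in the total audio length yourself.

```python
import json

def estimate_word_spans(marks, audio_duration_ms=None):
    """Return (word, start_ms, end_ms) tuples.

    end_ms is approximated by the start of the next word. For the last
    word it falls back to audio_duration_ms (hypothetical parameter; None
    when the total audio length is unknown).
    """
    spans = []
    for i, mark in enumerate(marks):
        start = mark["Position"]
        if i + 1 < len(marks):
            end = marks[i + 1]["Position"]  # next word's start as an estimate
        else:
            end = audio_duration_ms
        spans.append((mark["Word"], start, end))
    return spans

# The example response from the marks endpoint shown above.
marks = json.loads(
    '[{"Word":"Proofread","Position":100,"CharPosition":0,"CharPositionOffset":0},'
    '{"Word":"your","Position":565,"CharPosition":10,"CharPositionOffset":0}]'
)

spans = estimate_word_spans(marks)
print(spans)  # [('Proofread', 100, 565), ('your', 565, None)]
```

If you know the duration of the generated audio file, pass it as `audio_duration_ms` so the final word gets an end time as well.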
I need to use the generated audio to create videos, so I need to know the start and end time of each word. I would appreciate a reply.