word by word timestamp or "boundary" event

rhasspy / larynx

End to end text to speech system using gruut and onnx

MIT License


This has been added in Larynx 1.0 via the <mark> SSML tag! It currently only works between sentences, however.

There are two ways to make use of it:

Use --mark-file on the command line to have the name of each mark printed as it's encountered:

larynx -v en --ssml --mark-file /dev/stderr '<mark name="start" />This is a test.<mark name="end" />'

This will print "start" to standard error, say the sentence, then print "end".

Programmatically, from the results of the larynx.text_to_speech API. The TextToSpeechResult object (yielded for each sentence) has marks_before and marks_after lists containing the names of the marks that were encountered around that sentence.
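A rough sketch of how the programmatic route could be consumed. It deliberately does not call Larynx itself: FakeResult is a hypothetical stand-in that models only the marks_before and marks_after lists described above, and speak_with_marks, on_mark and play are illustrative names, not part of the Larynx API. In real use you would pass in the results yielded by larynx.text_to_speech instead of the fake list.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class FakeResult:
    """Hypothetical stand-in for a per-sentence TTS result.

    Only the two mark lists mentioned in the thread are modeled;
    the real TextToSpeechResult carries audio and other fields too.
    """
    text: str
    marks_before: List[str] = field(default_factory=list)
    marks_after: List[str] = field(default_factory=list)


def speak_with_marks(
    results: Iterable,
    on_mark: Callable[[str], None],
    play: Callable[[object], None],
) -> None:
    """Fire on_mark for each mark name, in order, around each sentence."""
    for result in results:
        # Marks that appeared before this sentence in the SSML.
        for name in result.marks_before or []:
            on_mark(name)
        # Play (or otherwise handle) the sentence itself.
        play(result)
        # Marks that appeared after this sentence.
        for name in result.marks_after or []:
            on_mark(name)


# Mirrors the command-line example: <mark name="start" />This is a test.<mark name="end" />
events = []
speak_with_marks(
    [FakeResult("This is a test.", ["start"], ["end"])],
    on_mark=lambda name: events.append(("mark", name)),
    play=lambda result: events.append(("play", result.text)),
)
print(events)
```

Because marks only work between sentences for now, the callbacks give sentence-level boundaries, not word-by-word timestamps.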
