p-groarke / wsay

Windows "say"
BSD 3-Clause "New" or "Revised" License

How does speech XML work (which format)? #12

caydenmascarenhas closed this issue 2 years ago

caydenmascarenhas commented 2 years ago

When you said we could use speech XML I was quite happy, as I wanna do stuff like add 5 second pauses, etc. However, I'm not sure which speech XML format this uses. I've tried the regular macOS speech markup (with [[slnc 5000]] for a 5 second pause), the SAPI TTS XML (), and the SSML markup (). All of these just make it read the tag out instead of pausing for 5 seconds.

p-groarke commented 2 years ago

Hey BigFrog, the XML it supports is Microsoft's own: https://docs.microsoft.com/en-us/previous-versions/windows/desktop/ms717077(v=vs.85)

Haven't tested it in a while though. Let me know if you have any issues with it.

edit : I just noticed you tried SAPI. I'll give it a shot when I have some time to debug.

p-groarke commented 2 years ago

@BigFrogWithHat Just a quick update: it seems the XML doesn't work when passed on the command line, but it works here when read from an input text file.

Are you experiencing the same?

edit : You can try a text file containing this XML. To read from a text file, run wsay -i test.txt

<volume level="50">test</volume><volume level="100">test</volume>
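
For anyone who wants to script the text-file route described above, here is a minimal sketch in Python. It writes the markup to a file and builds the `wsay -i` call from this thread. The helper names (`write_speech_file`, `wsay_command`) are made up for illustration. The `<silence msec="5000"/>` line is an assumption: that tag is part of SAPI 5 TTS XML, but nobody in this thread confirmed it gives a 5 second pause with wsay.

```python
import shutil
import subprocess
from pathlib import Path

# First line: the volume markup posted in this thread.
# Second line: ASSUMPTION, a SAPI <silence> tag for the 5 second pause;
# it was not verified against wsay in this thread.
SAPI_TEXT = (
    '<volume level="50">test</volume><volume level="100">test</volume>\n'
    'Before the pause. <silence msec="5000"/> After the pause.\n'
)


def write_speech_file(path, text=SAPI_TEXT):
    """Write the markup to a text file and return its path (hypothetical helper)."""
    path = Path(path)
    path.write_text(text, encoding="utf-8")
    return path


def wsay_command(path):
    """Build the command from the thread: wsay -i <file> (hypothetical helper)."""
    return ["wsay", "-i", str(path)]


if __name__ == "__main__":
    speech_file = write_speech_file("test.txt")
    cmd = wsay_command(speech_file)
    if shutil.which("wsay"):
        # Only run when wsay is actually installed and on PATH.
        subprocess.run(cmd, check=True)
    else:
        print("wsay not found on PATH; would run:", " ".join(cmd))
```

If the tags are still read aloud in file mode, the repository's own test file mentioned later in the thread is probably the better thing to try first.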

p-groarke commented 2 years ago

OK well, I updated the readme to specify that speech XML only works in text file mode. I also added a test you can run to troubleshoot: https://github.com/p-groarke/wsay/blob/master/tests/data/SAPI.txt

If this still doesn't work for you, please re-open this ticket. Though it may be out of my hands (might be a Windows / Microsoft issue).

Good day