Closed hamishcunningham closed 1 year ago
Nice! @stintel will be back soon and we'll take a deeper look!
We've decided to not accept this PR until after merging the feature/was branch.
I vaguely think this branch is now merged, so would it please be possible to get this functionality in? I would really like to be able to use Willow and WIS but have the "brains" be code that I write. So basically, if a user says "Hi ESP", the audio should get streamed to Whisper in WIS, which forwards the text to my REST endpoint. Then I would like to be able to reply with some text that gets converted to audio via TTS and played on the ESP32 Box. Is this possible? It would be awesome. Thank you very much.
> I vaguely think this branch is now merged, so would it please be possible to get this functionality in?
We indeed just merged the feature/was branch and tagged 0.1.0-rc.1. For now, we will focus solely on handling issues found in this release candidate, so new features will have to wait just a bit longer.
I think an improvement here would be the ability to have separate strings for text and speech, as sometimes I want a long response spoken but a short one shown. If the text response is missing, showing the speech string on the screen would probably be a reasonable default.
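To make the proposed default concrete, here is a minimal sketch of the fallback rule in Python. It only models the suggestion above; the function name `split_reply` is hypothetical, and the "text" and "speech" keys are the ones proposed in this thread, not a confirmed part of the firmware.

```python
from typing import Optional, Tuple


def split_reply(reply: dict) -> Tuple[Optional[str], Optional[str]]:
    """Return (speech, text) for a reply, per the proposal in this thread.

    Assumption: the reply is a JSON object that may carry a "speech" key
    (spoken via TTS) and a "text" key (shown on the display).
    """
    speech = reply.get("speech")
    # Proposed default: if "text" is missing or empty, show the speech string.
    text = reply.get("text") or speech
    return speech, text
```

With this rule, a reply containing only "speech" is both spoken and displayed, while a reply with both keys can speak a long answer and show a short one.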
We created a release/v0.1 branch, so we are accepting new features in main again. Can you fix the conflicts in the PR?
@stintel is it just formatting changes that are required here?
I could try to clone this PR to add the "text" and "speech" keys, as proposed, though my C++ isn't great.
> @stintel is it just formatting changes that are required here?
No. There are conflicts as shown on the bottom of this PR page.
Sorry to be slow; I'm pretty busy at present. I'll try to get back to this soon!
Ah that's no problem, @hamishcunningham, I was just wondering whether it existed and I'd missed it.
We have merged our own variant of this functionality and it is included in the current Willow release candidate.
@kristiankielhofner is there any documentation on this? I've been trying to find it in the docs but no luck so far.
Looking at the code, it seems you just need a speech element in the JSON reply containing the text to speak.
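As a rough illustration of that observation, the sketch below is a tiny REST endpoint built on Python's standard `http.server` that answers every POST with a JSON object containing a "speech" key. Everything apart from the "speech" key is an assumption: the handler and function names, the port, and in particular the request format (the body is simply treated as UTF-8 text here, which may not match what your Willow setup actually sends).

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_reply(command: str) -> bytes:
    """Build the JSON reply body; replace the echo with your own logic."""
    answer = f"You said: {command}"  # placeholder "brains"
    return json.dumps({"speech": answer}).encode("utf-8")


class CommandHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        # Assumption: the request body is plain UTF-8 text. Adjust the
        # parsing if your device posts JSON or another format.
        command = self.rfile.read(length).decode("utf-8", errors="replace")
        body = build_reply(command)
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the endpoint until interrupted (port 8080 is an arbitrary choice)."""
    HTTPServer(("0.0.0.0", port), CommandHandler).serve_forever()
```

Calling `serve()` starts the endpoint; `build_reply` is kept separate so the reply format can be checked without running a server.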
@nikito that works, thanks, but it would be good to have docs on how REST works in general. Also, I thought there were different elements for speaking and for displaying; is that not the case?
Enable TTS audio response from REST endpoints by checking for a message field in POST responses and passing it to the audio response fn_ok.
Add a configuration option to specify the maximum length of the text to send to TTS in this way.
Background: for Home Assistant the response can be read out on the ESP BOX. To enable external REST APIs to do that we need to add a little code to rest.c (following the way it is done in hass.c).
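The behaviour described in this PR can be modelled in a few lines of Python. This is only a sketch of the logic (check the POST response for a message field, cap its length, hand it to the audio response), not the C code in rest.c; the names `handle_rest_response`, `speak`, and `MAX_TTS_LEN`, and the default of 200 characters, are hypothetical. Note also that this PR's description uses a "message" field, while the variant that was eventually merged reportedly looks for "speech".

```python
import json
from typing import Callable, Optional

MAX_TTS_LEN = 200  # hypothetical default for the configurable maximum length


def handle_rest_response(
    body: str,
    speak: Callable[[str], None],
    max_len: int = MAX_TTS_LEN,
) -> Optional[str]:
    """Speak the "message" field of a REST response, if there is one.

    Returns the (possibly truncated) text passed to `speak`, or None when
    the body is not JSON or carries no usable "message" string.
    """
    try:
        data = json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    message = data.get("message") if isinstance(data, dict) else None
    if not isinstance(message, str) or not message:
        return None
    message = message[:max_len]  # enforce the configured maximum length
    speak(message)  # stands in for the firmware's audio response callback
    return message
```

Responses without a usable message fall through silently, so existing endpoints that return other payloads would behave as before.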