alarm clock problem with own stt server + how to speak?

SEPIA-Framework / sepia-docs

Documentation and Wiki for SEPIA. Please post your questions and bug-reports here in the issues section! Thank you :-)

https://sepia-framework.github.io/


alarm clock problem with own stt server + how to speak? #104

Closed royrogermcfreely closed 2 years ago

royrogermcfreely commented 3 years ago

hey,

when i use my own stt server and want to set an alarm from the browser, sepia understands what i'm saying but keeps asking me for the time. this doesn't happen when i use the native asr engine from my tablet.

timer works with native and custom stt server.

is it possible to publish something via mqtt when the timer/alarm is set/finished? i ask because i have a nice automation in Home Assistant where the light dims, the coffee maker starts and the radio turns on.

2.) how can i send text to the tts server with states from home assistant sensors?

fquirin commented 3 years ago

Hi @royrogermcfreely ,

I'm afraid this issue with the SEPIA STT server occurs because numbers are not transformed correctly in every case :-(. This issue has existed for a long time now, and I've decided to postpone the fix until the new audio library is implemented, which will hopefully be in a few weeks.

> 2.) how can i send text to the tts server with states from home assistant sensors?

Do you want SEPIA to speak the results or do you just want to use the TTS server? The first case is possible but a bit tricky because it uses the remote-action for audio playback. It will be easier in the next version. For the second case there is a REST-like API for TTS.

royrogermcfreely commented 3 years ago

ok, thanks for the info. the same problem occurs when i ask for the weather and it's, for example, -4°: SEPIA tells me "it's minus degree cold"

can you give me an example for, let's say, "good morning roy"? i'm not familiar with that. can i put it in one line to call?

fquirin commented 3 years ago

> when i ask for the weather and it's, for example, -4°: SEPIA tells me "it's minus degree cold"

Oh, which voice was that? The default "robotic" eSpeak voice of the DIY client or maybe some MaryTTS voice?

> can you give me an example for, let's say, "good morning roy"?

I'm assuming you want to use the Home Assistant RESTful Command for this? I have never tried this but here is a wild guess according to Home Assistant docs:

rest_command:
  my_request:
    url: http://[assist-server-host]/tts
    method: POST
    content_type: "application/x-www-form-urlencoded"
    payload: "text=Good%20morning%20Roy&lang=en&voice=default&gender=default&mood=5&client=home_assistant&GUUID=USER_ID&PWD=PASSWORD"

Make sure to replace [assist-server-host] with your SEPIA server IP and assistant port (e.g. 192.168.178.88:20721 or localhost:20721 or my-domain/sepia/assist etc.), USER_ID with something like uid1007, and PASSWORD with your password in clear text. Instead of GUUID and PWD you can later use KEY (a temporary token instead of the clear-text password!).

This should give you an answer containing a link to a WAV file. Let's try this first and then see how we can proceed ;-)
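To try the same request outside of Home Assistant, here is a minimal Python sketch that builds the form-encoded POST from the parameters listed above. It only uses the standard library and does not send anything by itself. The helper name `build_tts_request`, the host `localhost:20721` and the credentials are placeholders for illustration, not part of SEPIA.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_tts_request(host, text, user_id, password, client="home_assistant"):
    """Build (but do not send) a form-encoded POST request for the /tts endpoint."""
    fields = {
        "text": text,
        "lang": "en",
        "voice": "default",
        "gender": "default",
        "mood": 5,
        "client": client,
        "GUUID": user_id,
        "PWD": password,
    }
    # quote_via=quote encodes spaces as %20, like the payload in the YAML above
    body = urlencode(fields, quote_via=quote).encode("utf-8")
    return Request(
        url=f"http://{host}/tts",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder host and credentials (assumptions); replace with your own values.
request = build_tts_request("localhost:20721", "Good morning Roy", "uid1007", "PASSWORD")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing `request` to `urllib.request.urlopen` would actually send it. Because the text goes through `urlencode`, any sensor value containing spaces or special characters is escaped automatically.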

royrogermcfreely commented 3 years ago

i adapted your code to

  sepia_speak:
    url: http://192.168.0.16:20721/tts
    method: put
    headers:
      content_type: "application/x-www-form-urlencoded"
    payload: "text=Good%20morning%20Roy&lang=en&voice=default&gender=default&mood=5&client=a1&GUUID=uid1007&PWD=SUPERSAVE"

Does the client ID in the code have to match the client ID of the device that should speak? In Home Assistant i get this error:

Logger: homeassistant.components.rest_command
Source: components/rest_command/__init__.py:136
Integration: RESTful Command (documentation, issues)
First occurred: 7:43:53 PM (1 occurrences)
Last logged: 7:43:53 PM
Error. Url: http://192.168.0.16:20721/tts. Status code 404. Payload: b'text=Good%20morning%20Roy&lang=en&voice=default&gender=default&mood=5&client=a1&GUUID=uid1007&PWD=SUPERSAVE'

this doesn't tell me much, and i don't know what the "b" in front of the payload is about

maybe for the last steps i have to ask the Home Assistant community for help

> Oh, which voice was that? The default "robotic" eSpeak voice of the DIY client or maybe some MaryTTS voice?

don't know anymore since it's so hot^^ maybe i'll come back to that when it gets colder ;)

fquirin commented 3 years ago

> Does the client ID in the code have to match the client ID of the device that should speak?

In this case it is only related to the login and has basically no effect because you use GUUID and PWD anyway instead of KEY. If you use KEY the token is bound to the client ID used to authenticate and obtain the token.

> this doesn't tell me much, and i don't know what the "b" in front of the payload is about

I think that just indicates the string encoding or something. I bet it's a Python quirk :-p
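For reference, the `b'...'` prefix is how Python prints a `bytes` object, as opposed to a normal text string. The log presumably shows it because the request body is encoded to bytes before it is sent. A short sketch, using a shortened copy of the payload from the log:

```python
payload_text = "text=Good%20morning%20Roy&lang=en"

# Encoding a str gives a bytes object; its repr carries the b prefix
payload_bytes = payload_text.encode("utf-8")

print(repr(payload_text))
print(repr(payload_bytes))

# Decoding gives back exactly the same text, so nothing is lost or changed
print(payload_bytes.decode("utf-8") == payload_text)
```

So the `b` is only a display detail and not part of what the server receives.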

>   sepia_speak:
>     url: http://192.168.0.16:20721/tts
>     method: put

Here you used 'put' instead of 'post'. This is expected to fail ^^.

> don't know anymore since it's so hot^^ maybe i'll come back to that when it gets colder ;)

:grin: we'll try to remember for later ^^

royrogermcfreely commented 3 years ago

damn me ^^

so i can call the rest command without errors but no device speaks. there are 3 devices where i am logged in with the uid1007 account: tablet (mainly used), laptop (browser) and smartphone. does this matter?

here is the log:

2021-07-17 13:39:10 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event call_service[L]: domain=rest_command, service=sepia_speak, service_data=>
2021-07-17 13:39:10 DEBUG (MainThread) [homeassistant.components.rest_command] Success. Url: http://192.168.0.16:20721/tts. Status code: 200. Payload: b'text=Good%20morning%20Roy&lang=en&voice=default&gender=default&mood=5&client=a1&GUUID=uid1007&PWD=SUPERSAVE'
2021-07-17 13:39:10 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection] [139766837034816] Sending {"id": 22, "type": "result", "success": true, "result": {"context": {"id": "2351e1a3e5efb77568403291f38df354", "parent_id": null, "user_id": "342e2661444743dea3805f7d12c9db5b"}}}

fquirin commented 3 years ago

> so i can call the rest command without errors but no device speaks.

The command described above is the TTS server request that will create a URL to the voice audio file :grimacing: . I'm not sure if Home Assistant can use this in a second step to play audio :thinking: . Unfortunately the "broadcast" feature, which will target a specific SEPIA client device ID and play a certain text on that device, is not yet available but will be part of the upcoming version. It is possible to call another command (very similar to the first) to send the URL from step one to SEPIA and play it like a music stream. That works in theory but has the drawback that the audio will fade in (as is the default for music), which might cut off some of your text.
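For the first half of that two-step idea, the audio URL has to be pulled out of the TTS response. The thread does not show what that response looks like, so the sketch below assumes only that it is JSON and scans it for any string ending in `.wav`. The helper `find_wav_url`, the sample response and its field names are made up for illustration and are not SEPIA's actual format.

```python
import json


def find_wav_url(value):
    """Return the first string ending in .wav found anywhere in parsed JSON, else None."""
    if isinstance(value, str):
        return value if value.lower().endswith(".wav") else None
    if isinstance(value, dict):
        children = value.values()
    elif isinstance(value, list):
        children = value
    else:
        return None
    for child in children:
        found = find_wav_url(child)
        if found is not None:
            return found
    return None


# Hypothetical response body, invented for this example only.
sample_response = '{"some_status": "ok", "some_field": "http://localhost:20721/example.wav"}'
print(find_wav_url(json.loads(sample_response)))
```

Once the real response is known, reading the URL from its actual field is simpler and more robust than scanning.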