Romkabouter / ESP32-Rhasspy-Satellite

This repo implements a standalone ESP32 MQTT audio streamer. It is designed to work as a satellite for Rhasspy (https://rhasspy.readthedocs.io/en/latest/). It supports multiple devices
GNU General Public License v3.0
363 stars, 64 forks

M5 Atom Echo working as mic, not as speaker #124

Closed: wimg closed this issue 1 year ago

wimg commented 1 year ago

I successfully enabled the M5 Atom Echo as a satellite microphone and this works fine, but it's not working as a speaker. For some reason the system seems to time out on it :

[DEBUG:2023-02-17 00:11:52,991] rhasspyserver_hermes: TTS timeout will be 30 second(s)
[DEBUG:2023-02-17 00:11:52,992] rhasspyserver_hermes: -> TtsSay(text='11.68 degrees', site_id='default', lang=None, id='616830af-7439-41f5-9a5d-f33d444bacc1', session_id='', volume=1.0)
[DEBUG:2023-02-17 00:11:52,992] rhasspyserver_hermes: Publishing 138 bytes(s) to hermes/tts/say
[DEBUG:2023-02-17 00:11:52,994] rhasspytts_cli_hermes: <- TtsSay(text='11.68 degrees', site_id='default', lang=None, id='616830af-7439-41f5-9a5d-f33d444bacc1', session_id='', volume=1.0)
[DEBUG:2023-02-17 00:11:52,995] rhasspytts_cli_hermes: ['nanotts', '-v', 'en-US', '-o', '/tmp/tmp1h_r6km2.wav']
Using Lingware directory: /usr/lib/rhasspy/.venv/lib/nanotts/pico/lang
read: 13 bytes from stdin
using lang: en-US
wrote "/tmp/tmp1h_r6km2.wav" (66732 bytes)
[DEBUG:2023-02-17 00:11:53,029] rhasspytts_cli_hermes: Got 66732 byte(s) of WAV data
[DEBUG:2023-02-17 00:11:53,029] rhasspytts_cli_hermes: -> AudioPlayBytes(66732 byte(s)) to hermes/audioServer/default/playBytes/616830af-7439-41f5-9a5d-f33d444bacc1
[DEBUG:2023-02-17 00:11:53,030] rhasspytts_cli_hermes: Waiting for play finished (timeout=2.334)
[DEBUG:2023-02-17 00:11:53,031] rhasspyserver_hermes: Handling AudioPlayBytes (topic=hermes/audioServer/default/playBytes/616830af-7439-41f5-9a5d-f33d444bacc1, id=b1e655fc-aed7-4e13-a392-6bdeaa62871f)
[WARNING:2023-02-17 00:11:55,366] rhasspytts_cli_hermes: Did not receive playFinished before timeout
[DEBUG:2023-02-17 00:11:55,369] rhasspytts_cli_hermes: -> TtsSayFinished(site_id='default', id='616830af-7439-41f5-9a5d-f33d444bacc1', session_id='')
[DEBUG:2023-02-17 00:11:55,369] rhasspytts_cli_hermes: Publishing 84 bytes(s) to hermes/tts/sayFinished
[DEBUG:2023-02-17 00:11:55,373] rhasspyserver_hermes: Handling TtsSayFinished (topic=hermes/tts/sayFinished, id=b1e655fc-aed7-4e13-a392-6bdeaa62871f)
[DEBUG:2023-02-17 00:11:55,373] rhasspydialogue_hermes: <- TtsSayFinished(site_id='default', id='616830af-7439-41f5-9a5d-f33d444bacc1', session_id='')
[ERROR:2023-02-17 00:12:19,928] rhasspydialogue_hermes: Session timed out for site satellite1: satellite1-jarvis_linux-b99cd2fb-bd39-400c-9745-4468bdf2d11e
[DEBUG:2023-02-17 00:12:19,929] rhasspydialogue_hermes: -> AsrStopListening(site_id='satellite1', session_id='satellite1-jarvis_linux-b99cd2fb-bd39-400c-9745-4468bdf2d11e')
[DEBUG:2023-02-17 00:12:19,929] rhasspydialogue_hermes: Publishing 101 bytes(s) to hermes/asr/stopListening
[DEBUG:2023-02-17 00:12:19,932] rhasspydialogue_hermes: -> DialogueSessionEnded(termination=DialogueSessionTermination(reason=<DialogueSessionTerminationReason.TIMEOUT: 'timeout'>), session_id='satellite1-jarvis_linux-b99cd2fb-bd39-400c-9745-4468bdf2d11e', site_id='satellite1', custom_data='jarvis_linux')
[DEBUG:2023-02-17 00:12:19,933] rhasspydialogue_hermes: Publishing 169 bytes(s) to hermes/dialogueManager/sessionEnded

My rhasspy config :
{
    "dialogue": {
        "satellite_site_ids": "satellite1",
        "system": "rhasspy"
    },
    "handle": {
        "satellite_site_ids": "satellite1",
        "system": "hass"
    },
    "home_assistant": {
        "access_token": "xxxxxx",
        "handle_type": "event",
        "url": "http://192.168.0.xxx:8123/"
    },
    "intent": {
        "satellite_site_ids": "satellite1",
        "system": "fsticuffs"
    },
    "microphone": {
        "system": "hermes"
    },
    "mqtt": {
        "enabled": "true",
        "host": "192.168.0.5"
    },
    "sounds": {
        "system": "hermes"
    },
    "speech_to_text": {
        "satellite_site_ids": "satellite1",
        "system": "kaldi"
    },
    "text_to_speech": {
        "satellite_site_ids": "satellite1",
        "system": "nanotts"
    },
    "wake": {
        "porcupine": {
            "keyword_path": "jarvis_linux.ppn"
        },
        "satellite_site_ids": "satellite1",
        "system": "porcupine"
    }
}

My M5 Atom Echo settings.ini :
[General]
hostname=192.168.0.9
deployhost=192.168.0.9
siteId=satellite1
device_type=0
network_type=0

[MQTT]
hostname=192.168.0.5
port=1883

What am I doing wrong ?

Romkabouter commented 1 year ago

I see the M5 is called satellite1, but all the messages are for "default" which is probably your Rhasspy server.

Like this first one: [DEBUG:2023-02-17 00:11:52,992] rhasspyserver_hermes: -> TtsSay(text='11.68 degrees', site_id='default', lang=None, id='616830af-7439-41f5-9a5d-f33d444bacc1', session_id='', volume=1.0)

Do you type something in the web UI of your server? That does not work for a satellite, unless you give the satellite and the server the same siteId, which ideally you do not want.
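To make the mismatch described above easier to spot, here is a minimal sketch that pulls the `site_id` out of Hermes-style log lines. The three sample lines are copied from the log in this issue; the regular expression and the function name `site_ids_by_message` are illustrative assumptions, not part of Rhasspy.

```python
import re
from collections import Counter

# Three lines copied from the log posted in this issue.
LOG = """\
[DEBUG:2023-02-17 00:11:52,992] rhasspyserver_hermes: -> TtsSay(text='11.68 degrees', site_id='default', lang=None, id='616830af-7439-41f5-9a5d-f33d444bacc1', session_id='', volume=1.0)
[DEBUG:2023-02-17 00:11:55,369] rhasspytts_cli_hermes: -> TtsSayFinished(site_id='default', id='616830af-7439-41f5-9a5d-f33d444bacc1', session_id='')
[DEBUG:2023-02-17 00:12:19,929] rhasspydialogue_hermes: -> AsrStopListening(site_id='satellite1', session_id='satellite1-jarvis_linux-b99cd2fb-bd39-400c-9745-4468bdf2d11e')
"""

# Matches "-> Name(... site_id='value'" or "<- Name(... site_id='value'".
# This pattern is an assumption based on the log format shown in this thread.
PATTERN = re.compile(r"(?:->|<-) (\w+)\(.*?site_id='([^']*)'")


def site_ids_by_message(log_text):
    """Count (message type, site_id) pairs found in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    for (message, site_id), count in sorted(site_ids_by_message(LOG).items()):
        print(f"{message:20s} site_id={site_id:12s} x{count}")
```

In this sample the TTS messages carry `default` while the session belongs to `satellite1`, which is the mismatch pointed out above.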

wimg commented 1 year ago

The satellite is called satellite1, the server is called default. This is what the config looks like : [image]

To be clear : this specific reply came from Home Assistant. I'm asking HA for the temperature and it does a callback to /api/text-to-speech, but I guess that's where the problem is : how does Rhasspy know which satellite to send it to ? Can I specify that in the text-to-speech API call ?

I was basing everything on this, but I think something is missing : [image]

Romkabouter commented 1 year ago

I'm asking HA for the temperature and it does a callback to /api/text-to-speech, but I guess that's where the problem is : how does Rhasspy know which satellite to send it to ? Can I specify that in the text-to-speech API call ?

How do you handle this in HA? Because this is the issue, it does a TtsSay to your server, not the satellite. You should add ?siteId=satellite1 to your API call.
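For anyone calling the API outside Home Assistant, here is a minimal sketch of a text-to-speech request that carries the `siteId` query parameter, using only the Python standard library. The host `rhasspy.local` is a placeholder, port 12101 is assumed to be the Rhasspy web server port, and the helper names `build_tts_request` and `speak` are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_tts_request(base_url, text, site_id):
    """Build a POST to /api/text-to-speech aimed at one site (satellite)."""
    query = urlencode({"siteId": site_id})
    url = f"{base_url.rstrip('/')}/api/text-to-speech?{query}"
    return Request(
        url,
        data=text.encode("utf-8"),  # the text to speak is sent as the plain-text body
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def speak(base_url, text, site_id, timeout=10):
    """Send the request and return the HTTP status code (needs a reachable server)."""
    with urlopen(build_tts_request(base_url, text, site_id), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Placeholder host; replace with the address of your own Rhasspy server.
    request = build_tts_request("http://rhasspy.local:12101", "11.68 degrees", "satellite1")
    print(request.get_method(), request.full_url)
```

Without the `siteId` parameter, the request is handled for the server's own site, which matches the `site_id='default'` seen in the log.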

Maybe it is even better to switch to MQTT, with help from this wiki: https://github.com/rhasspy/rhasspy/wiki

wimg commented 1 year ago

I did figure it out. I will add my solution. It does indeed involve adding ?siteId=satellite1 to the API call, i.e. /api/text-to-speech?siteId=satellite1

How to do this in Home Assistant is a little tricky, since it means you need to build the URL dynamically. So here goes : First add this to your configuration.yaml :
input_text:
  rest_rhasspy_satellite_id:
    name: id of the satellite to send to

Next change your rest_command to :
rhasspy_speak:
  url: "http://192.168.0.x:12101/api/text-to-speech?siteId={{ states('input_text.rest_rhasspy_satellite_id') }}"
  method: 'POST'
  payload: '{{payload}}'
  content_type: text/plain

Finally, in the automation you're using, for example this one to retrieve the temperature :

wimg commented 1 year ago

One quick note though : if you have lots of satellites and multiple commands are uttered on several of them at the same time, you might end up with a race condition here, causing the response to a command on satellite X to go to satellite Y ;-)
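The race can be illustrated with a small, deterministic simulation. This is only a sketch of the idea: the event tuples, the satellite names and both functions are hypothetical and do not model Home Assistant itself. One shared slot (like the single input_text helper) is overwritten by each command, while the alternative carries the target along with each request.

```python
def respond_via_shared_slot(events):
    """Replay events against one shared slot that every command overwrites."""
    slot = None
    deliveries = []
    for kind, origin in events:
        if kind == "command":
            slot = origin  # the latest command wins
        elif kind == "response":
            # The response goes to whatever the slot holds right now.
            deliveries.append((origin, slot))
    return deliveries


def respond_per_request(events):
    """Each response carries its own origin, so nothing is shared."""
    return [(origin, origin) for kind, origin in events if kind == "response"]


# Two commands arrive before either response is sent.
EVENTS = [
    ("command", "satelliteX"),
    ("command", "satelliteY"),
    ("response", "satelliteX"),
    ("response", "satelliteY"),
]

if __name__ == "__main__":
    for origin, target in respond_via_shared_slot(EVENTS):
        print(f"shared slot: reply for {origin} delivered to {target}")
    for origin, target in respond_per_request(EVENTS):
        print(f"per request: reply for {origin} delivered to {target}")
```

With the shared slot, the reply meant for satelliteX ends up on satelliteY as soon as the two commands overlap.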

Romkabouter commented 1 year ago

second note: your session will not end properly this way.

wimg commented 1 year ago

Hmmm thanks for mentioning that, because now I discovered this also works :

service: mqtt.publish
data:
  topic: hermes/dialogueManager/endSession
  payload_template: >-
    {"sessionId": "{{ trigger.event.data._intent.sessionId }}", "text": "{{ states.sensor.temphum_sensor_outside_temperature.state }} degrees"}

(or anything else that needs returning of course)
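The same message can be sent from a script. Below is a minimal sketch that builds the endSession payload with the two fields used above (`sessionId` and `text`). The function names are hypothetical, and publishing assumes the third-party paho-mqtt package is installed; it is imported lazily so the payload builder works without it.

```python
import json

END_SESSION_TOPIC = "hermes/dialogueManager/endSession"


def build_end_session(session_id, text):
    """Return (topic, JSON payload) that ends a session and speaks `text`."""
    payload = json.dumps({"sessionId": session_id, "text": text})
    return END_SESSION_TOPIC, payload


def publish_end_session(session_id, text, host, port=1883):
    """Publish the message to the MQTT broker (requires paho-mqtt)."""
    # Imported here so build_end_session() can be used without paho-mqtt.
    import paho.mqtt.publish as publish

    topic, payload = build_end_session(session_id, text)
    publish.single(topic, payload, hostname=host, port=port)


if __name__ == "__main__":
    # Session id taken from the log in this issue; a real one comes from the intent event.
    topic, payload = build_end_session(
        "satellite1-jarvis_linux-b99cd2fb-bd39-400c-9745-4468bdf2d11e",
        "11.68 degrees",
    )
    print(topic)
    print(payload)
```

Because the session id identifies the session on the satellite, this route needs no separate siteId bookkeeping.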

Romkabouter commented 1 year ago

yup, that is why I mentioned the MQTT and pointed to the wiki

wimg commented 1 year ago

Thanks, I guess I picked up the wrong thing.