LAB02-Research / HASS.Agent

Windows-based client for Home Assistant. Provides notifications, quick actions, commands, sensors and more.
https://hassagent.lab02-research.org
MIT License
1.54k stars, 67 forks

Feature: Rhasspy integration #255

Open cvladan opened 1 year ago

cvladan commented 1 year ago

Thank you ...

I just really had an urge to praise the author because I think this is a brilliantly implemented tool based on a wonderful idea. I see that it hasn't quite "raised dust" among users yet, but I'm certain there will be many users in the coming years. It takes time for people to understand what this is all about.

I had to write this because I fear that the author will give up on improving this tool, which is becoming increasingly essential to me.

... and a suggestion

So that neither the system nor the author mistakes this for a generic message or spam, I will also offer a suggestion for improvement that may be useful to others.

Namely, I am involved with the Rhasspy project, which has become the official voice assistant for Home Assistant. I think HASS.Agent could gain microphone support: pressing a hotkey (mandatory) would start recording a voice message, which would then be sent to the MQTT server. That voice message should be sent in such a way that Rhasspy correctly interprets it as a command (Rhasspy also uses MQTT messaging for everything).

This would probably be my preferred method of voice controlling various home devices, because while I am working on my computer, I don't want to have to shout. Very often, if I am listening to music or the radio or the TV is on, Nest Mini doesn't hear me very well. This way, I could give commands by speaking into the computer I am working on.

The opposite direction, or Text-to-Speech to PC device, is already implemented, so this direction does not need to be improved.

So, just consider this option. Thanks

cvladan commented 1 year ago

There is another reason for this to be implemented. Namely, my Raspberry Pi is located in the basement, so connecting a microphone to RPi is not an option. The next best way to implement a microphone is very complicated and, in some versions, very expensive.

Me specifically, I've already ordered an M5Stack ATOM Lite ESP32 board, which has an integrated microphone, but I think most users will not want to deal with such complicated installations and I have also read that there are problems even with that.

In fact, HASS.Agent is an ideal candidate to be the "official" Smart microphone for the entire HA project, and in any case, this will greatly increase user awareness of this project.

That's all from me ;)

UPEngineer commented 1 year ago

If your request is doable, that would be freaking awesome to be able to implement. I think there are a lot of other possibilities for this integration that have yet to be discovered!

LAB02-Admin commented 1 year ago

Hi @cvladan, for starters, thank you for your very kind words! It's true that this project takes a lot of time and energy, so this helps keep me motivated ☺️

And I love your suggestion and am very much willing to explore this. I've worked with speech in the past: I wrote a small tool that recorded text and had it parsed by Microsoft's LUIS (language interpretation), back before I started with HA.

Could you point me to documentation/info on the format etc. in which Rhasspy expects the recording to be delivered?

Small sidenote: this is the integration's repo, I'll move it to HASS.Agent's repo :)

[hassagent-201] -> you can ignore this, please keep responding in this topic, youtrack is for myself so I don't lose track of what I'm supposed to do 😁

cvladan commented 1 year ago

> Small sidenote: this is the integration's repo, I'll move it to HASS.Agent's repo :)
>
> [hassagent-201] -> you can ignore this, please keep responding in this topic, youtrack is for myself so I don't lose track of what I'm supposed to do 😁

Haha 😄 I must have been quite exhausted when I mistakenly wrote in the wrong repository.

cvladan commented 1 year ago

A lot of documentation is already there at Rhasspy docs.

We actually don't need wake-words, as it should be initiated via hotkey.

LAB02-Admin commented 1 year ago

Thanks, looks like I'll have to implement Snips' Hermes protocol. Luckily, it doesn't seem too hard :)
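For readers following along, here is a minimal Python sketch of what a hotkey-triggered voice command might look like as Hermes-style MQTT messages. It is not HASS.Agent's implementation. The topic names and payload fields reflect my reading of the Rhasspy/Hermes documentation and should be verified there; the site id `hass-agent-pc`, the helper names, and the 16 kHz 16-bit mono format are assumptions chosen for illustration. The sketch only builds `(topic, payload)` pairs and does not talk to a broker.

```python
import io
import json
import wave

SITE_ID = "hass-agent-pc"  # hypothetical site id for the Windows PC


def wav_chunk(pcm: bytes, rate: int = 16000, width: int = 2, channels: int = 1) -> bytes:
    """Wrap raw PCM samples in a small standalone WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(width)
        w.setframerate(rate)
        w.writeframes(pcm)
    return buf.getvalue()


def start_session_message(site_id: str):
    """Ask the dialogue manager to start a session (stands in for the wake word)."""
    topic = "hermes/dialogueManager/startSession"  # assumed from the Hermes docs
    payload = json.dumps({"siteId": site_id, "init": {"type": "action"}})
    return topic, payload.encode("utf-8")


def audio_frame_messages(pcm: bytes, site_id: str, chunk_bytes: int = 4096):
    """Split a recording into WAV chunks, one MQTT message per chunk."""
    topic = f"hermes/audioServer/{site_id}/audioFrame"  # assumed from the Hermes docs
    for start in range(0, len(pcm), chunk_bytes):
        yield topic, wav_chunk(pcm[start:start + chunk_bytes])


def hotkey_messages(pcm: bytes, site_id: str = SITE_ID):
    """Messages to publish, in order, after the hotkey recording finishes."""
    return [start_session_message(site_id), *audio_frame_messages(pcm, site_id)]
```

Each pair could then be handed to whatever MQTT client is in use, for example `client.publish(topic, payload)` with paho-mqtt. In a real client the audio frames would more likely be streamed while the hotkey is held rather than sent after the recording ends.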

donburch888 commented 1 year ago

I have been using Rhasspy for about a year myself, and adding cheap microphones around the house is definitely a challenge (especially with RasPi shortages). The more options the better.

> Namely, my Raspberry Pi is located in the basement, so connecting a microphone to RPi is not an option. The next best way to implement a microphone is very complicated and, in some versions, very expensive.
>
> Me specifically, I've already ordered an M5Stack ATOM Lite ESP32 board, which has an integrated microphone, but I think most users will not want to deal with such complicated installations and I have also read that there are problems even with that.

cvladan, I think you should be considering a "Base + Satellite" configuration of Rhasspy. It actually works well, but the official documentation does not describe it well. The "Base" machine runs Rhasspy configured to provide the more CPU-intensive services to the "Satellites". It therefore needs no microphone and can happily sit in the basement on the same machine as Home Assistant. The Satellite machines also run Rhasspy, but with only the audio in, audio out, and MQTT modules enabled; all other services are provided by the Base.

It is helpful if wakeword detection can also run on the satellite. Otherwise (and I think this is where the ESP satellite currently stands) each satellite sends a constant stream of audio packets to the Base, which listens for the wakeword. Mind you, this is still better than sending all the audio packets over the LAN and then out your internet connection to Google's or Amazon's cloud servers ;-)

As for hardware requirements: I have been happily running HAOS with the Rhasspy add-on configured as the base station on my RasPi 4, and I have RasPi models (Zero, 3A+ and 3B) with reSpeaker HATs or a simple USB microphone running Rhasspy as satellites. Sure, the RasPi Zero is noticeably slower to respond, but RasPis used to be readily available and cheap enough to test the waters with.