Home Assistant custom component that allows you to turn almost any camera and almost any speaker into a local voice assistant.
Component will use:
Assist pipeline can use:
Video instruction from fixtSE
HACS > Integrations > 3 dots (top right corner) > Custom repositories > URL: AlexxIT/StreamAssist, Category: Integration > Add > wait > Stream Assist > Install
Or manually copy the stream_assist folder from the latest release to the /config/custom_components folder.
You can select either a camera entity_id or a stream URL as the audio (MIC) source.
You can select a Voice Assistant Pipeline for the recognition process: WAKE => STT => NLP => TTS. By default the component uses the default pipeline. You can create several Pipelines with different settings, and several Stream Assist components with different settings.
You can select one or multiple Media players (SND) to output the audio response. If your camera supports two-way audio, you can use the WebRTC Camera custom integration to add it as a Media player.
You can set STT start media to play a "beep" after WAKE detection (ex: media-source://media_source/local/beep.mp3).
The component has a MIC switch and multiple sensors: WAKE, STT, INTENT, TTS. There may be fewer sensors, depending on the Pipeline settings.
The sensor attributes contain a lot of useful information about the results of each step of the assistant.
You can also view the pipeline run history in the Home Assistant interface.
You can run the pipeline as a service. Almost all settings are optional, but they allow customisations that are not possible in Home Assistant by default.
service: stream_assist.run
data:
  stream_source: rtsp://...
  camera_entity_id: camera.xxx
  player_entity_id: media_player.xxx
  stt_start_media: media-source://media_source/local/beep.mp3
  pipeline_id: abcdefg...
  assist:
    start_stage: wake_word  # wake_word, stt, intent, tts
    end_stage: tts
    pipeline:
      conversation_language: en
      conversation_engine: homeassistant
      language: en
      name: Home Assistant
      stt_engine: stt.faster_whisper
      stt_language: en
      tts_engine: tts.google_en_com
      tts_language: en
      tts_voice: None
      wake_word_entity: wake_word.openwakeword
      wake_word_id: None
    wake_word_settings: { timeout: 5 }
    audio_settings:
      noise_suppression_level: None
      auto_gain_dbfs: None
      volume_multiplier: None
    conversation_id: None
    device_id: None
    intent_input: None
    tts_audio_output: None  # None, wav, mp3
    tts_input: None
  stream:
    file: ...
    options: {}
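Because almost all settings are optional, a real call can be much shorter than the full list of options. The following is a minimal sketch, not a tested configuration: the entity IDs camera.hallway and media_player.kitchen are hypothetical placeholders, and placing start_stage and end_stage under an assist key is assumed from the option list. It skips wake word detection and starts listening at the STT stage right away:

```yaml
# Minimal sketch; the entity IDs below are hypothetical, replace them with your own.
service: stream_assist.run
data:
  camera_entity_id: camera.hallway        # audio (MIC) source
  player_entity_id: media_player.kitchen  # audio response (SND) output
  stt_start_media: media-source://media_source/local/beep.mp3  # "beep" before listening
  assist:
    start_stage: stt  # skip WAKE and go straight to speech recognition
    end_stage: tts    # run through to the spoken response
```

A call like this could be useful in an automation or script where another event (for example a button press) takes the place of the wake word.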
Recommended settings for Whisper: model small-int8 or medium-int8, beam size 5.
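If you run Whisper as the Home Assistant add-on, these recommendations might look as follows in the add-on's YAML configuration view. This is a sketch under two assumptions: that the value 5 above refers to the add-on's beam_size option, and that English is the language you want. Adjust both to your setup.

```yaml
# Whisper add-on configuration (sketch; beam_size and language are assumptions)
model: small-int8  # or medium-int8 for better accuracy at a higher CPU cost
beam_size: 5       # assumed meaning of the recommended "5"
language: en       # assumed; set to your spoken language
```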
You can add a remote Whisper/Piper installation running on another server.
You can use the Google Translate integration instead of Piper; it supports many languages for TTS.
If your environment does not allow you to install add-ons, you can install the Faster Whisper custom integration for local STT.