dermotduffy / frigate-hass-card

A Lovelace card for Frigate in Home Assistant
MIT License

Tie microphone button to a specific stream & only open stream when activated #1445

Open esand opened 1 month ago

esand commented 1 month ago

This is a feature request to enhance how 2-way audio and the microphone would work using the frigate-card.

What I am suggesting would be that if a user enables the microphone button in the menu, that it also asks for the sub-stream specific to using the 2-way audio. It could be the same stream as the camera view, or it could be a separate stream (allow for whatever configuration options are necessary such as go2rtc stream names, etc.).

Having the microphone button tied to a specific stream would allow that stream to be audio-only and could be opened/closed only when the microphone button is activated. Hopefully it's possible such that the card requests microphone access from the browser/client but doesn't have to attach the 2-way audio stream at that time. When the mic button is activated though, it would dynamically bind the 2-way audio stream, and unbind it once the mic button is deactivated. This would allow you to request microphone access as soon as the view is opened, but only open any backchannel audio streams when the mic feature is needed.

This feature is something I mentioned in #1235 and I believe may be a viable solution to a number of little issues people are having with setting up 2-way audio in an easy to manage/use manner in Home Assistant.

Criticism welcome if there are reasons this wouldn't work in all scenarios.

dermotduffy commented 1 month ago

> What I am suggesting would be that if a user enables the microphone button in the menu, that it also asks for the sub-stream specific to using the 2-way audio.

Just to confirm, you are suggesting a configurable microphone stream, that is only used when the microphone connection is activated?

> Having the microphone button tied to a specific stream would allow that stream to be audio-only and could be opened/closed only when the microphone button is activated. Hopefully it's possible such that the card requests microphone access from the browser/client but doesn't have to attach the 2-way audio stream at that time. When the mic button is activated though, it would dynamically bind the 2-way audio stream, and unbind it once the mic button is deactivated. This would allow you to request microphone access as soon as the view is opened, but only open any backchannel audio streams when the mic feature is needed.

Interesting idea! I do plan to revisit 2-way audio in general, since some people are struggling with it -- but didn't quite plan what to do yet. What you're describing sounds similar to always_connected as it exists today, in that the microphone stream is just always connected to the microphone. When the button is pressed, it's unmuted, when it's let go it's re-muted. How is what is described better than what already exists?

esand commented 1 month ago

> Just to confirm, you are suggesting a configurable microphone stream, that is only used when the microphone connection is activated?

Yes, since a number of people have issues where opening the backchannel somehow blocks features on the device. In the case of doorbells, which seem to be the most common, some models block the button press from working while the backchannel is active/connected.

> Interesting idea! I do plan to revisit 2-way audio in general, since some people are struggling with it -- but didn't quite plan what to do yet. What you're describing sounds similar to always_connected as it exists today, in that the microphone stream is just always connected to the microphone. When the button is pressed, it's unmuted, when it's let go it's re-muted. How is what is described better than what already exists?

Ok, hopefully I can explain my thought process a bit better...

With a dedicated microphone stream configured, you know what stream to use (open/connect to) when 2-way audio is required. For cases where the 2-way audio needs to be disconnected, it's easy to figure out that you simply close that stream, but not the main camera stream used for viewing (and possibly regular play-back audio).

Currently, you can only configure a single stream for a camera, which in the case of requiring 2-way audio means that the 2-way audio stream is being consumed at all times when that camera is in view - there's no way to drop the 2-way audio but keep the video feed. This is the first part of the issue that my proposal is hoping to improve.
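To make the first part concrete, here is a rough sketch of what a per-camera configuration with a separate microphone stream might look like. The `microphone_stream` key, the `CameraStreams` type and the stream names are all hypothetical; none of them are existing card options.

```typescript
// Hypothetical config shape: `microphone_stream` is NOT an existing card option.
interface CameraStreams {
  // Stream used for viewing (video plus regular playback audio).
  live_stream: string;
  // Optional stream used only for 2-way audio; it may be audio-only.
  microphone_stream?: string;
}

// Pick the stream to open when the mic button is activated.
// Falls back to the viewing stream, which mirrors a single-stream setup.
function resolveMicrophoneStream(streams: CameraStreams): string {
  return streams.microphone_stream ?? streams.live_stream;
}

const doorbell: CameraStreams = {
  live_stream: "doorbell_video",
  microphone_stream: "doorbell_talk",
};
const driveway: CameraStreams = { live_stream: "driveway" };

console.log(resolveMicrophoneStream(doorbell)); // doorbell_talk
console.log(resolveMicrophoneStream(driveway)); // driveway
```

With a fallback like this, existing single-stream configurations would keep working unchanged, and only cameras that need a dedicated backchannel would set the extra key.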

The second half is how and when 2-way audio is used, and that touches on the always_connected feature. Presently, in order for 2-way audio to work, you need two things - the microphone permission in the client browser/app, and the stream to connect to. Having always_connected: true is definitely easier to use since the mic button doesn't require extra presses and works as you'd expect it to work - but it currently means the 2-way stream is connected right away, even if you don't plan to speak; as soon as the camera video feed is opened, so is the 2-way audio. This is where the issue comes in with common doorbells; as soon as there's a consumer for the backchannel, some features stop working on the device.

My proposal hinges on the possibility of establishing microphone access in the client browser/app without needing to open the 2-way stream. If this is possible, you could effectively hard code always_connected: true, since if a user has enabled the mic in the menu (with a stream as per part one of my idea), they obviously want the ability to speak - but you just don't know if/when yet.

Once the mic button is activated, you would establish the connection between the microphone and the 2-way stream. Since the person has activated the mic, they want to speak - so let them. As soon as they are done, they deactivate the mic button and you close the stream connection. This releases the consumer and brings the device back to normal operation.
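The open-on-press, close-on-release lifecycle could be sketched roughly as below. This is only an illustration under assumptions: `OnDemandMicrophone`, `Backchannel` and the injected `openBackchannel` callback are made-up names, not the card's API, and a real implementation would be asynchronous (browser microphone access and WebRTC connections both are).

```typescript
// Hypothetical sketch: these names are illustrative, not the card's API.
interface Backchannel {
  close(): void;
}

class OnDemandMicrophone {
  private hasPermission = false;
  private backchannel: Backchannel | null = null;
  private readonly openBackchannel: () => Backchannel;

  // openBackchannel stands in for whatever connects to the 2-way stream.
  constructor(openBackchannel: () => Backchannel) {
    this.openBackchannel = openBackchannel;
  }

  // View opened: record that the browser granted microphone access.
  // No stream is opened here.
  permissionGranted(): void {
    this.hasPermission = true;
  }

  // Mic button activated: open the backchannel if it is not already open.
  activate(): boolean {
    if (!this.hasPermission) {
      return false;
    }
    if (this.backchannel === null) {
      this.backchannel = this.openBackchannel();
    }
    return true;
  }

  // Mic button deactivated: close the stream (not just mute it) so the
  // device no longer has a backchannel consumer.
  deactivate(): void {
    if (this.backchannel !== null) {
      this.backchannel.close();
      this.backchannel = null;
    }
  }

  isBackchannelOpen(): boolean {
    return this.backchannel !== null;
  }
}

const events: string[] = [];
const mic = new OnDemandMicrophone(() => {
  events.push("open");
  return { close: () => { events.push("close"); } };
});
mic.permissionGranted(); // view opened, nothing connected yet
mic.activate();          // user activates the mic button
mic.deactivate();        // user deactivates it
console.log(events.join(",")); // open,close
```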

Since the 2-way stream would only be consumed (not just muted!) ad-hoc when the mic is going to be used, it would limit functionality issues on these devices to only when the person is trying to communicate through 2-way audio. Also, since you'd know what specific 2-way stream to use independently of the video, you could disconnect the 2-way audio feature (for timeouts or whatever purpose) without any impact to the video stream for the device which could remain active.

I think this would allow people to set up "doorbell" views in Home Assistant where they can have the view active, watching the video, and then only when necessary, activate 2-way audio to speak to people. As soon as they're done talking, doorbell presses would work again and everything would be fine; it all hinges on releasing the backchannel consumer - nothing to do with microphone access and muting.

Does that explain it better?

esand commented 1 month ago

A final bit of extra info...

Presently, setting up a doorbell view in HA requires 2 views. The first uses a video-only stream with no microphone access, and is used just to watch the video feed in case someone is approaching.

You would then need a secondary view that enables microphone and binds to the 2-way audio backchannel so that you can use the microphone to communicate with someone - but you cannot activate this view if the approaching person has not yet pressed the doorbell and is about to.

If the approaching person needs to ring the doorbell - there cannot be any backchannel consumers. As soon as there's something consuming that 2-way audio stream, doorbell presses do not work and the doorbell goes into a communications mode that limits functionality.
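A toy model of the device behaviour described above may help make the problem concrete. `DoorbellModel` is purely illustrative and does not correspond to any real device API; it only encodes the rule that presses are ignored while a backchannel consumer exists.

```typescript
// Toy model of the behaviour described above; not a real device API.
class DoorbellModel {
  private consumers = 0;
  private rings = 0;

  connectBackchannel(): void {
    this.consumers += 1;
  }

  disconnectBackchannel(): void {
    this.consumers = Math.max(0, this.consumers - 1);
  }

  // Returns true if the press registered (the doorbell rang).
  pressButton(): boolean {
    if (this.consumers > 0) {
      return false; // communications mode: the press is ignored
    }
    this.rings += 1;
    return true;
  }

  ringCount(): number {
    return this.rings;
  }
}

const bell = new DoorbellModel();
console.log(bell.pressButton()); // true
bell.connectBackchannel();       // a muted-but-connected mic counts as a consumer
console.log(bell.pressButton()); // false
bell.disconnectBackchannel();
console.log(bell.pressButton()); // true
console.log(bell.ringCount());   // 2
```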

You have to wait for the person to ring the doorbell, and only then can you go to the secondary 2-way audio view, which lets you use the microphone to communicate with them. When you are done, you need to close that view (and/or whatever else is necessary) to disconnect the backchannel consumer so that the doorbell reverts back to normal operation, allowing other doorbell presses. Failure to do so keeps the backchannel open, blocking all other doorbell presses from even registering. So you can see the issue here... if nothing times out that backchannel connection, you could inadvertently keep it open for days and people trying to press the doorbell to alert you of their presence would have no effect at all (the doorbell doesn't ring).
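One way to guard against a backchannel being left open indefinitely is a simple timeout. The sketch below is an assumption-laden illustration: `BackchannelWatchdog`, its method names and the 60 second limit are hypothetical, and time is passed in explicitly so the logic is easy to follow and test.

```typescript
// Hypothetical sketch: closes a backchannel that has been open too long.
class BackchannelWatchdog {
  private openedAtMs: number | null = null;
  private readonly maxOpenMs: number;
  private readonly closeBackchannel: () => void;

  constructor(maxOpenMs: number, closeBackchannel: () => void) {
    this.maxOpenMs = maxOpenMs;
    this.closeBackchannel = closeBackchannel;
  }

  // Call when the backchannel is opened.
  opened(nowMs: number): void {
    this.openedAtMs = nowMs;
  }

  // Call when the backchannel is closed normally.
  closed(): void {
    this.openedAtMs = null;
  }

  // Call periodically. Force-closes the backchannel once it has been open
  // for maxOpenMs or longer, and returns true if it did so.
  check(nowMs: number): boolean {
    if (this.openedAtMs === null) {
      return false;
    }
    if (nowMs - this.openedAtMs < this.maxOpenMs) {
      return false;
    }
    this.closeBackchannel();
    this.openedAtMs = null;
    return true;
  }
}

const closeLog: string[] = [];
const watchdog = new BackchannelWatchdog(60000, () => {
  closeLog.push("forced close");
});
watchdog.opened(0);
console.log(watchdog.check(30000)); // false
console.log(watchdog.check(60000)); // true
console.log(watchdog.check(90000)); // false
```

Because the watchdog only touches the backchannel, the video stream would stay up when the timeout fires.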

Does this sound like a horrible design flaw for the doorbell? Absolutely. Sadly though, such doorbells seem to be fairly common, and at least for those of us with them, I'm hoping that some changes can make our lives a bit easier. This should have no negative impact on anyone with a properly functioning device, so it should be a win-win.