brutella / hkcam

Open-Source HomeKit Surveillance Camera
https://hochgatterer.me/hkcam/
Apache License 2.0

Support for two-way audio #134

Open nanosonde opened 1 year ago

nanosonde commented 1 year ago

HomeKit also supports video doorbells with two-way audio. The "only" thing missing here is the return audio from HomeKit to the Raspberry Pi. For this, the speaker service must be added to the camera services.

Then we could use another ffmpeg instance to decode the audio coming from HomeKit and play it on the ALSA speaker device. I already did this here: https://github.com/nanosonde/homebridge-camera-ffmpeg/blob/sipdoorbell/src/streamingDelegate.ts#L516-L574

(The code above connects to a negotiated SIP RTP endpoint. This would not be required here as we could directly access the ALSA audio device.)
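A rough sketch of that return path, for illustration only: describe the incoming SRTP audio in an SDP and feed it to a second ffmpeg process that writes to ALSA. The function names, payload type 110, the AAC-ELD `fmtp` line, and the `default` ALSA device are assumptions modelled on how homebridge-camera-ffmpeg handles return audio; they are not hkcam code, and the real address, port, and key come from the HomeKit stream negotiation.

```python
import base64
from typing import List


def build_return_audio_sdp(address: str, port: int, srtp_key_and_salt: bytes,
                           payload_type: int = 110,
                           sample_rate: int = 16000) -> str:
    """Build an SDP describing the SRTP audio stream sent by HomeKit.

    srtp_key_and_salt is the 16-byte master key followed by the 14-byte salt.
    The fmtp values below are an assumption for 16 kHz mono AAC-ELD.
    """
    ip_ver = "IP6" if ":" in address else "IP4"
    crypto = base64.b64encode(srtp_key_and_salt).decode("ascii")
    lines = [
        "v=0",
        f"o=- 0 0 IN {ip_ver} {address}",
        "s=Talk",
        f"c=IN {ip_ver} {address}",
        "t=0 0",
        f"m=audio {port} RTP/AVP {payload_type}",
        f"a=rtpmap:{payload_type} MPEG4-GENERIC/{sample_rate}/1",
        f"a=fmtp:{payload_type} profile-level-id=1;mode=AAC-hbr;"
        "sizelength=13;indexlength=3;indexdeltalength=3;"
        "config=F8F0212C00BC00",
        f"a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:{crypto}",
    ]
    return "\r\n".join(lines) + "\r\n"


def build_ffmpeg_return_args(alsa_device: str = "default") -> List[str]:
    """Arguments for an ffmpeg process that reads the SDP from stdin
    and plays the decoded audio on an ALSA device."""
    return [
        "ffmpeg", "-hide_banner",
        "-protocol_whitelist", "pipe,udp,rtp,file,crypto",
        "-f", "sdp",
        "-c:a", "libfdk_aac",   # requires an ffmpeg build with libfdk_aac
        "-i", "pipe:",
        "-f", "alsa", alsa_device,
    ]
```

The process would be started with `subprocess.Popen(build_ffmpeg_return_args(), stdin=subprocess.PIPE)` and the SDP text written to its stdin.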

brutella commented 1 year ago

Does your code also work with audio from the Raspberry Pi to HomeKit?

nanosonde commented 1 year ago

I have not tried this use case yet, as I have a Wantec Monolith C IP video SIP doorbell.

However, I am currently working on something completely new that could also be interesting for you. I have created a Python script which uses gstreamer to do video and two-way audio towards HomeKit. The video and audio input/output could be anything. It is already working with the videotestsrc/audiotestsrc or rtspsrc gstreamer elements. The audio output could be any sink. For testing, I am currently using a filesink to write uncompressed WAV files.
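As a stand-in for that kind of test setup, the snippet below assembles a gstreamer pipeline description that writes test audio to a WAV file. It is only a sketch: `build_test_pipeline` is a hypothetical helper and not part of the script mentioned above, and in the real setup the source would be the audio received from HomeKit rather than `audiotestsrc`.

```python
def build_test_pipeline(wav_path: str, num_buffers: int = 100) -> str:
    """Return a gst-launch style description: test tone -> WAV file.

    Note: wav_path is inserted as-is, so paths containing spaces
    would need quoting.
    """
    elements = [
        f"audiotestsrc num-buffers={num_buffers}",  # finite test tone
        "audioconvert",                              # normalise the raw format
        "wavenc",                                    # wrap samples in a WAV container
        f"filesink location={wav_path}",             # write to disk
    ]
    return " ! ".join(elements)
```

The resulting string can be passed to `gst-launch-1.0` on the command line or to `Gst.parse_launch()` from the Python bindings.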

My plan is to offer a solution which pulls the streaming and transcoding handling (FFMPEG) completely out of any HomeKit software package. Instead, my Python-based solution is supposed to run in a Docker container (for example) and offer a REST API that is basically connected to PrepareStream/StartStream/StopStream in the corresponding software package (homebridge, ...). For testing this, I am using a modified version of the homebridge-unifi-protect plugin from which I have completely ripped out the Unifi stuff. Instead, the controller is now an instance of my gstreamer Python script with the REST API.
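To make the idea of mapping PrepareStream/StartStream/StopStream onto a REST API more concrete, here is a minimal sketch. The routes (`POST /streams`, `POST /streams/<id>/start`, `DELETE /streams/<id>`), the `StreamController` class, and the status codes are all hypothetical and chosen for illustration; the actual API of the script described above is not known. No HTTP server or gstreamer code is included, only the dispatch logic.

```python
import uuid
from typing import Dict, Tuple


class StreamController:
    """Tracks stream sessions; a real version would drive gstreamer pipelines."""

    def __init__(self) -> None:
        self.sessions: Dict[str, str] = {}

    def prepare(self) -> str:
        # PrepareStream: allocate a session (ports and SRTP keys in a real version)
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = "prepared"
        return session_id

    def start(self, session_id: str) -> None:
        # StartStream: only valid for a session that was prepared first
        if self.sessions.get(session_id) != "prepared":
            raise ValueError("session is not in the prepared state")
        self.sessions[session_id] = "streaming"

    def stop(self, session_id: str) -> None:
        # StopStream: forget the session; stopping twice is harmless
        self.sessions.pop(session_id, None)


def handle(controller: StreamController, method: str, path: str) -> Tuple[int, dict]:
    """Map an HTTP method and path onto the controller; returns (status, body)."""
    parts = [p for p in path.split("/") if p]
    if method == "POST" and parts == ["streams"]:
        return 201, {"session": controller.prepare()}
    if (method == "POST" and len(parts) == 3
            and parts[0] == "streams" and parts[2] == "start"):
        try:
            controller.start(parts[1])
        except ValueError as exc:
            return 409, {"error": str(exc)}
        return 200, {"state": "streaming"}
    if method == "DELETE" and len(parts) == 2 and parts[0] == "streams":
        controller.stop(parts[1])
        return 204, {}
    return 404, {"error": "not found"}
```

The HomeKit side (for example a homebridge plugin) would then call these three routes from its prepare, start, and stop handlers.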

Concerning two-way audio support, I plan to support the RTSP backchannel for the speaker output. Gstreamer already offers this for both the client and the server. The gstreamer Python script (with REST API) will be an RTSP client with backchannel support that can connect to two-way capable video cameras.

To also support my use case with the SIP camera, I plan to have a small gstreamer Python script which is basically an RTSP server with backchannel support on one side and a SIP client on the other side. This way I could easily negotiate a two-way RTP session between the SIP camera and this gstreamer script based on the INVITE SDP and the response SDP.
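The negotiation from the INVITE SDP and the response SDP essentially comes down to reading the audio `m=` and `c=` lines on both sides. The sketch below shows that step only, with hypothetical helper names; it ignores SIP signalling, `a=rtpmap` attributes, and direction attributes such as `a=sendrecv`.

```python
from typing import Dict, Optional


def parse_audio_endpoint(sdp: str) -> Optional[Dict[str, object]]:
    """Extract address, port and payload types of the first audio section."""
    session_addr = None
    audio = None
    seen_media = False
    in_audio = False
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            if audio is not None:
                break  # the audio section has ended
            seen_media = True
            fields = line[2:].split()
            in_audio = fields[0] == "audio"
            if in_audio:
                audio = {
                    "port": int(fields[1].split("/")[0]),
                    "payloads": [int(f) for f in fields[3:]],
                }
        elif line.startswith("c="):
            addr = line.split()[-1]
            if in_audio:
                audio["address"] = addr   # media-level address wins
            elif not seen_media:
                session_addr = addr       # session-level default
    if audio is None:
        return None
    audio.setdefault("address", session_addr)
    return audio


def negotiate(offer_sdp: str, answer_sdp: str) -> Dict[str, object]:
    """Combine offer and answer into the two RTP endpoints of the session."""
    offer = parse_audio_endpoint(offer_sdp)
    answer = parse_audio_endpoint(answer_sdp)
    if offer is None or answer is None:
        raise ValueError("both SDPs need an audio section")
    common = [p for p in answer["payloads"] if p in offer["payloads"]]
    return {
        "offerer": (offer["address"], offer["port"]),
        "answerer": (answer["address"], answer["port"]),
        "payloads": common,
    }
```

Each side sends its RTP audio to the other side's address and port, using one of the payload types both listed.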

The good thing about using gstreamer for HomeKit streaming is that it supports much more of what the initial HomeKit requirements ask for concerning RTP/RTCP and the SRTP/SAVPF profile.

Ah, and by the way: while taking a deep dive into gstreamer internals and reading parts of the related RFC, I was also able to get full OPUS audio codec support. For this I had to patch the gstreamer sources. I am pretty sure the idea of the patch would also work for the FFMPEG sources, as I have studied the related code there. So it would NOT be required anymore to use the AAC ELD codec from libfdk.