Sunoo / homebridge-camera-ffmpeg

Homebridge Plugin Providing FFmpeg-based Camera Support
https://sunoo.github.io/homebridge-camera-ffmpeg/
Apache License 2.0
1.09k stars, 227 forks

Two-way audio based on SIP #928

Open nanosonde opened 3 years ago

nanosonde commented 3 years ago

@Sunoo @longzheng After reading the two-way audio issue, I think it is worth creating a separate follow-up issue that covers only video doorbells providing two-way audio based on SIP. (See the earlier discussion here: https://github.com/Sunoo/homebridge-camera-ffmpeg/issues/738#issuecomment-680557506)

Example devices:

Video is very often implemented as just an MJPEG or H.264 stream via HTTP/RTSP. I guess there are also SIP video doorbells which offer video as part of the SIP session. In the former case, the video is normally completely separate from the audio part, which runs via SIP/RTP. In the latter case, I would assume that video and audio use the SIP early media feature to show video and audio before the call is actually picked up (during SIP RINGING).

I think the homebridge-ring plugin could serve as a good starting point, as it shows how to implement the SIP client based on @kirm's sip.js lib. It should then be easy to extract the relevant SDP from the SIP INVITE media negotiation.

Remark: Of course there are a few SIP apps out there which could also somehow cover the use case and also offer the Apple VoIP notification feature (like linphone) to receive calls even if the app is not in the foreground. However, this feature request is about using SIP video doorbells as HomeKit video doorbells without any additional app. Only homebridge will talk to the SIP video doorbell.

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Sunoo commented 3 years ago

I’m really not sure if SIP is appropriate for this plugin. If it was to be added, it would be a ways down the line.

I also don’t have any cameras that support SIP, so development and testing would be a bit difficult.

nanosonde commented 3 years ago

After investigating a bit more and looking at various HomeKit projects, I came to the conclusion that it really is out of scope for this plugin.

So I think I will go this way:

Do you think that this is feasible?

Sunoo commented 3 years ago

Seems like it should be, yes. If you write up your results, I'll be happy to point people towards that if they want to do a similar thing.

nanosonde commented 3 years ago

Ok, I will close it for now.

nanosonde commented 3 years ago

Ah, sorry. One question that should fit in the scope of this plugin.

Could you please provide a "returnAudioTarget" command line for FFMPEG which just sends the FFMPEG output to a local sound card?

Sunoo commented 3 years ago

Sure, I have one in my notes, I'll dig it up this afternoon.

nanosonde commented 3 years ago

@Sunoo In the meantime I have read a lot about ffmpeg/gstreamer RTP, SRTP with AVP(F)/SAVP(F), ALSA, Pulseaudio, SIP, baresip and so on.

My approach using ALSA loopback device and some SIP client (I use baresip) seems to be quite promising during my first experiments.

I have used this ffmpeg config as a starting point:

ffmpeg -f mjpeg -r 15 -i "http://192.168.10.22:8080/?action=stream" \
 -f alsa -i hw:1,1 \
 -vcodec libx264 -x264-params keyint=25:min-keyint=25 -f rawvideo -preset ultrafast -tune zerolatency -payload_type 99 -ssrc 16132552 -an -sn -dn -flags global_header \
 -f rtp "rtp://192.168.10.101:58536?rtcpport=58537&localrtcpport=58537&localrtpport=58536&pkt_size=1316" \
 -acodec libfdk_aac -profile:a aac_eld -flags +global_header -payload_type 100 -ssrc 17132553 -ar 48000 -ac 2 -vn -sn -dn \
 -f rtp "rtp://192.168.10.101:58538?rtcpport=58539&localrtcpport=58539&localrtpport=58538&pkt_size=1316"
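
The two RTP targets in the command above follow one pattern: RTCP uses the RTP port plus one, and the local ports mirror the remote ones. A minimal sketch of that pattern, assuming a hypothetical helper name `rtp_target` (it is not part of the plugin or of ffmpeg):

```python
def rtp_target(host: str, rtp_port: int, pkt_size: int = 1316) -> str:
    """Build an RTP output URL shaped like the ones in the command above.

    Assumption taken from the example: RTCP runs on rtp_port + 1 and the
    local ports equal the remote ones.
    """
    rtcp_port = rtp_port + 1
    return (
        f"rtp://{host}:{rtp_port}"
        f"?rtcpport={rtcp_port}&localrtcpport={rtcp_port}"
        f"&localrtpport={rtp_port}&pkt_size={pkt_size}"
    )


# The video and audio targets used in the example command.
video_url = rtp_target("192.168.10.101", 58536)
audio_url = rtp_target("192.168.10.101", 58538)
```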

I have loaded the ALSA loopback module: sudo modprobe snd-aloop

Now I have installed baresip-core on Ubuntu. In the baresip config under ~/.baresip/config I set up the audio config like this:

audio_player      alsa,hw:1,0
audio_source      alsa,hw:1,0
audio_alert       alsa,hw:1,0
ausrc_srate       48000
auplay_srate      48000
ausrc_channels    2
auplay_channels   2

So baresip will play and record the SIP audio to the one and only loopback device. It is full-duplex, so it will work simultaneously with playback and capture.

The ffmpeg command line from above receives on hw:1,1 what baresip plays and streams it via RTP. BTW: I use VLC without SRTP for testing at the moment.
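
The pairing used here (baresip on hw:1,0, ffmpeg on hw:1,1) relies on how snd-aloop wires its two devices together: whatever is played into device 0 can be captured from device 1 of the same card and subdevice, and vice versa. A small sketch of that mapping, with `loopback_peer` as a hypothetical helper name:

```python
def loopback_peer(device: str) -> str:
    """Return the snd-aloop device that carries the other end of `device`.

    Accepts names like "hw:1,0" or "hw:1,1,3" (card, device[, subdevice]).
    """
    prefix, _, rest = device.partition(":")
    parts = rest.split(",")
    if prefix not in ("hw", "plughw") or len(parts) not in (2, 3):
        raise ValueError(f"not a hw-style ALSA device name: {device!r}")
    if parts[1] not in ("0", "1"):
        raise ValueError("snd-aloop cards only expose devices 0 and 1")
    # Swap device 0 <-> 1; card and subdevice stay the same.
    parts[1] = "1" if parts[1] == "0" else "0"
    return f"{prefix}:{','.join(parts)}"
```

With the config above, `loopback_peer("hw:1,0")` gives the device that ffmpeg has to open.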

To test "return audio" I played an MP3 file: mpg123 -a hw:1,1 test.mp3. The doorbell played back the audio without any problems.

So what does all that mean for this plugin? What I would require is a PRE and a POST hook before/after the FFMPEG invocation to be able to set everything up and take it down again. Especially when using ALSA loopback it is important that "problematic programs" open the loopback device FIRST so that they can freely set the sample rate, number of channels, sample format, etc. Another user of the loopback device (in our case ffmpeg) would have to "live" with the configured settings. However, this is not a problem for ffmpeg as it can convert audio to whatever is required.

Do you think you could add some pre/post script execution config commands that get executed?
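
To make the request concrete, here is a minimal sketch of a PRE/POST hook wrapper. It is an illustration only, not how the plugin works: `run_with_hooks` and its parameters are hypothetical names, and the commands are whatever the user configures.

```python
import subprocess
from typing import Optional, Sequence


def run_with_hooks(
    main_cmd: Sequence[str],
    pre_cmd: Optional[Sequence[str]] = None,
    post_cmd: Optional[Sequence[str]] = None,
) -> int:
    """Run pre_cmd, then main_cmd, then post_cmd; return main_cmd's exit code."""
    if pre_cmd is not None:
        # check=True: a failing PRE hook aborts before main_cmd is started.
        subprocess.run(pre_cmd, check=True)
    try:
        # Blocks until main_cmd (e.g. the ffmpeg process) has exited.
        return subprocess.run(main_cmd).returncode
    finally:
        if post_cmd is not None:
            # The POST hook runs even if main_cmd could not be started.
            subprocess.run(post_cmd)
```

The PRE hook is the place where the SIP client would open the loopback device first, and the POST hook is where it would be taken down again.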

Sunoo commented 3 years ago

This is not the first use case that’s come to me that could use pre or post execution jobs. I have some ideas as to how to implement that somewhat cleanly. I’ll probably work on it after Christmas. There is another version I need to push before I dive into that, but that one shouldn’t be too hard.

Sunoo commented 3 years ago

@nanosonde Would something like one of these options resolve your use case? https://github.com/Sunoo/homebridge-camera-ffmpeg/issues/929#issuecomment-782941672

I’m still giving some thought to how best to handle this sort of thing.

nanosonde commented 3 years ago

@Sunoo

I have read your suggested options.

The issue we should consider for options 2 and 3 is that we need some kind of handshake BEFORE the actual ffmpeg process is started. This is required because it must be possible to set up loopback devices that ffmpeg will use when it is started afterwards. So the plugin would have to wait for some ACK before it proceeds to start ffmpeg. Maybe with some default timeout in case the external script does not work properly.

If I had to choose, I would go with MQTT instead of HTTP. I have a broker running anyway. I guess that people who need an advanced setup with external scripts should be able to handle the broker requirement.
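
The handshake with a default timeout could be sketched roughly like this on the plugin side. The transport (MQTT, HTTP, ...) is left out on purpose; `prepare_resource` and `send_request` are hypothetical names, and the 2 second default is only a placeholder.

```python
import threading
from typing import Callable


def prepare_resource(
    send_request: Callable[[Callable[[], None]], None],
    timeout_s: float = 2.0,
) -> bool:
    """Send a 'prepare' request and wait up to timeout_s for the ACK.

    send_request receives an `ack` callback which the transport layer
    should call once the external script has answered. Returns True if
    the ACK arrived in time and False on timeout; the caller decides
    whether to start ffmpeg anyway.
    """
    acked = threading.Event()
    send_request(acked.set)
    return acked.wait(timeout_s)
```

A caller would start ffmpeg once this returns, logging a warning when the result is False.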

nanosonde commented 3 years ago

@startuml
Plugin -> Script : Request Prepare Resource
Script --> Plugin : ACK

Plugin --> Plugin : Use resource (e.g. audio device as input device in ffmpeg)

Plugin -> Script : Request Shutdown Resource
Script --> Plugin : ACK
@enduml


Sunoo commented 3 years ago

Hmm, good point on the ACK, hadn’t thought about that. There would probably have to be a fairly short timeout on waiting on a response from the script if I did wait for a response.

Also, some scripts that hook into this would likely have no reason to delay the stream, but I suppose either a configuration setting or just documentation that they should ACK immediately in that case should solve that.

Just thinking out loud a bit, but if it would just be two way audio that would need the ACK, I wonder if there is a reasonable way to start sending return audio towards your script and just have it pick that up and start working with it as soon as possible. This would have the best user experience, since loading the video wouldn’t be delayed, it may just take a second or two for return audio to start working after it loads. Configuring the two way audio setting in the plugin to point at a FIFO or something could be the solution for that.
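
A rough sketch of the FIFO idea from the script's point of view, assuming a POSIX system and that whatever produces the return audio writes to the same path. `ensure_fifo` and `consume_fifo` are hypothetical names, and nothing here is an existing plugin feature.

```python
import os
import stat
from typing import Callable


def ensure_fifo(path: str) -> None:
    """Create a named pipe at path unless one already exists there."""
    try:
        os.mkfifo(path)
    except FileExistsError:
        if not stat.S_ISFIFO(os.stat(path).st_mode):
            raise  # something else is in the way


def consume_fifo(
    path: str,
    handle_chunk: Callable[[bytes], None],
    chunk_size: int = 4096,
) -> None:
    """Block until a writer opens the FIFO, then forward data until it closes."""
    # buffering=0: read() returns as soon as some data is available.
    with open(path, "rb", buffering=0) as fifo:
        while True:
            chunk = fifo.read(chunk_size)
            if not chunk:  # the writer closed its end
                break
            handle_chunk(chunk)
```

The script can call `consume_fifo` whenever it is ready, which is what lets the video start without waiting for it.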

nanosonde commented 3 years ago

What I would like to do is use an existing SIP command line client to handle the SIP communication with two-way audio, as I do not want to maintain the SIP part.

If the PrepareResourceRequest comes in, I would like to start the SIP command line client, return immediately and send the ACK to the plugin. This will make sure that the client has already grabbed the corresponding ALSA devices. In parallel, the SIP client starts initiating the SIP call while the plugin can already start the ffmpeg processing to get the video stream as early as possible.

So the timeout could really be fairly small I think. Just enough time to start another linux process which opens some device files.

nanosonde commented 3 years ago

BTW: I am not sure if we need an ACK during shutdown. It will be called AFTER the ffmpeg process is finished. So no delay is necessary here.
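
On the script side, the behaviour described in the last two comments might look roughly like the sketch below. The function names are hypothetical and the SIP client command is whatever the user runs (baresip in my experiments). Note that a successful spawn alone does not prove that the client has already opened its ALSA devices, which is why a short delay before the ACK may still be needed.

```python
import subprocess
from typing import Callable, Sequence


def on_prepare_request(
    sip_client_cmd: Sequence[str],
    send_ack: Callable[[], None],
) -> subprocess.Popen:
    """Launch the SIP client in the background and ACK right away."""
    # Popen returns once the process is spawned; it does not wait for
    # the SIP call to be established.
    proc = subprocess.Popen(sip_client_cmd)
    send_ack()
    return proc


def on_shutdown_request(proc: subprocess.Popen, grace_s: float = 2.0) -> None:
    """Stop the SIP client; no ACK is sent on shutdown."""
    proc.terminate()
    try:
        proc.wait(timeout=grace_s)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
```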

Sunoo commented 3 years ago

I totally understand not wanting to maintain a SIP implementation, that's not my idea of fun. I'm going to keep thinking on this. Perhaps trying to come up with a one-size-fits-all solution as I have been isn't worth it. Though delaying video until the ACK also allows for the potential need to set some process up to pull video from as well.

I'm not sure how patient HomeKit is when waiting for frames, I'll have to do some testing at some point (might be the same ~22 seconds that it waits for snapshots). Obviously any delay negatively impacts user experience though, and should be avoided where possible.

Also, I agree, no ACK is likely required on shutdown, as there is no reason (and in many cases, no ability) to delay stopping the stream.

spbroot commented 3 years ago

Maybe this will be useful to someone. I use this option to implement two-way audio with a SIP intercom: https://github.com/spbroot/sipdoorbell (Homebridge-camera-ffmpeg + Baresip + ALSA loopback).

nanosonde commented 3 years ago

> Maybe this will be useful to someone. I use this option to implement two-way audio with a SIP intercom: https://github.com/spbroot/sipdoorbell (Homebridge-camera-ffmpeg + Baresip + ALSA loopback).

Which SIP intercom are you using?

Sunoo commented 3 years ago

@spbroot Nice, if you share how you did your setup somewhere, I can add the instructions to the project site.

nanosonde commented 3 years ago

> @spbroot Nice, if you share how you did your setup somewhere, I can add the instructions to the project site.

Couldn't the HTTP calls to answer and hang up be executed as part of the pre- and post-hooks when running the FFMPEG process?

So I think it is enough if the camera-ffmpeg plugin just calls a webhook before and after executing FFMPEG without waiting for a reply.

The audio loopback device can always be opened by FFMPEG. The video stream is always there for most SIP intercoms where the video is independent from the audio part.

spbroot commented 3 years ago

> > Maybe this will be useful to someone. I use this option to implement two-way audio with a SIP intercom: https://github.com/spbroot/sipdoorbell (Homebridge-camera-ffmpeg + Baresip + ALSA loopback).
>
> Which SIP intercom are you using?

Hi. I am using an analog intercom and a SIP converter is connected to it.

spbroot commented 3 years ago

> @spbroot Nice, if you share how you did your setup somewhere, I can add the instructions to the project site.

Ok, I will do it.

spbroot commented 3 years ago

> > @spbroot Nice, if you share how you did your setup somewhere, I can add the instructions to the project site.
>
> Couldn't the HTTP calls to answer and hang up be executed as part of the pre- and post-hooks when running the FFMPEG process?
>
> So I think it is enough if the camera-ffmpeg plugin just calls a webhook before and after executing FFMPEG without waiting for a reply.
>
> The audio loopback device can always be opened by FFMPEG. The video stream is always there for most SIP intercoms where the video is independent from the audio part.

Hi. I would also like to handle SIP call control purely through HomeKit functionality, but for this I need to come up with a way to interact with the plugin. At first I had the idea of tracking the creation of the FFMPEG process that is launched on the system with the parameters of my device, but this is not an option for me, because I have several Apple TVs in my house that notify about the doorbell, and they request video, thereby starting the FFMPEG process when the doorbell rings. I think the best option is to establish a SIP connection when you press the TALK button, and disconnect it when you press it again. So far, it has only been possible to implement this by parsing the Homebridge logs with "Two-way FFmpeg Debug Logging" enabled, but this is a bad solution. (I have updated the information with a new script that does this.)

I think if the ability to execute external scripts is added to the plugin in the future, it would be nice to also run external commands when the TALK button is pressed and released (if possible). It would also be nice to be able to query the plugin via HTTP for a given device (something like http://homebridge:8080/status?Doorbell) and receive a response with the status: whether the TALK button is pressed, etc. Then everyone would be able to parse the parameters and state they need.

mrMiimo commented 1 year ago

@Sunoo Can we have a version (the pull request is still open) that supports SIP calls? Or how do I install the fork that contains it? Thanks a lot!

Sunoo commented 1 year ago

I’ll try to get that merged soon, I can’t truly test it myself, but I suppose it must work.

longzheng commented 1 year ago

> I’ll try to get that merged soon, I can’t truly test it myself, but I suppose it must work.

Is there a SIP doorbell you'd be interested in installing? Maybe we can chip in one for you.

Sunoo commented 1 year ago

I’d be open to installing one, not sure what’s even available as far as those go to be honest.

mrMiimo commented 1 year ago

> I’ll try to get that merged soon, I can’t truly test it myself, but I suppose it must work.

Is there a way I can test it? Maybe if you can create a branch ...

stephanlinke87 commented 1 year ago

Willing to test this! :) I do have a doorbird that can initiate sip calls upon ringing. Their API documentation has the SIP stuff starting at page 33 :)

edelmaca commented 1 year ago

This one would be really nice. I'm using a 2N IP Verso 2. That is capable of initiating SIP calls as well.

VCTGomes commented 1 year ago

Well, I finally managed to connect my Hikvision doorbell to Baresip via SIP. It's registering and even ringing (I have only tested calling from the indoor panel to the phone).

However, if I try to open the live video in HomeKit, I get the following error:

[8/3/2023, 11:22:11 PM] [Camera FFmpeg] [Doorbell] [h264 @ 0x562a76a025c0] [error] non-existing PPS 0 referenced
[8/3/2023, 11:22:11 PM] [Camera FFmpeg] [Doorbell] [h264 @ 0x562a76a025c0] [error] non-existing PPS 0 referenced
[8/3/2023, 11:22:11 PM] [Camera FFmpeg] [Doorbell] [h264 @ 0x562a76a025c0] [error] decode_slice_header error
[8/3/2023, 11:22:11 PM] [Camera FFmpeg] [Doorbell] [h264 @ 0x562a76a025c0] [error] no frame!
[8/3/2023, 11:22:11 PM] [Camera FFmpeg] [Doorbell] [h264 @ 0x562a76a025c0] [error] non-existing PPS 0 referenced
[8/3/2023, 11:22:11 PM] [Camera FFmpeg] [Doorbell] [h264 @ 0x562a76a025c0] [error] non-existing PPS 0 referenced
[8/3/2023, 11:22:11 PM] [Camera FFmpeg] [Doorbell] [h264 @ 0x562a76a025c0] [error] decode_slice_header error
[8/3/2023, 11:22:11 PM] [Camera FFmpeg] [Doorbell] [h264 @ 0x562a76a025c0] [error] no frame!
[8/3/2023, 11:22:11 PM] [Camera FFmpeg] [Doorbell] [h264 @ 0x562a76a025c0] [error] non-existing PPS 0 referenced
[8/3/2023, 11:22:11 PM] [Camera FFmpeg] [Doorbell] [h264 @ 0x562a76a025c0] [error] non-existing PPS 0 referenced
[8/3/2023, 11:22:11 PM] [Camera FFmpeg] [Doorbell] [h264 @ 0x562a76a025c0] [error] decode_slice_header error
[8/3/2023, 11:22:11 PM] [Camera FFmpeg] [Doorbell] [h264 @ 0x562a76a025c0] [error] no frame!
[8/3/2023, 11:22:11 PM] [Camera FFmpeg] [Doorbell] [h264 @ 0x562a76a025c0] [verbose] Reinit context to 1920x1088, pix_fmt: yuvj420p
[8/3/2023, 11:22:12 PM] [Camera FFmpeg] [Doorbell] [warning] Guessed Channel Layout for Input Stream #0.1 : mono
[8/3/2023, 11:22:12 PM] [Camera FFmpeg] [Doorbell] [info] Input #0, rtsp, from 'rtsp://admin:NSLkdTfq0@192.168.2.100:554/Streaming/channels/101':
[8/3/2023, 11:22:12 PM] [Camera FFmpeg] [Doorbell] [info]   Metadata:
[8/3/2023, 11:22:12 PM] [Camera FFmpeg] [Doorbell] [info]     title           : Media Presentation
[8/3/2023, 11:22:12 PM] [Camera FFmpeg] [Doorbell] [info]   Duration: N/A, start: 0.000000, bitrate: N/A
[8/3/2023, 11:22:12 PM] [Camera FFmpeg] [Doorbell] [info]   Stream #0:0: Video: h264 (Baseline), 1 reference frame, yuvj420p(pc, bt709, progressive, left), 1920x1080 (1920x1088), 30 fps, 30 tbr, 90k tbn
[8/3/2023, 11:22:12 PM] [Camera FFmpeg] [Doorbell] [info]   Stream #0:1: Audio: pcm_mulaw, 8000 Hz, mono, s16, 64 kb/s
[8/3/2023, 11:22:12 PM] [Camera FFmpeg] [Doorbell] ALSA lib confmisc.c:165:(snd_config_get_card) Cannot get card index for Loopback
[8/3/2023, 11:22:12 PM] [Camera FFmpeg] [Doorbell] [alsa @ 0x562a76a76380] [error] cannot open audio device sipdoorbell_main (No such device)
[8/3/2023, 11:22:12 PM] [Camera FFmpeg] [Doorbell] [error] sipdoorbell_main: Input/output error
[8/3/2023, 11:22:12 PM] [Camera FFmpeg] [Doorbell] FFmpeg exited with code: 1 and signal: null (Error)
[8/3/2023, 11:22:12 PM] [Camera FFmpeg] [Doorbell] Stopped video stream.
[8/3/2023, 11:22:14 PM] [Camera FFmpeg] [Doorbell] [Two-way] FFmpeg exited with code: null and signal: SIGKILL (Forced)

I already added the ALSA configuration to both files (/usr/share/alsa/alsa.conf and /etc/asound.conf).

Where is my mistake?
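
One thing that can be checked from the log itself: "Cannot get card index for Loopback" means ALSA found no card with the id Loopback, which is the card that `sudo modprobe snd-aloop` (mentioned earlier in this thread) normally creates. Whether that is the actual cause here is an assumption. A small sketch to check for the card, with `has_alsa_card` as a hypothetical helper that parses text in the format of /proc/asound/cards:

```python
import re


def has_alsa_card(cards_text: str, card_id: str = "Loopback") -> bool:
    """Return True if /proc/asound/cards-style text lists a card with this id."""
    # Card lines look like: " 1 [Loopback       ]: Loopback - Loopback"
    pattern = re.compile(r"^\s*\d+\s+\[([^\s\]]+)\s*\]:", re.MULTILINE)
    return card_id in pattern.findall(cards_text)


# Usage on a Linux host (run it as the same user as Homebridge):
#   with open("/proc/asound/cards") as f:
#       print(has_alsa_card(f.read()))
```

If this reports False, the loopback module is not loaded for that system, and no ALSA configuration file can make the sipdoorbell_main device resolve.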

jmnovak50 commented 1 year ago

Hi,

Well, from my perspective I kind of gave up going through HB and decided to go through Scrypted. Also, I decided to switch doorbells and am now using Reolink under a similar configuration. I’m glad you got it working!

Best Regards,
Jason