prominenceai / deepstream-services-library

A shared library of on-demand DeepStream Pipeline Services for Python and C/C++
MIT License

WebRTC - Help for Understanding #1260

Open Ben93kie opened 1 month ago

Ben93kie commented 1 month ago

Hi! Recently stumbled across your repo, looks impressive!

However, I'm a bit unsure whether it serves my goal. I'd like to have a live feed fed into DeepStream, run primary inference, and then send the video stream and the synchronized metadata (bboxes) out via WebRTC to a browser (a single client would be fine).

I've seen the WebRTC sink

https://github.com/prominenceai/deepstream-services-library/blob/master/docs/api-sink.md#dsl_sink_webrtc_new

and

https://github.com/prominenceai/deepstream-services-library/blob/master/examples/python/1file_webrtc_connect_post_play.py

but I'm not sure whether that does what I'm after. Could you point me in the right direction?
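For context, here's roughly what I have in mind, pieced together from the docs and examples. This is just a sketch of my understanding; the argument lists and constants are my best guess from docs/api-sink.md and may well be wrong:

```python
#!/usr/bin/env python
# Rough sketch only: signatures/constants guessed from the docs, unverified,
# and the DSL_RETURN_SUCCESS checks from the examples are omitted.
from dsl import *

# Live source in (is-live=True, no frame skipping or dropping).
dsl_source_uri_new('live-source', 'rtsp://my-camera/stream', True, False, 0)

# Primary inference producing the bbox metadata I want to ship out.
dsl_infer_gie_primary_new('primary-gie',
    'config_infer_primary.txt', 'model_b1_gpu0_fp16.engine', 0)

# The WebRTC sink from docs/api-sink.md: STUN only, H.264 at 4 Mbps.
dsl_sink_webrtc_new('webrtc-sink',
    'stun://stun.l.google.com:19302', None, DSL_CODEC_H264, 4000000, 0)

dsl_pipeline_new_component_add_many('pipeline',
    ['live-source', 'primary-gie', 'webrtc-sink', None])

# The DSL Websocket Server handles the signaling for the WebRTC sink.
dsl_websocket_server_listening_start(DSL_WEBSOCKET_SERVER_DEFAULT_WEBSOCKET_PORT)
dsl_pipeline_play('pipeline')
dsl_main_loop_run()
```

That would cover the video, but not the separate bbox stream.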

Thank you!

rjhowell44 commented 4 weeks ago

I apologize for the late response @Ben93kie... just back from a short holiday.

re: "send out the video stream and the synchronized metadata (bboxes) via WebRTC"

Are you wanting to send the metadata as message data through the WebRTC data channel? Or do you just wish to add the bboxes and labels to the video for visualization with the On-Screen-Display and send the augmented video to the client via WebRTC?

Just FYI, the WebRTC implementation is based on what little information and examples I could find... and currently supports encoded video only. The data channel is unused.
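For the second option the change is small: add an OSD between the GIE and the sink. Something like this (from memory, so please check the OSD API reference for the exact display-enable flags):

```python
# Burn the bboxes and labels into the video before encoding.
# Flags assumed to be: text, clock, bbox, mask display enables.
dsl_osd_new('on-screen-display', True, False, True, False)

dsl_pipeline_new_component_add_many('pipeline',
    ['live-source', 'primary-gie', 'on-screen-display', 'webrtc-sink', None])
```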

Ben93kie commented 4 weeks ago

Thanks for your response! I'd like to send the boxes separately, for custom downstream visualizations in the browser. I know there is a data channel, but I'm not sure how I would feed it the metadata, especially given the tight synchronization needed. Currently I'm using the appsink approach from the Python imagedata-multistream example, adapted to run a Python-based websocket server that sends out MJPEG video, but I'm not happy with the performance.
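For reference, this is roughly how I pull the boxes out today, adapted from that example (the standard pyds iteration; my websocket/MJPEG code left out):

```python
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst
import pyds

def bbox_probe(pad, info, user_data):
    # Standard DeepStream Python metadata iteration, as in the
    # deepstream_imagedata-multistream example.
    gst_buffer = info.get_buffer()
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        boxes = []
        l_obj = frame_meta.obj_meta_list
        while l_obj is not None:
            obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
            r = obj_meta.rect_params
            boxes.append((obj_meta.class_id, r.left, r.top, r.width, r.height))
            try:
                l_obj = l_obj.next
            except StopIteration:
                break
        # frame_meta.buf_pts (or ntp_timestamp) is what I'd key the
        # synchronization on; 'boxes' currently goes out over my websocket.
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK
```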

Could you maybe point me in the right direction? Thanks!

rjhowell44 commented 3 weeks ago

I believe some users are streaming the video using an RTMP Sink and the metadata using a Message Sink, both to something like a media server. If you can do your visualization work in the media server, then clients can connect to it using WebRTC or other protocols.
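Roughly like the following. A sketch only; I'm writing the names and argument lists from memory, so please check docs/api-sink.md for the exact signatures:

```python
# Sketch only: names/arguments from memory, see docs/api-sink.md.
# Encoded video out to the media server.
dsl_sink_rtmp_new('rtmp-sink', 'rtmp://media-server/live/mystream')

# Object metadata out as messages, e.g. over Kafka to the same host.
dsl_sink_message_new('message-sink',
    'msgconv_config.txt',                  # message-converter config
    DSL_MSG_PAYLOAD_DEEPSTREAM_MINIMAL,    # payload schema
    None,                                  # broker config file (optional)
    '/opt/nvidia/deepstream/deepstream/lib/libnvds_kafka_proto.so',
    'media-server;9092', 'bbox-topic')

# A DSL pipeline tees to multiple sinks, so both can be added together.
dsl_pipeline_new_component_add_many('pipeline',
    ['live-source', 'primary-gie', 'rtmp-sink', 'message-sink', None])
```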

Using the WebRTC Sink and data channel to send the metadata is something I would need to research and develop.

Ben93kie commented 3 weeks ago

Nice, thanks! I was looking at this and got it to run. Apparently it runs more or less natively within a GStreamer pipeline, so I wondered whether it also works with DeepStream. The main question is how to ingest the metadata (bboxes) into the data channel in a synchronized fashion. One could do it with an appsink, but I worry about performance.
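Concretely, on the plain GStreamer side I was picturing something like this. Untested sketch: webrtcbin and osd are the elements from that demo pipeline, and extract_boxes() is just a hypothetical stand-in for the pyds iteration from my earlier comment:

```python
import json
import gi
gi.require_version('Gst', '1.0')
gi.require_version('GstWebRTC', '1.0')
from gi.repository import Gst, GstWebRTC

# Create the channel before generating the SDP offer so it gets negotiated.
channel = webrtcbin.emit('create-data-channel', 'metadata', None)

def meta_probe(pad, info, user_data):
    """Probe upstream of the encoder: ship the bboxes keyed by buffer PTS."""
    buf = info.get_buffer()
    if channel.props.ready_state == GstWebRTC.WebRTCDataChannelState.OPEN:
        # extract_boxes() = the pyds iteration from my probe sketch above.
        msg = {'pts': buf.pts, 'boxes': extract_boxes(buf)}
        channel.emit('send-string', json.dumps(msg))
    return Gst.PadProbeReturn.OK

# Attach where the batch meta is still present, e.g. the nvdsosd src pad.
osd.get_static_pad('src').add_probe(Gst.PadProbeType.BUFFER, meta_probe, None)
```

Mapping the sender-side PTS to whatever timestamp the decoded frame carries in the browser is the part I'd still need to figure out.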