stakwork / sphinx-mac

Sphinx app for mac desktop
MIT License

Real-time Extraction of Screen Frames and Speech from Jitsi WebRTC Call #392

Open tomsmith8 opened 6 months ago

tomsmith8 commented 6 months ago

Description

Provide a documented how-to solution for extracting periodic screen frames and audio (for speech recognition) in real time from a Jitsi WebRTC call. The extracted frames and audio would then be sent to another API for further analysis.

Objectives

Suggested tasks (please review for anything missing):

Access Media Streams

Capture Video Frames

Capture and Process Audio

Acceptance Criteria

JZ1999 commented 6 months ago

Hi @tomsmith8 I would like to help out with this!

gotohigher commented 6 months ago

The first step is to access the media streams from the Jitsi call. This would involve tapping into the WebRTC API to reach the video, audio, and screen-share streams. We'll make sure each stream is identified correctly and is accessible.
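As a rough illustration of the "identify each stream" step, here is a minimal sketch that sorts live tracks by their standard `kind` field. The `TrackLike` interface and the `groupLiveTracks` helper are hypothetical names introduced for this example, not part of Jitsi or WebRTC; the interface only mirrors the `kind`, `id`, and `readyState` fields of a browser `MediaStreamTrack` so the helper can also run outside a browser.

```typescript
// Structural view of a MediaStreamTrack (hypothetical helper type).
interface TrackLike {
  kind: string;       // "audio" or "video" on a real MediaStreamTrack
  id: string;
  readyState: string; // "live" or "ended" on a real MediaStreamTrack
}

interface GroupedTracks<T extends TrackLike> {
  audio: T[];
  video: T[];
}

// Keep only tracks that are still live and sort them by kind.
function groupLiveTracks<T extends TrackLike>(tracks: T[]): GroupedTracks<T> {
  const grouped: GroupedTracks<T> = { audio: [], video: [] };
  for (const track of tracks) {
    if (track.readyState !== "live") continue; // skip ended tracks
    if (track.kind === "audio") grouped.audio.push(track);
    else if (track.kind === "video") grouped.video.push(track);
  }
  return grouped;
}
```

In a browser this could be called as `groupLiveTracks(stream.getTracks())`, or fed with the tracks delivered by an `RTCPeerConnection` `track` event. Note that `kind` alone does not tell a camera track from a screen-share track; telling them apart would need Jitsi-side metadata, which this sketch does not cover.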

Capturing video frames can be done through a canvas context. We'd draw the current video frame onto an HTML canvas element, then call getImageData periodically to extract frames for real-time processing. The frames would go into a buffer so that extraction and downstream processing do not interfere with the active call.
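The canvas approach could look roughly like the sketch below. It assumes a fixed capture interval and a small bounded buffer that drops the oldest frame when full; both are illustrative choices, not requirements from this issue. `FrameLike`, `Context2DLike`, `FrameBuffer`, `grabFrame`, and `startFrameCapture` are hypothetical names; the two interfaces list only the `drawImage` and `getImageData` calls the sketch needs from a 2D canvas context.

```typescript
// Minimal shapes of ImageData and a 2D canvas context (hypothetical helper types).
interface FrameLike { width: number; height: number; data: Uint8ClampedArray; }
interface Context2DLike {
  drawImage(source: unknown, dx: number, dy: number, dw: number, dh: number): void;
  getImageData(sx: number, sy: number, sw: number, sh: number): FrameLike;
}

// Bounded FIFO: when full, the oldest item is dropped so a slow consumer
// cannot make memory grow without limit.
class FrameBuffer<T> {
  private items: T[] = [];
  private readonly capacity: number;
  constructor(capacity: number) { this.capacity = capacity; }
  push(item: T): void {
    if (this.items.length >= this.capacity) this.items.shift();
    this.items.push(item);
  }
  drain(): T[] { const out = this.items; this.items = []; return out; }
  get size(): number { return this.items.length; }
}

// Draw the current video frame onto the canvas and read the pixels back.
function grabFrame(ctx: Context2DLike, video: unknown, width: number, height: number): FrameLike {
  ctx.drawImage(video, 0, 0, width, height);
  return ctx.getImageData(0, 0, width, height);
}

// Capture one frame every intervalMs; returns a function that stops the capture.
function startFrameCapture(
  ctx: Context2DLike, video: unknown, width: number, height: number,
  intervalMs: number, buffer: FrameBuffer<FrameLike>,
): () => void {
  const timer = setInterval(() => buffer.push(grabFrame(ctx, video, width, height)), intervalMs);
  return () => clearInterval(timer);
}
```

In the browser, `ctx` would be the result of `canvas.getContext("2d")` and `video` an `HTMLVideoElement` whose `srcObject` is the screen-share stream. A separate consumer would call `buffer.drain()` and forward the frames to the analysis API.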

We'll use the Web Audio API to capture audio. An AudioWorklet can process audio samples in real time (ScriptProcessorNode does the same job but is deprecated, so it is only a fallback for older contexts). The samples can then be stored in a buffer, ready for speech recognition.
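For the buffering part, a minimal sketch could convert the Float32 samples that the Web Audio API delivers into 16-bit PCM and group them into fixed-size chunks. The 16-bit PCM format and the chunk size are assumptions about what the downstream speech recognition API expects; `floatTo16BitPCM` and `PcmChunker` are hypothetical names used only for this example.

```typescript
// Convert Web Audio samples (floats in [-1, 1]) to signed 16-bit PCM.
function floatTo16BitPCM(input: Float32Array): Int16Array {
  const out = new Int16Array(input.length);
  for (let i = 0; i < input.length; i++) {
    const s = Math.max(-1, Math.min(1, input[i])); // clamp out-of-range samples
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}

// Collects small sample blocks into fixed-size chunks and hands each
// complete chunk to a callback (for example, to send it to a speech API).
class PcmChunker {
  private readonly chunk: Int16Array;
  private readonly onChunk: (chunk: Int16Array) => void;
  private filled = 0;

  constructor(chunkSize: number, onChunk: (chunk: Int16Array) => void) {
    this.chunk = new Int16Array(chunkSize);
    this.onChunk = onChunk;
  }

  push(samples: Int16Array): void {
    let offset = 0;
    while (offset < samples.length) {
      const n = Math.min(samples.length - offset, this.chunk.length - this.filled);
      this.chunk.set(samples.subarray(offset, offset + n), this.filled);
      this.filled += n;
      offset += n;
      if (this.filled === this.chunk.length) {
        this.onChunk(this.chunk.slice()); // emit a copy, then reuse the buffer
        this.filled = 0;
      }
    }
  }
}
```

In an `AudioWorkletProcessor`, the `process()` callback receives the input samples as `Float32Array` blocks, which could be posted to the main thread through the processor's `port` and then passed through `floatTo16BitPCM` and `PcmChunker.push`.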

tomsmith8 commented 5 months ago

@JZ1999 Any update on providing a documented solution with Jitsi/Jibri/WebRTC for real-time streaming?

hkarani commented 5 months ago

@tomsmith8 can I work on this? My sphinx username is asterisk32 https://community.sphinx.chat/p/cmv6tnqtu2rk819pr5mg/assigned

tomsmith8 commented 5 months ago

@hkarani sure - for this bounty we're looking for a documented solution. Once we have one we're happy with, we'll look to break it out into further bounties (implementation).