c-frame / sponsorship

Link to issues outside the c-frame organization that need sponsors
https://github.com/orgs/c-frame/projects/2/views/1

[naf-livekit-adapter] networked-aframe adapter for LiveKit open source SFU #10

Open vincentfretin opened 11 months ago

vincentfretin commented 11 months ago

Create a networked-aframe adapter for LiveKit, an open source SFU based on the pion WebRTC stack (Go language).

The interesting part of this stack is using the egress plugin to record the audio on the server and transcribe it, with whisper for example. See recent experiments:

I'll need to record audio for a 3D meeting project for legal reasons, so I'll work on it. My monthly sponsors could have access to it once I have developed it. Please show your interest in this issue by adding a thumbs up and also by becoming a monthly sponsor. If this issue gets enough interest, I'll write proper documentation on self-hosting LiveKit and using the adapter with networked-aframe.

arpu commented 11 months ago

Hi, as an alternative, it could be possible to use a GStreamer Janus client to record or stream the audio from Janus: https://gitlab.freedesktop.org/gstreamer/gst-examples/-/tree/master/webrtc/janus

https://github.com/josephlim94/janus_gst_client_py

vincentfretin commented 11 months ago

Hi, it's a bit hard for me to modify Rust code to add new features. Although I learned the language, I haven't done any real Rust coding, except modifying some lines of code in janus-plugin-rs (the C to Rust binding) and in janus-plugin-sfu, which we use for naf-janus-adapter. Also, I need this recording feature to work without bugs by September/October. My use case is to record one-to-one meetings held simultaneously by hundreds, and later thousands, of people if this project is signed after the POC :crossed_fingers: LiveKit seems to be a battle-tested and well-documented solution for such a feature, and I should be able to code everything I need with their JavaScript server SDK. https://docs.livekit.io/server/egress/

Just for the archive, I want to mention that an old version of janus-conference, a Rust plugin for janus-gateway that also uses janus-plugin-rs like janus-plugin-sfu does, had a feature to record, transcode to webm, and then upload to an S3-compatible bucket. They have since modified it to record mjr dumps directly, and they now replay the packets directly from the mjr dumps when they need to play a course again. This is my understanding from just reading their code and PRs, see https://github.com/foxford/janus-conference/pull/135 and their dispatcher app, which now uses the mjr files directly: https://github.com/foxford/dispatcher/pull/37

arpu commented 11 months ago

Hi,

I think for this we do not need to change the janus-plugin-rs plugin. We can simply use GStreamer WebRTC to connect as a client and record all audio streams, or mix them into one stream -> CDN.

Another idea is to use the Janus audiobridge plugin, which does the mixing into one stream for you: https://fosdem.org/2023/schedule/event/janus/

vincentfretin commented 11 months ago

Connecting as a client from the server, and so subscribing to each participant's audio to record it, doesn't seem to be an efficient way of recording at scale: it doubles the number of Janus sessions. janus-conference uses the Recorder C API of janus-gateway to record the packets, see janus_recorder.rs (which could be backported to janus-plugin-rs) and their Recorder impl. That was what I had in mind if we wanted to record efficiently.

If I use janus-plugin-sfu and audiobridge in parallel, the user would need to upload their audio twice, which is not good if the user has a slow connection and is bad for server bandwidth. Also, I actually need a separate audio file for each participant: the customer has a solution for transcribing the audio files that needs a separate file per participant. Even for other projects, whisper STT works better if the audio contains a single participant.
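To make the two cost arguments above concrete, here is a rough back-of-the-envelope sketch in TypeScript. The function names are hypothetical and not part of Janus or any adapter. It assumes one Janus session per participant and, for the client-style recorder, one extra session per recorded participant, which is how I read the "multiply by two" remark.

```typescript
// Hypothetical helpers for illustration only; not a Janus API.

// Number of Janus sessions when every participant holds one session and,
// optionally, a server-side client opens one more session per participant
// to subscribe to and record their audio (assumption from the comment above).
function janusSessions(participants: number, recorderSubscribesAsClient: boolean): number {
  const recorderSessions = recorderSubscribesAsClient ? participants : 0;
  return participants + recorderSessions;
}

// Number of times a user uploads their audio: once per plugin they publish to.
function audioUploadsPerUser(pluginsPublishedTo: string[]): number {
  return pluginsPublishedTo.length;
}

// 1000 participants recorded through a subscribing client: sessions double.
console.log(janusSessions(1000, true)); // 2000
// Recording inside the plugin (Recorder C API): no extra sessions.
console.log(janusSessions(1000, false)); // 1000
// Publishing to both janus-plugin-sfu and audiobridge: audio is uploaded twice.
console.log(audioUploadsPerUser(["janus-plugin-sfu", "audiobridge"])); // 2
```

This only counts sessions and uploads; it says nothing about CPU or disk cost of each approach.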

vincentfretin commented 11 months ago

About LiveKit Egress: it uses headless Chrome for Room Composite (mixed audio and video in a grid). For Track Composite (mixed audio of all participants) and Track (one audio file for each participant), it doesn't use headless Chrome; it uses the server SDK and GStreamer directly to transcode, see the schema https://docs.livekit.io/server/egress/#service-architecture I'm interested in Track Egress here, saving each participant's audio.
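As a rough illustration of "one audio file for each participant", the sketch below builds one Track Egress request body per audio track. It is only a sketch under assumptions: the `room_name`, `track_id` and `file.filepath` field names follow my understanding of LiveKit's TrackEgressRequest message and should be checked against the egress docs linked above, while the `AudioTrack` type, the `buildTrackEgressRequests` helper and the `recordings/...` path layout are hypothetical. No request is sent here; the objects would still have to be passed to the server SDK or API.

```typescript
// Hypothetical description of a published audio track.
interface AudioTrack {
  participantIdentity: string;
  trackId: string;
}

// Assumed shape of a Track Egress request with a file output
// (field names are an assumption, verify against the LiveKit egress docs).
interface TrackEgressRequest {
  room_name: string;
  track_id: string;
  file: { filepath: string };
}

// Build one request per audio track so each participant gets a separate file.
function buildTrackEgressRequests(roomName: string, tracks: AudioTrack[]): TrackEgressRequest[] {
  return tracks.map((track) => ({
    room_name: roomName,
    track_id: track.trackId,
    // Hypothetical path layout: one file per participant and track.
    file: { filepath: `recordings/${roomName}/${track.participantIdentity}-${track.trackId}` },
  }));
}

// Example: a one-to-one meeting yields two requests, one per participant.
const requests = buildTrackEgressRequests("meeting-42", [
  { participantIdentity: "alice", trackId: "TR_a1" },
  { participantIdentity: "bob", trackId: "TR_b2" },
]);
console.log(requests.map((r) => r.file.filepath));
```

Keeping the participant identity in the file path makes it straightforward to hand each file to a transcription tool that expects a single speaker per file.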

vincentfretin commented 10 months ago

Related blog article: https://blog.livekit.io/livekit-universal-egress-launch/