Open ramyak-mehra opened 1 month ago
Hi @ramyak-mehra! Welcome to str0m!
When you say media encoding/decoding support, do you mean codec integrations like libav, libopus, openh264 etc?
Yes. I know that would be part of much broader support for media devices in general, but I just wanted to get the ball rolling, I guess.
Adding encoding and decoding into str0m is probably not the greatest idea. Everyone has different needs and one of the benefits with str0m is that it is lightweight and doesn't come with "batteries included".
Now building a separate repo or examples that implements an audio/video/mcu/client pipeline utilizing str0m is probably a better idea.
My/our focus has mainly been on server side usage (SFU), which means media devices haven't been that high up on my mind. But of course str0m should also be for clients, and there are definitely advantages of having a tight integration with the codecs when adjusting for BWE etc.
I'm open to exploring how to best approach this. I think it's worth investigating what the advantages of tight codec integrations are. I believe in libWebRTC, adjusting bitrates is just the start. Can the hooks be formulated as a set of traits?
A set of traits that we could then either make concrete in a separate crate, or maybe behind feature flags?
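To make the trait idea a bit more concrete, here is a minimal sketch of what such a hook could look like. Everything in it (`BitrateHook`, `adjust_bitrate`, `ClampedEncoder`, `notify_all`) is a hypothetical name for illustration, not part of str0m's API, and a real integration would likely need more than a single bitrate callback.

```rust
/// Hypothetical hook (not str0m API): a codec integration implements this
/// to react to bandwidth estimates.
pub trait BitrateHook {
    /// Called with the latest estimated available bitrate, in bits per second.
    fn adjust_bitrate(&mut self, estimate_bps: u64);
}

/// Toy stand-in for an encoder wrapper. It only tracks a target bitrate
/// and keeps it inside a configured range.
pub struct ClampedEncoder {
    min_bps: u64,
    max_bps: u64,
    target_bps: u64,
}

impl ClampedEncoder {
    pub fn new(min_bps: u64, max_bps: u64) -> Self {
        assert!(min_bps <= max_bps, "min_bps must not exceed max_bps");
        Self { min_bps, max_bps, target_bps: min_bps }
    }

    pub fn target_bps(&self) -> u64 {
        self.target_bps
    }
}

impl BitrateHook for ClampedEncoder {
    fn adjust_bitrate(&mut self, estimate_bps: u64) {
        // A real encoder would be reconfigured here; the sketch just stores
        // the clamped value.
        self.target_bps = estimate_bps.clamp(self.min_bps, self.max_bps);
    }
}

/// Forward one estimate to every registered hook.
pub fn notify_all(hooks: &mut [&mut dyn BitrateHook], estimate_bps: u64) {
    for hook in hooks.iter_mut() {
        hook.adjust_bitrate(estimate_bps);
    }
}
```

A concrete implementation of the trait could then live in a separate crate or behind a feature flag, so the core crate never depends on a codec.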
I like the idea of a separate crate too, to have an ecosystem of stuff related to str0m that integrates well with the whole API. I think it's mostly related to things you do with the information from BWE.
"Can the hooks be formulated as a set of traits?" Something along the lines of an adjust-bitrate hook.
We can do it via events as well (not sure if that already exists), and the clients can poll for them and use the information.
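As a rough sketch of the event-based alternative: the library side queues estimates and the client polls them whenever it is convenient. `MediaEvent`, `EventQueue` and `drain_latest_estimate` are made-up names for this example and do not describe what str0m currently exposes.

```rust
use std::collections::VecDeque;

/// Hypothetical event type (not str0m API).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MediaEvent {
    /// New bandwidth estimate in bits per second.
    BitrateEstimate { bps: u64 },
}

/// Queue the library side fills and the client side polls.
#[derive(Default)]
pub struct EventQueue {
    pending: VecDeque<MediaEvent>,
}

impl EventQueue {
    /// Library side: record a new estimate.
    pub fn push_estimate(&mut self, bps: u64) {
        self.pending.push_back(MediaEvent::BitrateEstimate { bps });
    }

    /// Client side: take the oldest pending event, if any.
    pub fn poll(&mut self) -> Option<MediaEvent> {
        self.pending.pop_front()
    }
}

/// Client helper: drain the queue and keep only the newest estimate,
/// since older ones are stale by the time the encoder is reconfigured.
pub fn drain_latest_estimate(queue: &mut EventQueue) -> Option<u64> {
    let mut latest = None;
    while let Some(MediaEvent::BitrateEstimate { bps }) = queue.poll() {
        latest = Some(bps);
    }
    latest
}
```

Compared with a trait callback, polling keeps the library free of user code running inside it, at the cost of the client having to remember to poll.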
Here are my 2 cents on a media client pipeline, assuming you want to stay as pure Rust as possible.
For Video:
If you want to do pure Rust, I would suggest starting to experiment with the AV1 encoders and decoders, and also adding a packetizer/depacketizer to str0m.
https://github.com/algesten/str0m/issues/541
If you don't want to write a packetizer, I would start with https://docs.rs/openh264/latest/openh264/all.html, which is easy to get started with.
For webcam capture there is: https://github.com/l1npengtul/nokhwa
For rendering I have no clue :)
For audio there is CPAL for audio devices: https://github.com/RustAudio/cpal. I am sure there is an Opus crate out there.
The big missing pieces in open source for Rust are a decent jitter buffer and acoustic echo cancellation. A simple jitter buffer is not that hard to implement; acoustic echo cancellation is trickier, but one could wrap or port the WebRTC one.
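To give an idea of what "simple" could mean here, below is a minimal reordering buffer. It is a sketch under assumptions, not a production jitter buffer: it assumes sequence numbers have already been extended to `u64` (so no 16-bit wraparound handling), it skips a gap based on buffer depth rather than on timing, and the first packet popped defines where the stream starts. `JitterBuffer` and its methods are names invented for this example.

```rust
use std::collections::BTreeMap;

/// Minimal reordering buffer keyed by an already-extended sequence number.
pub struct JitterBuffer<T> {
    packets: BTreeMap<u64, T>,
    /// Sequence number expected next; `None` until the first packet is popped.
    next_seq: Option<u64>,
    /// How many packets may queue up behind a gap before the gap is skipped.
    max_depth: usize,
}

impl<T> JitterBuffer<T> {
    pub fn new(max_depth: usize) -> Self {
        Self { packets: BTreeMap::new(), next_seq: None, max_depth }
    }

    /// Store a packet. Returns `false` if it arrived too late, meaning its
    /// position in the stream has already been released or skipped.
    pub fn push(&mut self, seq: u64, payload: T) -> bool {
        if matches!(self.next_seq, Some(next) if seq < next) {
            return false;
        }
        self.packets.insert(seq, payload);
        true
    }

    /// Release the next packet if it is in order, or if more than
    /// `max_depth` packets are waiting behind a gap. Otherwise keep waiting.
    pub fn pop(&mut self) -> Option<(u64, T)> {
        let first = *self.packets.keys().next()?;
        let in_order = self.next_seq.map_or(true, |next| first == next);
        if in_order || self.packets.len() > self.max_depth {
            let payload = self.packets.remove(&first)?;
            self.next_seq = Some(first + 1);
            return Some((first, payload));
        }
        None
    }
}
```

A real implementation would also adapt its delay to measured network jitter and use packet timestamps to decide when to give up on a missing packet.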
I was just thinking of using GStreamer. It has first-class support for Rust and wraps battle-tested audio/video encoders, packetizers and so on. While it's not pure Rust, the API is pretty intuitive, almost all new components are being written in Rust, and even Servo uses it for their media-related stuff.
ffmpeg's libavcodec is also a contender. For work we did a thin wrapper around it: https://github.com/lookback/libavcodec (just for encoding and decoding video right now). Didn't bother to release as a crate since that would require also maintaining it. But it's a pretty neat starting point for making a good libavcodec binding.
I'm always in the camp of "as few dependencies as possible". The str0m crates I contribute to (or control) will always have this as an imperative.
I am with you on the fewest-dependencies part; IMO that's another reason to use GStreamer. While this may not be part of the str0m crate but of adjoining crates, it would depend only on GStreamer for all its multimedia needs, from encoding/decoding to media capture and display, plus it already has well-maintained Rust bindings.
I would say I am a little biased towards GStreamer because that's the one I have worked with the most, but I'm open to hearing suggestions from other folks on using any other library or implementation.
I recently looked into this project, found it quite interesting, and wanted to dig a bit deeper and contribute. I wanted to try to pick up the media encoding/decoding support. I have some experience with Pion-style APIs and GStreamer. I wanted to know what would be a good place to start from.