element-hq / element-call

Group calls powered by Matrix
https://call.element.io
GNU Affero General Public License v3.0

Feature Request: Possibility to record multitrack audio #789

Open MTRNord opened 1 year ago

MTRNord commented 1 year ago

Your use case

What would you like to do?

Being able to record the voice of each participant of the call as a separate audio track.

Why would you like to do it?

For podcasts, it is usually difficult to record calls because loudness and quality differ between participants. Having multiple tracks, one for each participant, would allow the audio to be processed per participant and leveled afterwards to achieve a higher-quality result.

Although it is not an online call, https://www.youtube.com/watch?v=RNnOHcMLscw (TWIM) is an example of audio that would be easier to improve if each participant's audio were a dedicated track. The audio there differs a lot between speakers. Using Auphonic greatly improved this through ML (compare https://matrix.to/#/!QQpfJfZvqxbCfeDgCj:matrix.org/$uFeNo3BpOYFxVrF3iqHRWCIuskpfamGLVomb2IAcwks?via=matrix.org&via=element.io&via=envs.net for an instance of this), but it comes with artifacts.

Auphonic shows examples at https://auphonic.com/audio_examples#multitrack of how multitrack recording may help to improve a podcast's quality.
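To illustrate why separate tracks help, here is a minimal sketch of per-participant leveling before mixing. It is not how Auphonic or Element Call works. It only assumes each participant's track is available as a list of float samples in [-1.0, 1.0], and the names `rms`, `level_track`, `mix` and the `target_rms` value are made up for this example.

```python
import math


def rms(samples):
    """Root-mean-square level of a track (float samples in [-1.0, 1.0])."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_track(samples, target_rms=0.1):
    """Scale a track so its RMS matches target_rms; silent tracks are returned unchanged."""
    current = rms(samples)
    if current == 0.0:
        return list(samples)
    gain = target_rms / current
    # Clamp to the valid sample range so a large gain cannot overflow it.
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


def mix(tracks):
    """Average equally long tracks into a single mono mix."""
    count = len(tracks)
    return [sum(frame) / count for frame in zip(*tracks)]


# Hypothetical per-participant tracks: one loud speaker and one quiet speaker
# (one second of a sine tone each, at an assumed 8000 Hz sample rate).
loud = [0.8 * math.sin(2 * math.pi * 220 * n / 8000) for n in range(8000)]
quiet = [0.05 * math.sin(2 * math.pi * 330 * n / 8000) for n in range(8000)]

# Each track gets its own gain, which is only possible with separate tracks.
leveled = [level_track(track) for track in (loud, quiet)]
final_mix = mix(leveled)
```

With a single mixed recording, the same gain would apply to both speakers, so the quiet one would stay quiet relative to the loud one.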

How would you like to achieve it?

A simple way would be to have a button that starts a recording (probably with other users needing to confirm) and another that ends it. At the end, you get all recordings to save. Additionally, as things sometimes break, it might be nice to retain the recordings in the call room, or in some other way, so that they can be downloaded again if the initial download failed.
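The flow described above could be sketched roughly as follows. This is only an illustration of the idea, not an existing Element Call API: the `RecordingSession` class, its method names, and the in-memory `archive` are all hypothetical, and audio chunks are modelled as plain `bytes`.

```python
class RecordingSession:
    """Hypothetical consent-gated recording with one track per participant."""

    def __init__(self, participants):
        self.participants = set(participants)
        self.confirmed = set()
        self.recording = False
        self.tracks = {}   # participant -> list of audio chunks (bytes)
        self.archive = {}  # retained copies, kept so a failed download can be retried

    def confirm(self, participant):
        """Register one participant's consent to being recorded."""
        if participant not in self.participants:
            raise ValueError(f"unknown participant: {participant}")
        self.confirmed.add(participant)

    def start(self):
        """Start recording, but only once every participant has confirmed."""
        missing = self.participants - self.confirmed
        if missing:
            raise PermissionError(f"waiting for consent from: {sorted(missing)}")
        self.tracks = {p: [] for p in self.participants}
        self.recording = True

    def push_audio(self, participant, chunk):
        """Append a chunk to that participant's own track while recording."""
        if self.recording:
            self.tracks[participant].append(chunk)

    def stop(self):
        """Stop recording and return one finished track per participant."""
        self.recording = False
        self.archive = {p: b"".join(chunks) for p, chunks in self.tracks.items()}
        return dict(self.archive)

    def redownload(self, participant):
        """Fetch a retained track again, e.g. after a failed first download."""
        return self.archive[participant]


session = RecordingSession(["alice", "bob"])
session.confirm("alice")
session.confirm("bob")
session.start()
session.push_audio("alice", b"\x01\x02")
session.push_audio("bob", b"\x03")
session.push_audio("alice", b"\x04")
files = session.stop()
```

In a real integration the retained copies would presumably live in the call room or on a server rather than in memory; that part is left open here, as in the request itself.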

Have you considered any alternatives?

There are free tools like https://studio-link.de/ that are made for podcasts and solve this via SIP.

You could also just have multiple listening devices in the call on Linux, using dummy audio outputs. However, this isn't suitable on, for example, Windows systems, where dummy devices are fairly difficult to set up.

A direct integration would be the cleanest solution to podcasting.

Additional context

This came up with TWIM and how TWIM may be improved. This would allow each participant's audio to be fixed, if needed, before uploading to create better results.

MTRNord commented 1 year ago

I had some time to work on this as a bot. I think using a bot might in general be easier.

If anyone comes by, there is https://git.nordgedanken.dev/mtrnord/matrix-call-multitrack-recorder now (not production-ready and missing parts of the handshake at the time of writing).