Mach1Studios / m1-spatialaudioserver

Backend for serving custom streaming spatial audio players and includes a frontend web client example

Streaming without Dash + why does the channel merger use channels * 2 inputs? #1

Open Avnerus opened 2 years ago

Avnerus commented 2 years ago

Hi! Thank you for open-sourcing your work. I and @splnlss @NYTimesR&D have been experimenting with the Web Audio client and were able to migrate the web client in this repo to the legacy player. Instead of preloading the audio files or using a Dash server, we can now use the client via a standard <audio> MediaElementSource with 8 channels, so streaming is done natively. Do you foresee any problems with this solution? I also have a question about the code, specifically this line:

const merger = context.createChannelMerger(channels * 2);

So if my audio file has 8 channels, this merger will have 16 input channels. However all of the connections to the merger are done via two channels in here:

gain.connect(merger, 0, position === -1 ? 0 : 1);

Could you help me understand how this works? Thank you! /Avner

himwho commented 2 years ago

I and @splnlss @NYTimesR&D have been experimenting with the Web Audio client and were able to migrate the web client in this repo to the legacy player. Instead of preloading the audio files or using a Dash server, we can now use the client via a standard <audio> MediaElementSource with 8 channels, so streaming is done natively.

This sounds like a great enhancement for the original web player example. If you are interested in opening a PR back to that example, it would be cool to show others how to feed that client player in more ways, such as a Dash stream input.

Do you foresee any problems with this solution?

I don't see any issues with that (as long as the typical channel re-ordering issues aren't hit). In fact, our goal is to open-source as many "client player" examples as we can to help everyone deploy concepts more quickly (hopefully we will make an iOS example for this Dash server public soon).

So if my audio file has 8 channels, this merger will have 16 input channels. However all of the connections to the merger are done via two channels in here

I was about to write an explanation of the steps of our Mach1Decode API, but I think what you linked to is just some leftover tests we need to clean up. We were going to add spatial playback on the admin side of the page, but decided against it and kept it as a multichannel debug analyzer...

let me clean that up and push to main

himwho commented 2 years ago

We will also add more comments and documentation. That said, in case your question still stands for the spatial playback side (clientplayer), here is the reasoning for why we double the input channels:

The general design goal of the API is to have the developer set up an agnostic spatial mixer with their native player. In this example the API is set up to handle *2 channels, which forces the developer to set up a pre-panned spatial mixer array in case the native audio library being used doesn't support updating panning as well as gain coeffs. More importantly, it keeps things very verbose and lets developers add or improve where they want (and makes it more readable for now). We were working on more examples that reduce the audioplayer array to the input size (you can view examples of this in our transcode->decode examples on iOS/commandline). With this example, though, you can create your own spatial playback pipeline with abstracted audioplayers, or implement a more inline version yourself (or one that handles buffers of audio instead of players), so that the same API with the same design works in all cases.
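For readers following along, the "pre-panned" doubling can be sketched as plain arithmetic: every input channel is played twice, once panned hard left and once hard right, and each copy gets its own gain coefficient. The helper name `mixFrameToStereo` and the interleaved coefficient order are assumptions made for this illustration, not the documented Mach1Decode output layout.

```javascript
// Hypothetical helper: mixes one multichannel sample frame down to stereo.
// `coeffs` is assumed to hold two gains per input channel, interleaved as
// [ch0 left, ch0 right, ch1 left, ch1 right, ...]. This ordering is an
// assumption for illustration only.
function mixFrameToStereo(frame, coeffs) {
  if (coeffs.length !== frame.length * 2) {
    throw new RangeError('expected two coefficients per input channel');
  }
  let left = 0;
  let right = 0;
  for (let ch = 0; ch < frame.length; ch += 1) {
    left += frame[ch] * coeffs[ch * 2];      // the copy panned hard left
    right += frame[ch] * coeffs[ch * 2 + 1]; // the copy panned hard right
  }
  return [left, right];
}
```

With 8 input channels this takes 16 coefficients, which matches the `channels * 2` count discussed above. In a player-based setup each coefficient would presumably become the gain of one pre-panned player rather than a term in a loop.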

To clarify what is happening when using Mach1Decode on an array of abstracted audioplayers:

himwho commented 2 years ago

yeah, after reviewing, it looks like we could clean up and rename some of these vars and add some comments

Avnerus commented 2 years ago

Thank you @himwho for the elaborate responses! I will look into submitting my work as a PR. Are you aiming to maintain backward compatibility with the old way of loading the buffers from separate audio files? Or would it be OK to create a new player that can handle only 8-channel files via HTMLAudioElement?
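To make the HTMLAudioElement idea concrete, here is a minimal sketch of how such a player's graph might be wired with standard Web Audio calls. It is not the code from this repo: `buildStereoGraph` and the interleaved ordering of the returned gains are hypothetical, and it assumes the browser decodes every channel of the file.

```javascript
// Sketch: route an N-channel <audio> element through per-channel left/right
// gains into a stereo output. `context` is an AudioContext and `audioElement`
// an HTMLAudioElement. `buildStereoGraph` and the gain ordering are
// hypothetical conventions for this example.
function buildStereoGraph(context, audioElement, channels) {
  const source = context.createMediaElementSource(audioElement);
  const splitter = context.createChannelSplitter(channels);
  const merger = context.createChannelMerger(2); // input 0 = left, input 1 = right
  source.connect(splitter);

  const gains = [];
  for (let ch = 0; ch < channels; ch += 1) {
    for (let side = 0; side < 2; side += 1) {
      const gain = context.createGain();
      splitter.connect(gain, ch);    // splitter output `ch` -> this gain
      gain.connect(merger, 0, side); // gain output 0 -> merger input `side`
      gains.push(gain);
    }
  }
  merger.connect(context.destination);
  return gains; // gains[2 * ch] feeds left, gains[2 * ch + 1] feeds right
}
```

Several nodes connected to the same merger input are summed, so a two-input merger is enough for stereo output here. The 16 returned gain nodes are where per-channel coefficients would be written on each orientation update.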

himwho commented 2 years ago

We will keep our API as a coeff based calculator but just make examples that show how to apply that to:

I hope that answers the question. As for this specific example, we will likely want to show how to handle all 3 cases. Right now all 3 cases have scattered examples in our Mach1 Spatial SDK, but we can always improve on this!

himwho commented 2 years ago

if you create a new player we can figure out how to best add it or worst case scenario take our lesser design and make a new example and keep things clean....