Closed Pluckerpluck closed 4 years ago
Have nothing against a stereo mode for bots. I think it makes a lot of sense.
Initial thoughts:
Could this be implemented by encoding one channel separately and appending the other to the end of the packet? Example:
| Old Mumble Clients | [header][seq number][payload (left channel)][position info][payload (right channel)] | New Mumble Clients |
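A minimal sketch of that idea, to make the layout concrete. The field sizes here (1-byte header, 2-byte sequence number, 2-byte length prefix for the left payload) are assumptions for illustration and not Mumble's real wire format, and the function names are hypothetical. It also assumes old clients simply ignore any bytes that follow the position info.

```python
import struct

# Hypothetical simplified layout (NOT Mumble's actual wire format):
#   header (1 byte) | seq (2 bytes) | left_len (2 bytes) | left payload
#   | position (3 floats) | right payload (runs to the end of the packet)
PREFIX = struct.Struct("<BHH")   # header, sequence number, left payload length
POSITION = struct.Struct("<3f")  # x, y, z


def build_packet(header, seq, left, position, right=b""):
    """Serialize one packet; `right` is appended after the position info."""
    return PREFIX.pack(header, seq, len(left)) + left + POSITION.pack(*position) + right


def parse_legacy(packet):
    """Parse as an old client would: stop after the position info."""
    header, seq, left_len = PREFIX.unpack_from(packet, 0)
    start = PREFIX.size
    left = packet[start:start + left_len]
    position = POSITION.unpack_from(packet, start + left_len)
    return header, seq, left, position


def parse_stereo(packet):
    """Parse as a new client would: trailing bytes are the right channel."""
    header, seq, left, position = parse_legacy(packet)
    right = packet[PREFIX.size + len(left) + POSITION.size:]
    return header, seq, left, position, right


packet = build_packet(0x80, 7, b"LEFT", (1.0, 2.0, 3.0), b"RIGHT")
```

Here `parse_legacy(packet)` returns only the left channel and position, while `parse_stereo(packet)` also recovers the right channel from the trailing bytes.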
@mkrautz Have nothing against a stereo mode for bots.
Is that to say regular users should not have the ability to transmit in stereo?
Bots can also use whatever bitrate they want (if the server allows it, and within the Opus range; tested at 320 kbit).
@bontibon
I think it might make sense to add a new codec type, Multichannel Opus. Perhaps each packet should include a couple of bits for the number of channels?... I dunno. I suppose it's hard to map an arbitrary number of channels correctly to people's speaker configurations? Haven't thought that one fully through. Perhaps it makes sense to simply define Stereo Opus.
Your clever solution might make sense. I haven't thought too hard about it. We would need to be able to flag to new servers and clients that there is more relevant data in the packet. Perhaps we could require these frames to always carry the 3 positional audio floats, followed by a flag byte. The flag byte, in this case, would signal that there is another channel following the flag byte.
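As a rough sketch of the flag-byte idea: the part of the frame after the first payload would be three positional floats, one flag byte, and then (if flagged) the second channel. The flag value `0x01` and the function names below are made up for illustration; nothing like this exists in the protocol today.

```python
import struct

POSITION = struct.Struct("<3f")  # x, y, z positional audio floats
FLAG_EXTRA_CHANNEL = 0x01        # hypothetical flag bit: another channel follows


def build_tail(position, extra_channel=None):
    """Serialize position + flag byte (+ optional second channel payload)."""
    flags = FLAG_EXTRA_CHANNEL if extra_channel is not None else 0
    return POSITION.pack(*position) + bytes([flags]) + (extra_channel or b"")


def parse_tail(tail):
    """Return (position, extra_channel); extra_channel is None if not flagged."""
    position = POSITION.unpack_from(tail, 0)
    flags = tail[POSITION.size]
    rest = tail[POSITION.size + 1:]
    extra = rest if flags & FLAG_EXTRA_CHANNEL else None
    return position, extra
```

Always sending the three floats costs 12 bytes per frame even for non-positional audio, which is the trade-off for giving the flag byte a fixed offset.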
However, I worry that we can't fit stereo frames in our current UDP framing. Though I haven't tested it.
We currently use UDP packets of 1024 bytes. I believe this was done in order to be conservative regarding MTU size, to avoid fragmentation of our packets -- to get the best possible latency.
So, we might need to transmit stereo frames via the control channel. That would be if we limit the general use-case to "bots".
And yeah, regarding your question on whether I think it should just be limited to bots: well, no. Not necessarily.
But if we go with the "clever" solution to stay backwards compatible, I think the packet size would be excessive for VoIP use. We'd practically double the required bandwidth. I imagine an Opus encoder configured for stereo can compress the stereo stream much better than our hack, so perhaps we're required to introduce a new codec type in order to keep things reasonable from a bandwidth perspective.
Decoding-wise we do not need any special signaling. The Opus decoder "just works" w.r.t. upmixing/downmixing between stereo and mono as needed. I asked in #opus to make sure:
03:29:35 <hacst> Can an opus decoder initialized for two channels also decode mono packets (or the other way around)? It seems like there is some channel count detection going on during decoding but I haven't found upmix / downmix code so I'm guessing no, but I could've easily missed it. Thanks.
03:36:31 <+gmaxwell> hacst: yes.
03:37:23 <+gmaxwell> the design of the codec is such that the encoder channel count and rate don't need to agree (well, multichannel is another matter)
03:38:23 <hacst> So that just works? I was hoping it would (it matches the awesome userfriendly design in other areas) but it wasn't clear to me from the documentation and I didn't find it when skimming the code (just the tracking of the "detected" vs "given" channelcount. Thanks a lot for the information.
03:38:39 <+gmaxwell> Yes, it just works.
So if we make the client audio output path stereo we wouldn't have to do anything else. Of course whether our UDP packet size is sufficient is another question. According to http://wiki.hydrogenaud.io/index.php?title=Opus 47kbps can be enough for baseline stereo audio though you really probably need 64kbps for acceptable music quality.
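A quick back-of-the-envelope check of the packet size question, assuming a constant bitrate and ignoring the header, sequence number and positional data. The 1024-byte budget is the figure mentioned earlier in this thread; the function names are just for this sketch.

```python
UDP_BUDGET_BYTES = 1024  # packet size mentioned earlier in this thread


def opus_payload_bytes(bitrate_bps, frame_ms):
    """Average encoded bytes per packet at a constant bitrate, rounded up."""
    total = bitrate_bps * frame_ms   # bits * ms / s
    return -(-total // 8000)         # ceil(total / (8 bits * 1000 ms))


def fits_budget(bitrate_bps, frame_ms):
    """True if the audio payload alone fits in the UDP budget."""
    return opus_payload_bytes(bitrate_bps, frame_ms) <= UDP_BUDGET_BYTES


for kbps, ms in [(47, 20), (64, 20), (320, 20), (320, 60)]:
    size = opus_payload_bytes(kbps * 1000, ms)
    print(f"{kbps} kbps, {ms} ms per packet: {size} bytes, fits={fits_budget(kbps * 1000, ms)}")
```

By this estimate 64 kbps stereo at 20 ms per packet is about 160 bytes, well inside the budget. Even 320 kbps fits at 20 ms (800 bytes), but not at 60 ms of audio per packet (2400 bytes).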
If there is a chicken or egg problem for testing... I can help with a functional stereo bot. Actually the mumble client downmixes it to mono while decoding opus.
Is there any update to this? Is it possible to do this now -- enable stereo transmission?
We didn't implement the feature yet, but it should be pretty easy to do so:
AudioOutputSpeech.cpp
so that multiple channels are supported.
has this been implemented yet??
No
hmmmm, any hope of getting it in the near future? mumble seems to be the best voip app we have found for some ham radio applications where stereo audio would be a big advantage. mixing kind of works, but being able to have proper stereo would be much better.
Would it make sense to make it a priority for 1.4.0 or 1.5.0? A lot of people would like to have this feature. (The ticket has 22 +1s and I know from my mumble server that several people have been waiting for stereo support for several years now).
I understand that it isn't even a very difficult task? (I sadly don't have the time to dig into this myself :( )
It always depends on how much time and motivation we have. At the moment the code parts that are responsible for the audio system are a real mess which is why it is basically no fun whatsoever to work with it.
On the bright side though @davidebeatrici is thinking about rewriting the audio code and either he directly implements stereo support (no promises though!) or someone will do it after the rewrite (probably not too long after it. As you said this is a feature a lot of folks want).
However there is no ETA for either of these, so as things are I can't promise you that this feature will be implemented by a certain time (or at all for that matter) :shrug:
it sounds like the actual change to enable stereo is rather simple, at least to hard code it with no user option. i downloaded the project and fought through trying to get pyqt5, after finding i only had 32 bit python, then uninstalling qt4 because qt5 didn't seem to be recognized, but now think i'm stuck because i only have vs2012 where it looks like i need 2013 or 2015. am i missing a doc that says how to setup the development environment??
Our current environment supports MSVC 2015, the new one will support 2019.
Ambisonics support is also interesting ...
https://people.xiph.org/~jm/opus/opus-1.3/ (headline "Ambisonics")
am i missing a doc that says how to setup the development environment??
https://wiki.mumble.info/wiki/BuildingWindows But be warned: It's a PITA with the old build env. The new one will arrive somewhat soon-ish so maybe you want to wait for that to happen...
Ambisonics support is also interesting ...
Definitely interesting but the amount of channels required for that would be comparatively big... But maybe as an optional feature it could be very interesting. I think however that this is better suited to its own issue :)
horizontal 1st order ambisonics is 3 channels, that is only one channel more than stereo :)
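For reference, the usual channel counts for ambisonics of order N are 2N+1 for horizontal-only and (N+1)^2 for full 3D. A tiny sketch (the function name is just for illustration):

```python
def ambisonic_channels(order, horizontal_only=False):
    """Number of channels needed for ambisonics of the given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return 2 * order + 1 if horizontal_only else (order + 1) ** 2
```

So horizontal first order needs 3 channels and full first order needs 4, while third order full 3D already needs 16.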
I would like this because I use mumble to route audio over the network using JACK.
I know this is unlikely to ever actually be added, but I felt the need to put in a request.
The vast majority of Teamspeak groups I've been in have, at some point, used a music bot. This lets everyone listen to the same music at the same time. You get to talk about it, comment about it, and just all round socialise together.
Similarly, we run a music bot on our mumble server. The difference being that mumble does not have support for stereo transmission, so it has to be transferred in mono which is just not as good for music.
The ability to transmit in stereo would be greatly appreciated. You could simply force it down to mono if the user is ever under the effect of positional audio.
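Forcing stereo down to mono is cheap. A minimal sketch, assuming decoded audio arrives as interleaved float samples (L0, R0, L1, R1, ...) and that averaging the two channels is an acceptable downmix; the function name is hypothetical and this is not how the client currently does it.

```python
def downmix_to_mono(interleaved):
    """Average each left/right pair of an interleaved stereo buffer."""
    if len(interleaved) % 2:
        raise ValueError("interleaved stereo buffer must have an even length")
    return [
        (interleaved[i] + interleaved[i + 1]) / 2
        for i in range(0, len(interleaved), 2)
    ]
```

The mono result could then go through the existing positional audio path unchanged, while non-positional streams keep both channels.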