Closed JohannesBrx closed 3 years ago
@JohannesBrx
Thanks for opening this issue and sorry for the late reply here. As far as I understand, you already have a prototype?
My choir also has problems with multiple microphones which are too loud/too quiet, so personally, I'd like to have a feature like this in the client, but since the server does all the mixing work, I assume it's not possible easily?
Please keep in mind that every feature could produce more latency and server load. I'm afraid that this might violate the KISS principle: https://github.com/jamulussoftware/jamulus/discussions/915#discussion-2242562
Would it help keep the server workload down by giving part of the computation to the client? Since the signal is already digitized, the computation would be equally valid if done at the client or the server.
I think the problem with this is that the client just gets the mixed signal from the server. It should know the levels though.
It should know the levels though.
It gets sent the levels so it can display them. But not very often. That's to keep the overheads low.
As far as I understand, you already have a prototype?
Yes, I will try to provide an initial pull request soon and post it here.
but since the server does all the mixing work, I assume it's not possible easily?
I would still rely on the mixing of the server and the client fader levels. To perform the adjustment, you would run a menu command once, which then adjusts the sliders of the client. (Which would normally be done manually.)
Would it help keep the server workload down by giving part of the computation to the client? Since the signal is already digitized, the computation would be equally valid if done at the client or the server.
The computational load should not be a great problem since each client would update the averaged levels every 266ms, based on the pre-computed levels from the server (single number). So the real work for computing the levels from waveform is already done on the server.
I think the problem with this is that the client just gets the mixed signal from the server. It should know the levels though.
Right, I would rely on the levels transmitted every 266ms from the server and compute an average.
It gets sent the levels so it can display them. But not very often. That's to keep the overheads low.
AFAIK, the client gets the levels every 266ms (200 frames or so, `CHANNEL_LEVEL_UPDATE_INTERVAL`). Averaging the levels will help to improve the precision.
The value sent is the average, as far as I remember. The client then smooths the display between subsequent values, too.
The value sent is the average, as far as I remember. The client then smooths the display between subsequent values, too.
The value computed by the server is a maximum value of the current peak level which slowly decays. It is sent as a 4-bit value which is displayed by the client without modification in the level meter. Its numerical range is from 0...8, which corresponds to -50dB...0dB. The value 9 is reserved for indicating clipping.
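To make the value range described above concrete, here is a minimal sketch of how such a meter value could be turned into an approximate dB figure. The helper names are hypothetical and not taken from the Jamulus sources, and the even spacing of the steps between -50dB and 0dB is an assumption.

```cpp
#include <cassert>

// Hypothetical constants, chosen to match the description in this thread.
constexpr int    kMaxMeterValue  = 8;     // top of the regular meter range
constexpr int    kClipMeterValue = 9;     // reserved value: clipping indicator
constexpr double kMinLevelDb     = -50.0; // level assumed for meter value 0

// True if the server flagged the channel as clipping.
bool IsClipping ( int meterValue ) { return meterValue == kClipMeterValue; }

// Maps a meter value to an approximate level in dB.
// Assumption: the steps 0..8 are spread evenly over -50dB..0dB.
double MeterValueToDb ( int meterValue )
{
    assert ( meterValue >= 0 && meterValue <= kClipMeterValue );

    if ( meterValue >= kMaxMeterValue )
    {
        return 0.0; // full scale or clipping
    }
    return kMinLevelDb * ( 1.0 - static_cast<double> ( meterValue ) / kMaxMeterValue );
}
```

Under that assumption one step is worth 6.25dB, which shows how coarse a single transmitted value is.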
In the meanwhile, I have committed an initial version in the following branch: https://github.com/JohannesBrx/jamulus/commits/AutoAdjustChannelFaders
The adjustment for all faders can be executed with the menu entry "Auto-Adjust all Faders" (edit menu). To leave enough dynamic range for weaker channels, I currently turn a very loud channel near clipping down to -30dB.
The limited accuracy of the server meter value (4 bit) can only be partly compensated by averaging as described above. I did a private experiment by doubling the precision of the value (8 bit per channel instead of 4 bit), which drastically improves the accuracy. (However, this would increase the data stream by approximately 0.3%, e.g., 370.1 kb/s instead of 369 kb/s for mono/medium audio quality.)
First off - I keep forgetting Volker rewrote my initial implementation and I've still not read it... So yes, that's more like what it's doing!
Second, to make looking over and understanding the change easier, could you raise a pull request against the main master branch -- you can leave it in draft status for now if you're not happy it's ready. You'll also get some automated feedback from the CI suite, if nothing else!
Finally, I'd want to have some input from others about any increase in server bandwidth. Whilst 1Kbps per client doesn't sound much to me, I don't run a large server on cloud hosting. I'm going to tag @sthenos on that point. It's things like choirs with 70-odd clients running, I'm thinking of.
I'm not sure if this is the best way of getting a well harmonized choir sound. One of the drawbacks of Jamulus if used in groups greater than around 15 singers is the individual mix. If one of the members introduces a heavy noise, every participant has to react! An algorithm wouldn't help here. The same is true, if all members are using completely different mics/audio interfaces/drivers ... An algorithm can automatically adjust the mix level, but this might not sound well.
I tried to solve this problem in a different way. Our choir with 40 singers is equipped with a very similar setup:
On the server side, the Jamulus singlemixserver branch #599 is running. This SW branch gives the first connecting client the ability to mix the sound for the following clients. Only the first client gets faders, the others won't see them. The first client is a PC/Mac-Client for mixing, the successive clients are only Raspberry clients. For our choir, this is the almost perfect solution:
Yesterday, we had a session with 35 singers without big problems. One problem with a noisy mic was temporarily solved with a single mouse click on mute. After the singer re-plugged the mic, I could unmute again (for all).
The same is true, if all members are using completely different mics/audio interfaces/drivers ... An algorithm can automatically adjust the mix level, but this might not sound well.
So what would the person do better who adjusts the sliders for the mix? How can this person deal better with different mics and audio interfaces?
I tried to solve this problem in a different way. Our choir with 40 singers is equipped with a very similar setup:
- Raspberry PI 4
- Headset Mic (FLESH NECKWORN MICROPHONE), Electret Mic
- USB-Stick Audio Interface
In total about 85 Euro per system.
While this is a very interesting setup, 3400 Euro for the whole choir is also quite a lot of money for a (hopefully) temporary solution. Especially when several singers have their own notebook or PC which could replace the Raspberry PI.
For our choir, this is the almost perfect solution:
- easy to use client for the singers (JackTrip Virtual Studio UI)
- one person is responsible for the mix, all others can concentrate on singing
The disadvantage is that individual singers are not able to create an individual mix. For example, you might want to increase the voices of your own group or lower voices which you personally consider distorting. In a real choir the sound also depends on the position within the choir -- which you cannot simulate with one mix for all singers.
The solution you described sounds like an interesting alternative concept. However, it will not be applicable for all choirs. Still you have one person who has to do the mix and could benefit from an automatic adjustment ;-)
@JohannesBrx Is this getting to a point where it can be a spec for some work, or does it need further discussion? We're trying to have Issues be an actionable to-do list. Otherwise, I can move it to discussion if you want and you can write up an agreed spec (or a PR) later?
There already is a PR: https://github.com/jamulussoftware/jamulus/pull/1071
I don't want to convince you but wanted to show an already working solution but with a different approach for larger ensembles.
@ann0see
There already is a PR: #1071
OK so I'll link this Issue to it. I see that's on the 3.7.1 milestone too. And if people want to have parallel discussions about it they can :-)
Thanks!
As a software developer and a singer in, and technician of, a large choir (60+ singers) I'm following this discussion with great interest, and I would like to air some thoughts.
I agree that getting the soundmix right for a large group is quite a hassle, and some form of automation would be welcome. I also agree with the earlier mentioned statement that the required mix will differ depending on your position in the choir (Vocal group, lead singers, etc.) So I really think automating channel faders is NOT the way to go. Most important is a consistent INPUT level for all participants.
At this moment we do a sound check before rehearsals and recordings, where every participant has to sing a particular part and he/she will get directions to adjust his/her microphone volume level. After all mic levels are set correctly everybody can use his/her own group settings and (stored) mixes for particular songs. This works really great, except for the sound check which takes a really long time (Especially with those participants who don't have a mic volume control on their interface).
So I think it would be more useful to have, first of all, a simple mic volume control on the client side, and secondly some form of automation on setting this mic volume automatically. This automatic volume control would just be turned on for a few seconds during the sound check to set the mic gain so all participants could set their levels automatically at the same time. (and since it takes just a few seconds it can now be repeated between songs, since people tend to sing louder after warming up.) Also a "Peak guard" function that ensures levels will not clip during singing (by temporarily reducing mic. gain) would be very appreciated.
I think this should be easier to implement and also would be more appreciated than an auto-fader-setting which would be the same for everybody.
Also I'm afraid the auto-fader-adjustments will be a significant load on the server side and in particular large groups are already struggling with server performance.
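For illustration, the one-shot microphone calibration suggested in this comment could be sketched roughly as follows: record a short phrase, find its peak, and derive a gain that brings the peak to a target level. Nothing here is taken from Jamulus; the function name, the 16-bit sample format, the target level and the upper gain limit are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper: returns the linear gain that would bring the loudest
// sample of a short calibration recording to targetPeakDb (dB relative to
// full scale). The result is capped at maxGain so that a nearly silent
// recording does not lead to an extreme amplification.
double CalibrationGain ( const std::vector<std::int16_t>& samples, double targetPeakDb, double maxGain )
{
    assert ( targetPeakDb <= 0.0 && maxGain > 0.0 );

    // find the absolute peak of the recording
    int peak = 0;
    for ( const std::int16_t s : samples )
    {
        peak = std::max ( peak, std::abs ( static_cast<int> ( s ) ) );
    }

    if ( peak == 0 )
    {
        return 1.0; // pure silence: leave the gain unchanged
    }

    // convert the target from dB to a linear 16-bit amplitude
    const double targetPeak = 32767.0 * std::pow ( 10.0, targetPeakDb / 20.0 );
    return std::min ( targetPeak / peak, maxGain );
}
```

A "Peak guard" as mentioned above would need to run continuously on the live signal, which this one-shot sketch does not attempt.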
I also agree with the earlier mentioned statement that the required mix will differ depending on your position in the choir (Vocal group, lead singers, etc.) So I really think automating channel faders is NOT the way to go. Most important is a consistent INPUT level for all participants.
Maybe there is a misunderstanding: The automatic channel fader adjustments are made once when you execute it -- it is not done continuously.
Automatically adjusting the faders has the advantage that it is much easier to execute -- by a single keystroke instead of a tedious setup for each client. It also compensates for different group sizes, such that the overall contribution of each group is equal.
At this moment we do a sound check before rehearsals and recordings, where every participant has to sing a particular part and he/she will get directions to adjust his/her microphone volume level. After all mic levels are set correctly everybody can use his/her own group settings and (stored) mixes for particular songs. This works really great, except for the sound check which takes a really long time (Especially with those participants who don't have a mic volume control on their interface).
You will still be able to perform your kind of setup. However, I think most of the choirs would prefer a much easier setup by using automatic channel fader adjustment.
So I think it would be more useful to have, first of all, a simple mic volume control on the client side,
This is discussed in #1030.
and secondly some form of automation on setting this mic volume automatically. This automatic volume control would just be turned on for a few seconds during the sound check to set the mic gain so all participants could set their levels automatically at the same time. (and since it takes just a few seconds it can now be repeated between songs, since people tend to sing louder after warming up.)
This would require that all clients properly execute the input level calibration, while some of them probably even need help installing Jamulus. Some of the clients may not have a software-controlled input level but a hardware knob -- you will not be able to adapt them automatically. If a synchronization is needed you would need new protocol messages on the client and server, being much more complicated than a lightweight client extension like the automatic channel fader adjustment.
I think this should be easier to implement and also would be more appreciated than an auto-fader-setting which would be the same for everybody.
The implementation for auto fader adjustment has already been done and is a lightweight addition to the code. Until now I have heard very positive feedback about the auto-fader adjustment.
Also I'm afraid the auto-fader-adjustments will be a significant load on the server side and in particular large groups are already struggling with server performance.
You are wrong and maybe did not understand the concept: The auto fader adjust adds zero load on the server side. On clients, the adjustment is only executed on demand.
Thanks for your answer. This indeed cleared up some misunderstandings, but...
I think most of the choirs would prefer a much easier setup by using automatic channel fader adjustment.
I disagree on that. Setting the correct mic level is the most important issue and always required. When some mic levels are set way too low or too high, fader adjustments won't help either. Also stored fader settings for particular songs would be of no use if mic volumes are not set correctly.
This would require that all clients properly execute the input level calibration, while some of them probably even need help installing Jamulus.
Properly executing the input level calibration would be no more than singing a single phrase while pressing a button. And those who needed help to install Jamulus will certainly also have problems adjusting the mic volume if they don't have just a knob to turn.
Some of the clients may not have a software-controlled input level but a hardware knob
Clients with a volume control knob have the least problem with setting their volume, so they would probably not need an auto function. But still "fine adjusting" the mic level in software may be a big help.
-- you will not be able to adapt them automatically.
Volume control is no more than adjusting (multiplying/dividing) the digital values sent to the server and can be done by just some simple shift and add instructions on the client side.
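As a rough sketch of what such a client-side volume control could look like: the samples are scaled by a gain factor before they are sent. This version uses a floating-point multiplication instead of shift and add instructions, and it saturates at the 16-bit limits so that an overly large gain clips instead of wrapping around. The function name and the 16-bit sample format are assumptions, not Jamulus code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: scales 16-bit samples in place by a linear gain.
// Results outside the 16-bit range are saturated rather than wrapped.
void ApplyGain ( std::vector<std::int16_t>& samples, double gain )
{
    assert ( gain >= 0.0 );

    for ( std::int16_t& s : samples )
    {
        const double scaled  = std::round ( s * gain );
        const double limited = std::max ( -32768.0, std::min ( 32767.0, scaled ) );
        s                    = static_cast<std::int16_t> ( limited );
    }
}
```

With a gain of 2.0, a sample of 1000 becomes 2000, while a sample of 20000 is limited to 32767.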
If a synchronization is needed you would need new protocol messages on the client and server, being much more complicated than a lightweight client extension like the automatic channel fader adjustment.
Input levels can be completely adjusted at client side, so I don't see any reason why any "synchronization" would be needed.
I disagree on that. Setting the correct mic level is the most important issue and always required. When some mic levels are set way too low or too high, fader adjustments won't help either. Also stored fader settings for particular songs would be of no use if mic volumes are not set correctly.
Maybe we are talking about different things here: First, setting up reasonable client microphone levels is important so that there is no clipping and the signal level is not too low. This could be accomplished by a careful manual calibration procedure like you do with your choir, or at some point be done (semi-)automatically once Jamulus has access to the input level. See #1030.
Second, the automatic fader level adjustment is about setting the faders in such way that the groups contribute equally and that each singer in a group contributes equally to his/her group.
Volume control is no more than adjusting (multiplying/dividing) the digital values sent to the server and can be done by just some simple shift and add instructions on the client side.
Or it can be done on the server, as it is currently done. Why should we change the concept here? Did you ever see a singer of a large band adjusting his microphone level on the stage? This is done by the mixer.
Input levels can be completely adjusted at client side, so I don't see any reason why any "synchronization" would be needed.
If you leave it up to the clients to perform the microphone level calibration, you will always have some clients not performing the calibration at all or doing it incorrectly.
So my conclusion is: Rough microphone level calibration is important (no clipping, not too quiet), but is not the topic of this issue and the corresponding pull request. This issue is about fine-adjusting the mixer such that all singers contribute equally.
Adjusting the channel faders may be feasible for a small number of musicians but gets quite tedious for a larger choir with more than 20 channels. An automatic channel adjustment would make the setup much easier. It also would allow the less experienced users to initialize the faders to a reasonable basic setting. The automatic adjustment could be executed during a uniform part of the music piece or during warming up the voice.
I am aware of the feature loading and saving mixer configurations. However, non-reproducible client microphone configurations between different sessions will require frequent adjustments of the mixer configuration and thus require a more or less complicated exchange of such configuration. Finally, someone currently has to manually adjust the levels.
One idea would be computing a moving average of the channel levels transmitted by the server, e.g., in `CAudioMixerBoard::SetChannelLevels`. In my prototype, I use an exponential moving average like the following: `vecAvgLevels[iChId] = (1. - alpha) * vecAvgLevels[iChId] + alpha * vecChannelLevel[i];` to compute the volume over several frames, where `vecAvgLevels` is a `CVector` with the averaged channel levels. The parameter `alpha` controls the averaging speed: With `alpha=1.0`, the channel level from the server is taken directly, while, e.g., `alpha=0.1` would mean an extended averaging. This kind of averaging would introduce almost no additional computational load and reduce the quantization of the levels as a side-effect.
The volume could then be set by a new menu entry like "Auto-Adjust all Faders". A basic approach could be setting the fader value inversely proportional to the estimated channel level. In my current version, I would set a near overdriven level to a fader value of 0.4.
If I am interpreting the sources correctly, the server level update interval should be 266.7 milliseconds. (Computed from a sample rate of 48 kHz, the `CHANNEL_LEVEL_UPDATE_INTERVAL` of 200 and a frame size of 64 samples.)
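The arithmetic behind that figure, written out as a small sketch. The constant names are hypothetical; the values are the ones quoted above, with the frame size read as a number of samples.

```cpp
// Values as quoted in this thread; the names are made up for this sketch.
constexpr int kSampleRateHz               = 48000; // 48 kHz
constexpr int kChannelLevelUpdateInterval = 200;   // frames between level updates
constexpr int kFrameSizeSamples           = 64;    // samples per frame (assumed unit)

// Time between two level updates in milliseconds: 200 * 64 / 48000 s.
constexpr double LevelUpdateIntervalMs()
{
    return 1000.0 * kChannelLevelUpdateInterval * kFrameSizeSamples / kSampleRateHz;
}

// Number of level updates a client receives per second.
constexpr double LevelUpdatesPerSecond() { return 1000.0 / LevelUpdateIntervalMs(); }
```

That is 12800 samples per update, i.e. about 266.7 milliseconds or 3.75 updates per second.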