mixxxdj / mixxx

Mixxx is Free DJ software that gives you everything you need to perform live mixes.
http://mixxx.org
Other
4.46k stars 1.27k forks source link

New waveforms have much higher visual gain than legacy ones #12448

Closed fwcd closed 9 months ago

fwcd commented 10 months ago

Bug Description

https://github.com/mixxxdj/mixxx/assets/30873659/5ea174eb-0057-4844-8fe1-4ff5d0eef974

Is this intended?

cc @m0dB

Version

2.5-alpha-164-g6acfe63a39 (HEAD)

OS

macOS 14.2

uklotzde commented 10 months ago

Same here after merging the latest changes from 2.4/main:

image

Doesn't look as before.

uklotzde commented 10 months ago

For comparison the same waveform rendered with an older 2.4 beta version:

image

If this occurs for the current 2.4 version than it should be a release blocker!

fwcd commented 10 months ago

Did a quick bisect over the main snapshots, the last good is 2c2e706b44, the first bad 0f0deb1816, here's the diff. I assume it is this PR:

m0dB commented 10 months ago

Well, either this is the bug or https://github.com/mixxxdj/mixxx/issues/12115 is the bug.

My changes were to have the colored waveforms have the same amplitude as the simple waveform. As reported in 12115 by @ronso0 , it was strange that that wasn't the case.

You will also find my analysis there that shows that the combined high-mid-low is not the same as all.

Anyway, I am fine with either approach but note that there are now two contradicting bug reports.

JoergAtGithub commented 10 months ago

My expectation as user is, that 100% amplitude is the same as the threshold for the clipping LED in the VU-Meter - accross all waveform types.

m0dB commented 10 months ago

I agree that that's the reasonable expectation. Now, why the waveform data filtered.all exceeds 100%, I have no idea. I can look into that but not today. But maybe somebody knows?

In any case, if we want the same waveform amplitude across waveform types, which I think makes sense, the current glsl waveform types are correct.

fwcd commented 10 months ago

I agree, the current waveforms look a bit "clipped", despite not extending to the full height. For example, here's Thunder by Imagine Dragons using the current GLSL waveforms and a visual gain of 1.0:

image

There's certainly compression going on, but it doesn't feel that heavy. Compare the legacy waveforms:

image

Legacy waveforms with a visual gain of 2.0:

image

The preview doesn't look as chunky as the current GLSL waveforms either:

image

m0dB commented 9 months ago

I have been investigating this and I have come to the following conclusion:

    // This gain adjustment compensates for an arbitrary /2 gain chop in
    // EnginePregain. See the comment there.
    m_gain = m_pGainControlObject->get() * 2;

I think this is wrong. Also I don't see any relevant comment in EnginePregain. So maybe this "arbitrary /2 gain chop" was fixed there, but this gain adjustment compensation was not removed?

If you agree this is indeed wrong, I will create a PR to remove this * 2 multiplication.

m0dB commented 9 months ago

Hmmm, investigating a bit more, there is definitely something strange with the overall amplitude data.

Top is combined sqrt(filtered.low^2+ filtered.mid^2+ filtered.height^2)

Middle is filtered.all (aka simple)

Bottom is lines from abs(peakR) to -(abs(peakL), with exact number of samples per line, using the recording output from mixxx.

Maybe filtered.all is not simply the peaks?

Screenshot 2023-12-25 at 18 33 48
m0dB commented 9 months ago

This is with at the top pow(filtered.all, 2.0), which seems to result in the better match than filtered.all. I am going to look at the analyser code to see if can understand why.

Screenshot 2023-12-25 at 18 58 34
m0dB commented 9 months ago

Found it:

 20 inline CSAMPLE scaleSignal(CSAMPLE invalue, FilterIndex index = FilterCount) {
 21     if (invalue == 0.0) {
 22         return 0;
 23     } else if (index == Low || index == Mid) {
 24         //return std::pow(invalue, 2 * 0.5);
 25         return invalue;
 26     } else {
 27         return std::pow(invalue, 2.0f * 0.316f);
 28     }           
 29 }                       
 30       

Does anybody know the reasoning behind this code?

m0dB commented 9 months ago

So, if I undo that scaling I get what I expect:

Screenshot 2023-12-25 at 20 21 49

I will create a PR, but it would be great if somebody with more insight in the waveform analyser and ReplayGain code code shed their light on this...

Swiftb0y commented 9 months ago

Does anybody know the reasoning behind this code?

investigating using the commit history pointed me to this: https://github.com/mixxxdj/mixxx/issues/6352

m0dB commented 9 months ago

Thanks for digging that up. So from reading those comments it looks like the pow(invalue, 2.0f * 0.316f) scaling is intentional. I think it was a bad idea to do that in the analyzer code rather than in the waveform render code. It would be much nicer to have options for the waveform rendering (e.g. scaling: "linear" "low-level-enhanced") and the code would have been less confusing too.

Anyway, I will put the unscaling code behind a condition. At least the next time someone wonders why the waveforms look differently in mixxx than in any audio editor it will be more clear.

m0dB commented 9 months ago

We still need an explanation for that strange *2 scaling to compensate for the "arbitrary /2 gain chop" though...

Swiftb0y commented 9 months ago

I think it was a bad idea to do that in the analyzer code rather than in the waveform render code.

I agree. probably was just easier back then... mixxx has always been a pile of hacks ontop of each other.

I'd highly appreciate if that was getting moved to the waveforms. making it optional could be done later. The only issue I see then is that scaling would get applied twice (once already in the cached waveform data and once in the render)...

m0dB commented 9 months ago

Well, the problem is that this would require reanalysing all tracks. Or maybe there is a versioning mechanism?

Also, maintaining all the different waveform types (or accepting they diverge) will be a pain. Post 2.4 I would really like to do a big clean-up.

Swiftb0y commented 9 months ago

Well, the problem is that this would require reanalysing all tracks. Or maybe there is a versioning mechanism?

Unfortunately not that I'm aware of (though there are heuristics somewhere the invalidate cached data such as waveforms info, though idk where). Reanalysing the waveform data is not a big deal IMO since its fairly quick and fortunately doesn't cause the other analyzers to rerun too. The problem is that I don't know how to automatically invalidate the previous waveforms.

Also, maintaining all the different waveform types (or accepting they diverge) will be a pain.

I assume you don't mean the different renderers but the cached analyzer data? If so, then I agree.

m0dB commented 9 months ago

I assume you don't mean the different renderers but the cached analyzer data? If so, then I agree.

No, I mean that if we decide to "un-scale" the filtered.all data in the waveform renderers, we need to change all of them. Unless we decide we can remove all of them :-)

Swiftb0y commented 9 months ago

yup, eventually I'd be nice to remove them. IMO since we declared them as legacy anyways, I'd propose to ignore them (feature-wise) and only work on the non-legacy ones.

Fyi, this discussion also reminds me of #11833. Maybe a more general waveform-data preprocessing facility is in order anyways...

Swiftb0y commented 9 months ago

is this resolved by #12466?

fwcd commented 9 months ago

It hasn't made its way to main yet, so I'm afraid my build isn't new enough to tell 😄 Hopefully I'll find some time to build 2.4 soon...

m0dB commented 9 months ago

Yes. Fixed.

fwcd commented 9 months ago

Sorry for being so nitpicky here... the scaling is fixed, but the new waveforms still lack a fair bit of detail. Compare GLSL:

Screenshot 2024-01-03 at 04 10 02

to GLSL (legacy):

Screenshot 2024-01-03 at 04 09 51

The peaks feel a fair bit more accentuated. I remember seeing something about scaling frequency bands differently, perhaps that is somehow related?

(Though this should probably be tracked in a separate issue, since the overall visual gain is fixed)

mxmilkiib commented 9 months ago

8406 and #8367 are related, no?

Also, why does changing ReplayGain setting levels not change waveform heights?

Thirdly; I think I mentioned in Zulip maybe previously, but it's hard to tell when a track is clipping past it's own 0dB, or whether is or will clip past the channels 0dB, aka redline, given the current gain(s), or if the waveform calculation is so broken as to chop the top off (as has been discovered and fixed, to a hopefully large degree, though the comment directly prior shows there is still some kind of discrepancy). Imho, because there is no visual 0dB delimitation (for track, or channel, or main gain, or I guess ReplayGain also); it just disappears into black. Displaying in advance when the waveform at the current gain(s) would redline the channel could be done with, say, a red line at top and bottom at those points, but that certainly deserves a separate issue - I'm just musing here because there appears a conflux of multiple things that have been making it really confusing to understand what is literally happening to the signal, and so much ambiguity added together is, to me, fairly scary.