Open Soundmatters opened 1 year ago
didn't this happen in https://github.com/amiaopensource/astataudit/commit/4bc596da05eb7cb1285207ed7381526e3c4b6b89?
No. That solved the problem of DC offset in the signal skewing the phase filter data. This issue is that amplitude offsets between channels skew the phase filter data.
In the folder of test files (linked above) there are two PNGs of two 2-channel mono recordings that illustrate the level/correlation issue. They are the same program, but one version was made with a channel offset of .6 dB and the other has a channel offset of 5 dB:
• REC0078_T2869 WITH ONLY point06 DB LEVEL OFFSET.wav.astatsaudit ; OVERALL CORRELATION VALUE = + .96
• REC0078_T2869 WITH 5 DB LEVEL OFFSET.wav.astatsaudit ; OVERALL CORRELATION VALUE = + .82
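A rough way to see why a level offset alone can pull a phase reading below +1: the sketch below assumes a meter that averages 2·L·R / (L² + R²) per sample. That formula is an assumption made for illustration, not something this thread confirms as aphasemeter's actual implementation, and the function names are hypothetical. Under that assumption, two otherwise identical channels read lower as the gain offset grows, while a normalized cross-correlation is unaffected by gain.

```python
import math

def db_to_gain(db):
    """Convert a level offset in dB to a linear amplitude factor."""
    return 10 ** (db / 20)

def amplitude_sensitive_phase(left, right):
    """Mean of 2*L*R / (L^2 + R^2) per sample (assumed meter model)."""
    values = []
    for l, r in zip(left, right):
        power = l * l + r * r
        if power > 0:  # skip digital silence to avoid dividing by zero
            values.append(2 * l * r / power)
    return sum(values) / len(values)

def normalized_cross_correlation(left, right):
    """sum(L*R) / sqrt(sum(L^2) * sum(R^2)); a constant gain cancels out."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den

# 1 kHz sine at 48 kHz, identical in both channels except for a gain offset
left = [math.sin(2 * math.pi * 1000 * n / 48000) for n in range(4800)]
for offset_db in (0.6, 5.0):
    right = [s * db_to_gain(-offset_db) for s in left]
    print(offset_db,
          round(amplitude_sensitive_phase(left, right), 3),
          round(normalized_cross_correlation(left, right), 3))
```

With these inputs the assumed meter reads close to 1 for the 0.6 dB offset and about 0.85 for the 5 dB offset, while the normalized cross-correlation stays at 1 in both cases.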
Here is an additional file which may illustrate the issue better: https://drive.google.com/drive/folders/1j0sdy_byuBzaHmrtsrKV7XdUZkm1TY0k?usp=sharing
Hi @Soundmatters, I've tested this a few ways by having loudnorm and ebur128 apply a rolling normalization before the phasemeter test, but the results are messy compared to the axcorrelation graph, which we had removed. The sample you shared seems well correlated but has amplitude differences, so I took one channel and offset some of its samples for a few minutes to force a loss of correlation. I eventually added the axcorrelation back in to test it against the other methods. From there I wanted to check back against the version of astataudit from before axcorrelation was removed, and found that it differed from my current work in progress.
Here's the graph just before axcorrelate was removed in https://github.com/amiaopensource/astataudit/commit/eaf656a7546c9c2486c38c58d802177904098608. This is on your sample with a section of audio in a single channel offset to force a correlation issue.
Perhaps this was due to the work of the preceding commits, but I see the axcorrelation graph in this commit was problematic. It shows the correlation but the x-axis is halved.
In my work in progress it looks like:
So here the axcorrelation aligns well with the phase graph. They both show the issue but the phase graph factors in amplitude whereas the axcorrelation one doesn't.
So this all has me trying to remember why the axcorrelation graph was dropped, as I haven't found anything better for plotting phase correlation without factoring in amplitude. Beyond being more accurate, it is also much faster than adding a rolling normalization step before the phasemeter analysis.
Hi @Soundmatters, I think you replied via email rather than at https://github.com/amiaopensource/astataudit/issues/10, so the image attachments didn't come through.
So the way I remember it, we dropped axcorrelation from the reporting because (aside from analysis of audio files with an amplitude offset between channels) aphasemeter was providing more accurate information; the x-axis issue that you note may very well have been part of that problem. Here are a few examples of unreliable reporting that seem unrelated to the x-axis issue, though:
1) axcorrelation was reporting values greater than 1; many instances of this in the first 5 minutes of this example
2) in the opening minute in this example, there is a 30-second Dolby A tone (uncorrelated by design) followed by a 30-second 1 k sine wave (which is well-aligned/in-phase in this example); aphasemeter reports accurately, axcorrelation does not
And here is another example of very different reporting. The aphasemeter data here tracks very closely with some commercial software that I use
Please note that the first image that I posted yesterday was not correct; it has been replaced with the correct image.
Hey @Soundmatters, with my latest branch here is a graph of the output, including a revised contextualization of axcorrelate so you can compare the before and after.
1) axcorrelation was reporting values greater than 1; many instances of this in the first 5 minutes of this example
In the new one there are no values over 1. From a skim, I see 0.999989 but no 1s.
in the opening minute in this example, there is a 30-second Dolby A tone (uncorrelated by design) followed by a 30-second 1 k sine wave (which is well-aligned/in-phase in this example); aphasemeter reports accurately, axcorrelation does not
I reran this process with -m 2 to get the first 2 minutes in their own graph. Here's that:
In the new version the aphasemeter values and axcorrelate values are roughly similar. Let me know what you think. If it seems okay, I can merge it into a new release for your testing.
Fantastic!
If you don’t think that it would slow down astataudit’s processing time too much, could we leave both the “Correlation” graphic and this updated “Normalized Cross Correlation” in for the time being? It would give us the chance to compare their analysis over large sets of data. And if you could add the same color scale to the “Normalized Cross Correlation” graphic, that might help with the comparison, though not at all essential if you are out of time. Thanks.
Hi @Soundmatters, here's the 2 minute sample with the color patterns matched.
But you're right, there is a notable speed difference. I ran this on WNYC-NSDS-1987-05-22-32942.6 0027 Show Music From France.wav and compared the last release to my current draft: it's 5:04 for the current draft and 0:33 for the last release. The axcorrelate filter offers both a fast and a slow algorithm, and I was using the slow one; with the fast one it takes 1:12 and the graph looks like this. The difference between fast and slow is detailed at https://ffmpeg.org/ffmpeg-filters.html#axcorrelate.
I should note that between the last release and the current draft, axcorrelation isn't the only addition; the spectrum was added as well.
The default/slow setting certainly seems more accurate.
I’m not sure how helpful this would be, but if dropping some of the other filter graphics would lighten the processing load, I’d say: 1) keep axcorrelate in the default/slow mode, 2) drop zero crossings for now, 3) drop the spectral analysis for now.
The correlation reporting is so important and useful that its accuracy is a primary concern for us.
Maybe, down the road, zero crossings and spectral analysis could be added as options.
Hey @Soundmatters, this is a bit of an investment for future work, but I refactored the way the graph is constructed to separate each analyzer (they were all tangled together before). This should make it a lot easier for me to scale it to add future analyzers.
So with this I can turn them on/off and benchmark. So if I only use one at a time:
astats (without reset), which is for dcoffset: 28 seconds
astats (with a reset every frame): 30 seconds
aphasemeter: 52 seconds
axcorrelate (slow): 9:42
axcorrelate (fast): 29 seconds
showspectrumpic: 34 seconds
all (with fast axcorrelation): 3:13
all (with slow axcorrelation): 9:07
These were just quick single-run tests, but obviously something was off, as the solo run of slow axcorrelation was slower than the run with all the others.
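Since single runs can be noisy, one option is to repeat each run and compare medians. A minimal sketch, assuming each analyzer run can be wrapped in a Python callable; `time_runs` and `format_mmss` are hypothetical helper names, and the workload here is only a stand-in:

```python
import statistics
import time

def time_runs(task, repeats=3):
    """Run task() several times; return (median, all timings) in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings), timings

def format_mmss(seconds):
    """Format seconds as M:SS, matching the timings quoted in this thread."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}:{secs:02d}"

# Stand-in workload. In practice, task would launch one analyzer run,
# for example through subprocess.run() with the real command line.
median, timings = time_runs(lambda: sum(range(100_000)), repeats=5)
print(format_mmss(median), [round(t, 4) for t in timings])
```

Comparing medians over a few repeats would make it easier to tell whether an outlier such as the solo slow axcorrelate run reflects the filter itself or just system load at the time.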
I'm wondering about the last graphic in the previous post, "all (with slow axcorrelation) 9:07". Is that correct? The Normalized Cross Correlation data looks inaccurate (see the uncorrelated Dolby tone in the first 30 seconds displaying as almost +1; other data also looks like it is displayed at half its value).
I'm having trouble understanding the comment. You suggest the last graphic is incorrect, but is there one here that is correct?
The Cross Correlation analysis in the two-minute examples that you posted seems accurate. I’m looking at the first 30 seconds of Dolby tone (which should be uncorrelated) followed by 30 seconds of a 1 k sine wave (which should be correlated).
In the latest graphic that you posted, “all (with slow axcorrelation) 9:07”, the Dolby tone looks almost perfectly correlated in the Cross Correlation graphic; we’d expect it to be almost 0, not almost +1.
Here's the sample set again but just with 2 minutes.
axcorrelate only, slow
axcorrelate only, fast
all filters, axcorrelate slow
all filters, axcorrelate fast
Does this still demonstrate the issue you were mentioning?
Note that I added a 'best' algorithm to the axcorrelate filter. It should be more correct than the 'fast' algorithm at a similar speed, but it may give some wrong results compared to the 'slow' one, especially when used with the float sample format. In any case, you should always use the double-precision floating-point sample format with this filter, due to the limited precision of floats, by placing aformat=dblp before axcorrelate. In that case it will almost always be more correct for 16-32 bit inputs, even with bigger window sizes.
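As a sketch of what that advice could look like in a filter chain, the helper below builds an ffmpeg `-af` string that converts to planar double before axcorrelate. The helper name `axcorrelate_chain` is hypothetical, the default size of 1024 is only an example, the 2 to 131072 bounds are the size range quoted elsewhere in this thread, and whether `algo=best` is accepted depends on the ffmpeg build in use.

```python
ALGOS = ("slow", "fast", "best")

def axcorrelate_chain(size=1024, algo="best"):
    """Build an audio filter chain: double-precision planar samples, then axcorrelate."""
    if algo not in ALGOS:
        raise ValueError(f"algo must be one of {ALGOS}")
    if not 2 <= size <= 131072:
        raise ValueError("size must be between 2 and 131072")
    # aformat=sample_fmts=dblp is the long form of aformat=dblp
    return f"aformat=sample_fmts=dblp,axcorrelate=size={size}:algo={algo}"

# Example use (input.wav is a placeholder file name):
#   ffmpeg -i input.wav -af "<chain>" -f null -
print(axcorrelate_chain(size=1024, algo="slow"))
```

Keeping the conversion and the correlation in one string makes it harder to accidentally run axcorrelate on float or integer samples when the chain is reused.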
The speed has improved a great deal, but I wasn't able to get any improved aphasemeter or axcorrelate filter reporting with the recent development build.
I don’t have the skills to do this but, if someone is willing to experiment, it might be worth pushing axcorrelate’s “size” parameter to see if we can get more accurate reporting that way. The range it allows is 2 to 131072. If I’m reading the script correctly, the filter size is set at 1024 now. I think that it would be worth experimenting with the upper end of the range (like 32768 or larger). If we can improve the accuracy that way, it might be worth sacrificing some of astataudit’s improved processing speed.
I created a new test file, found here, which I think illustrates the aphasemeter (called Correlation on the png) and axcorrelate (called Normalized Cross Correlation on the png) reporting issues better than other files that I made in the past. Here is the layout of the test file’s audio data:
Section 1
Timeline: 00:00 - 03:18 ; Nine sine waves each at ~ +1 phase, -18 dBFS (+/- .5 dBFS)
Timeline: 03:18 - 03:42 ; One Dolby A tone ~ 0 phase, -18 dBFS (+/- .5 dBFS)
Timeline: 03:42 - 10:00 ; music with stereo soundfield ~ +0.1 to 0.7 phase, variable amplitude
Timeline: 10:00 - 11:00 ; silence
Section 2
Timeline: 11:00 - 21:00 ; a repeat of the content from 00:00 - 10:00, but with a -5 dBFS offset in channel 2 only, the amplitude of channel 1 remains unchanged from Section 1
Timeline: 21:00 - 22:00 ; silence
Section 3
Timeline: 22:00 - 52:00 ; pink noise, +1 phase throughout ; channel 1 is at ~ -8 dBFS throughout, channel 2 starts with 2 minutes at ~ -8 dBFS followed by variable amplitude offsets as low as ~ -21 dBFS.
I've included the astataudit reports for the full file, with the audio file itself, in the link above. I’ve attached a detail of the png report here; it seems like the clearest illustration so far of the problem.
The aphasemeter filter (called Correlation in the png) gives results for Section 1 which appear correct and are very consistent with the metering in Adobe Audition (as well as other phase/correlation plug-ins). The amplitude offset in the audio data in Section 2 skews astataudit's reporting of the aphasemeter filter significantly, most notably in the sine waves.
The axcorrelate filter's reporting seems very consistent between Sections 1 and 2 and is not skewed by the amplitude offsets between channels, but the reporting is mostly inaccurate.
Just run the axcorrelate filter on that .wav file and send its output directly to the showwaves filter: in the Dolby A tone section its amplitude goes up and down (because it is measured per sample). Perhaps this graph picks max values instead of mean ones within each timeline window; that would be the main reason why the results are incorrect.
Hi @Soundmatters, yes @richardpl's clue helped me here. I had been plotting the max level value of the axcorrelated output.
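To illustrate why plotting the max can mislead, here is a small sketch using independent noise in each channel as a stand-in for uncorrelated material. The sliding-window correlation is a plain textbook version written for this example, not the axcorrelate implementation, and the window and bin sizes are arbitrary.

```python
import math
import random

def sliding_correlation(left, right, size):
    """Normalized cross-correlation over a sliding window, one value per position."""
    out = []
    for start in range(len(left) - size + 1):
        l = left[start:start + size]
        r = right[start:start + size]
        den = math.sqrt(sum(x * x for x in l) * sum(y * y for y in r))
        out.append(sum(x * y for x, y in zip(l, r)) / den if den else 0.0)
    return out

def summarize_bins(values, bin_size):
    """Reduce per-sample values to (max, mean, min) per plot bin."""
    rows = []
    for start in range(0, len(values), bin_size):
        chunk = values[start:start + bin_size]
        rows.append((max(chunk), sum(chunk) / len(chunk), min(chunk)))
    return rows

rng = random.Random(1)  # fixed seed so the run is repeatable
left = [rng.gauss(0, 1) for _ in range(8000)]
right = [rng.gauss(0, 1) for _ in range(8000)]
values = sliding_correlation(left, right, 64)

for mx, mean, mn in summarize_bins(values, 2000):
    print(f"max {mx:+.3f}  mean {mean:+.3f}  min {mn:+.3f}")
```

For uncorrelated input, the per-sample values swing above and below zero, so the per-bin max sits well above zero and the min well below it, while the mean stays near zero, which is the reading one would expect for such material.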
Here is the current state of the output of the aphasemeter filter alongside the current axcorrelate output which relies on the max value:
And here is that same data but plotting the DC Offset of the axcorrelate output, rather than the Max level.
And, as I was curious, here's the min level.
And what it looks like when all 3 are plotted together:
Does switching the plot of the axcorrelated data from the max value to the dc offset resolve the issue for you @Soundmatters?
It certainly seems resolved from this example. Thank you @dericed, and thanks to @richardpl for your insight.
Very interesting and helpful to see the max., mean, and min. values plotted together. Thanks for adding that example.
Note that I have made big changes in the Librempeg version of the axcorrelate filter: the math behind the slow and best modes should give more correct results than before, and should also be faster, because unnecessary float divisions (which slowed calculations and hurt precision) have been removed.
Factor out amplitude in the Correlation (via phase) filter reporting. As it stands now, level offsets between channels impact the reporting of ffmpeg’s phase filter. A test file is available here: https://drive.google.com/drive/folders/1PieIpN5w_IvTzfaRYoiPmJJCXlHyZXx8?usp=share_link