Closed ToGoOrNotToGo closed 3 years ago
The results are correct, however: 1) pitch estimation methods are not ideal themselves; 2) method parameters can be tweaked in particular cases. In most audio tasks the range [80Hz .. 400Hz] is OK. Let's start with this example:
// sinusoid 237 Hz
var signal = new SineBuilder().SetParameter("freq", 237).SampledAt(22050).OfDuration(1).Build();
var pitch1 = Pitch.FromAutoCorrelation(signal);
var pitch2 = Pitch.FromCepstrum(signal, 0, 2048, fftSize: 2048);
// pitch1: 237.0968
// pitch2: 239.6739
For larger frequencies you should increase the default high
parameter. Also, depending on the block size, the results may vary in Autocorrelation mode (you can read about it in books/Internet). Anyway, usually we process audiodata blockwise in relatively small blocks, so it's not difficult to find 'good' parameters:
// sinusoid 750 Hz
var signal = new SineBuilder().SetParameter("freq", 750).SampledAt(22050).OfDuration(1).Build();
// process block of size 2048:
var pitch1 = Pitch.FromAutoCorrelation(signal, 0, 2048, high: 1000);
var pitch2 = Pitch.FromCepstrum(signal, 0, 2048, high: 1000, fftSize: 2048);
// pitch1: 760.3448
// pitch2: 760.3448
PS. FromCepstrum
method expects the length of a signal to be not greater than the fftSize
(which is 1024 by default), so fftSize
and blockSize should be equal (I'll later commit the code that will auto-adjust these parameters).
Once again: Thank you very much for the explanation!
Maybe it is better if the default setting for low and high is full range (0 to nyquist)? And the user can minimize the range for performance reasons... or am I wrong again ;-)
No, the range [0 .. Nyquist] doesn't make much sense. In most 'pitch-estimation' cases the frequency is not very high. I don't remember where exactly I took the range [80Hz .. 400Hz], but I'd still prefer not to change it. For speech processing tasks it's just OK. Although, if there're at least couple other feature requests for changing this range, I'll do it )
Pitch.FromAutoCorrelation returns half the fundamental frequency. The result from Pitch.FromCepstrum is totally different or I dont understand the result. I have tested with a sinusoidal signal.