MTG / essentia

C++ library for audio and music analysis, description and synthesis, including Python bindings
http://essentia.upf.edu
GNU Affero General Public License v3.0

Zeros extracting chromagram with NNLSChroma and LogSpectrum #948

Closed xaviliz closed 4 years ago

xaviliz commented 4 years ago

Hi all,

I am trying to extract NNLSChroma features with the Essentia library using LogSpectrum. I get a chromagram, but something is wrong. In theory, NNLSChroma expects meanTuning as a VECTOR_REAL, whereas the meanTuning values returned by LogSpectrum, collected over all frames, form a MATRIX_REAL of size N x nBPS: one VECTOR_REAL per frame.

So I tried to sum each nBPS x 1 VECTOR_REAL, as the original NNLSChroma does, but the resulting chromagram, bassChromagram and tunedLogFreqSpectrum are filled with zeros.

I was reviewing this C++ example, but I cannot get similar results in Python: https://github.com/MTG/essentia/blob/master/src/examples/standard_nnls.cpp

Here is my code:

import numpy as np  # needed for np.mean below
from essentia import Pool, common
from essentia.standard import MonoLoader, Windowing, Spectrum, LogSpectrum, NNLSChroma, FrameGenerator

windowType = 'hann'
sampleRate = 44100
frameSize = 16384
nBPS = 3

window = Windowing(type=windowType, size=frameSize)
spectrum = Spectrum(size=frameSize)
logSpectrum = LogSpectrum(frameSize=int(frameSize/2) + 1, binsPerSemitone=nBPS, sampleRate=sampleRate)
nnlsChroma = NNLSChroma()
pool = Pool()

audio = MonoLoader(filename=filePath, sampleRate=sampleRate)()
for frame in FrameGenerator(audio, frameSize=frameSize, hopSize=hopSize, startFromZero=True):
    # use a separate name so the Spectrum algorithm is not overwritten by its output
    frameSpectrum = spectrum(window(frame))
    logFreqSpectrum, meanTuning, localTuning = logSpectrum(frameSpectrum)
    pool.add('features.logSpectrogram', logFreqSpectrum)
    pool.add('features.meanTuning', meanTuning)
    pool.add('features.localTuning', localTuning)
tunedLogfreqSpectrum, semitoneSpectrum, bassChromagram, chromagram = nnlsChroma(pool['features.logSpectrogram'], np.mean(pool['features.meanTuning'], axis=1), pool['features.localTuning'])

Could you please provide a Python example, or point out what is wrong in my code, so that I can compute these features?

Thanks in advance, XL

palonso commented 4 years ago

Hi @xaviliz,

LogSpectrum updates the average tuning internally, so using the last value is more correct than re-averaging. You can do that by setting the value in the pool (pool.set) instead of adding it (pool.add) on every frame.
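To illustrate why the last value is preferable, here is a minimal sketch in plain Python. It assumes that the internal update behaves like a cumulative mean over frames, which is an assumption made for illustration and not taken from the Essentia source. The helper name `cumulative_means` and the tuning values are hypothetical.

```python
# Hypothetical illustration: plain lists stand in for the per-frame tuning vectors.

def cumulative_means(frames):
    """Return the running (cumulative) mean vector after each frame."""
    totals = [0.0] * len(frames[0])
    means = []
    for count, frame in enumerate(frames, start=1):
        totals = [t + x for t, x in zip(totals, frame)]
        means.append([t / count for t in totals])
    return means

# Four made-up local tuning vectors with nBPS = 3 (values are illustrative only).
local_tuning = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
]
mean_tuning_per_frame = cumulative_means(local_tuning)

# Keeping only the last running mean gives the mean over all frames.
last_value = mean_tuning_per_frame[-1]

# Averaging the running means again over-weights the early frames.
re_averaged = [sum(col) / len(col) for col in zip(*mean_tuning_per_frame)]

# Averaging along axis 1 collapses each frame to a scalar: length N, not nBPS.
per_frame_scalar = [sum(row) / len(row) for row in mean_tuning_per_frame]

print(last_value, re_averaged, len(per_frame_scalar))
```

Under that assumption, the last vector has nBPS entries and matches the overall mean. Averaging the stored values along axis 1 instead gives one scalar per frame, which is not the shape NNLSChroma expects.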

I reproduced your example and also got zeros in the output. While checking our comments, I found that only the calculations without NNLS were properly tested and guaranteed to work, so I'd recommend sticking to that configuration until we figure out the problem.

Here is your code snippet with the required modifications:

from essentia import Pool, common
from essentia.standard import MonoLoader, Windowing, Spectrum, LogSpectrum, NNLSChroma, FrameGenerator

windowType = 'hann'
sampleRate = 44100
frameSize = 16384
nBPS = 3

window = Windowing(type=windowType, size=frameSize, normalized=False)
spectrum = Spectrum(size=frameSize)
logSpectrum = LogSpectrum(frameSize=int(frameSize/2) + 1, binsPerSemitone=nBPS, sampleRate=sampleRate)
nnlsChroma = NNLSChroma(frameSize=int(frameSize/2) + 1, useNNLS=False)
pool = Pool()

# filePath and hopSize are assumed to be defined beforehand
audio = MonoLoader(filename=filePath, sampleRate=sampleRate)()
for frame in FrameGenerator(audio, frameSize=frameSize, hopSize=hopSize, startFromZero=True):
    logFreqSpectrum, meanTuning, localTuning = logSpectrum(spectrum(window(frame)))
    pool.add('features.logSpectrogram', logFreqSpectrum)
    pool.add('features.localTuning', localTuning)

pool.set('features.meanTuning', meanTuning)  # use only the last value

tunedLogfreqSpectrum, semitoneSpectrum, bassChromagram, chromagram = nnlsChroma(pool['features.logSpectrogram'], pool['features.meanTuning'], pool['features.localTuning'])

xaviliz commented 4 years ago

Hi @pabloEntropia

Thanks for the clarification about the average tuning and for the code modifications.

I have been testing NNLS Chroma over the last few days and I found the same bug. When useNNLS is True, the chromagram is empty. However, when it is False, the chromagram is exactly the same as the one from the original VAMP plugin implementation.

I hope this helps.
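For anyone who wants to repeat this kind of check, here is a minimal sketch of the two tests involved: whether an output is entirely zero, and how far it deviates from a reference. The helper names `is_all_zero` and `max_abs_difference` and the sample values are hypothetical. Plain nested lists are used so the snippet runs without Essentia; real chromagrams would need to be converted first (for example with `.tolist()` on a NumPy array).

```python
def is_all_zero(matrix, tol=0.0):
    """Return True if every entry of a 2-D list is within tol of zero."""
    return all(abs(value) <= tol for row in matrix for value in row)

def max_abs_difference(a, b):
    """Return the largest element-wise absolute difference between two 2-D lists."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    return max(
        (abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)),
        default=0.0,
    )

# Made-up 2-frame, 3-bin examples standing in for real chromagrams.
reference = [[0.1, 0.5, 0.0], [0.2, 0.4, 0.3]]   # e.g. output of another implementation
candidate = [[0.1, 0.5, 0.0], [0.2, 0.4, 0.3]]   # e.g. output obtained with useNNLS=False
zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]       # e.g. the faulty output

print(is_all_zero(zeros), is_all_zero(candidate))
print(max_abs_difference(reference, candidate))
```

A maximum difference of zero corresponds to "exactly the same"; in practice a small tolerance may be more appropriate when comparing floating-point outputs from different implementations.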

palonso commented 4 years ago

It's nice to know that it works as expected without NNLS. We'll discuss a fix for the NNLS mode in #951