The problem here is not to do with the amplitude of your signal, but rather with the default parameters of the onset envelope calculation not being a good fit for your particular signal.
The beat tracker expects musical signals by default, and the onset extraction algorithm is tuned accordingly. In this particular case, it's failing because the signal has no high-frequency content (it appears to be shelved at 2 kHz), and the onset extractor works by computing a median of spectral flux across (mel) frequency bands. Since most of those bands lie above your cutoff, the aggregate is dominated by silence and you get an all-zero onset envelope.
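To make the aggregation effect concrete, here is a toy sketch in plain Python. The numbers are invented for illustration and this is not librosa's implementation; it only shows how a median across bands behaves when most bands are silent, compared with a mean.

```python
from statistics import mean, median

# Toy spectral flux (made-up values): rows are mel bands, columns are frames.
# Only the two lowest bands carry energy; the six above the cutoff are silent.
flux = [
    [0.0, 3.0, 0.0, 2.5, 0.0],  # low band, active
    [0.0, 2.0, 0.0, 2.0, 0.0],  # low band, active
] + [[0.0] * 5 for _ in range(6)]  # bands above the cutoff, silent

# Aggregate across bands (column-wise) to get one value per frame.
frames = list(zip(*flux))
median_env = [median(col) for col in frames]
mean_env = [mean(col) for col in frames]

print(median_env)  # every frame is zero: the silent bands outvote the active ones
print(mean_env)    # the peaks at frames 1 and 3 survive
```

With six of eight bands silent, the median is zero in every frame, while the mean still shows the onsets, only scaled down.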
There are two ways you could go about working around this:
M = librosa.power_to_db(librosa.feature.melspectrogram(y=audio, sr=sr, fmax=2000))
oenv = librosa.onset.onset_strength(M=M, sr=sr)
tempo, beats = librosa.beat.beat_track(onset_envelope=oenv, sr=sr)
or
M = librosa.power_to_db(librosa.feature.melspectrogram(y=audio, sr=sr))
oenv = librosa.onset.onset_strength(M=M, sr=sr, aggregate=np.mean)
tempo, beats = librosa.beat.beat_track(onset_envelope=oenv, sr=sr)
Or you could mix and match the two strategies. Either should work in your case, and using both shouldn't hurt.
Thank you for the help.
I have to admit I did not fully understand those concepts. I tried your code but oenv returned [0. 0.]
If I understood correctly,
oenv = librosa.onset.onset_strength(M=M, sr=sr)
should be
(y=M,sr=sr)
Or are there other parameters that correspond to M?
However, I tried using pitch_shift to directly raise the pitch of the original audio, and it worked! But I'm not sure how large a shift is appropriate. Is shifting the content to above 2000 Hz enough?
Specifically what I did:
M = librosa.power_to_db(librosa.feature.melspectrogram(y=audio, sr=sr, fmax=1000))
oenv = librosa.onset.onset_strength(y=M, sr=sr)
tempo1, beats1 = librosa.beat.beat_track(y=M,onset_envelope=oenv, sr=sr)
print(oenv)
print(tempo1,beats1)
#return all 0
M = librosa.power_to_db(librosa.feature.melspectrogram(y=audio, sr=sr))
oenv = librosa.onset.onset_strength(y=M, sr=sr, aggregate=np.mean)
tempo2, beats2 = librosa.beat.beat_track(onset_envelope=oenv, sr=sr)
print('-----------------------')
print(oenv)
print(tempo2,beats2)
#return all 0
print(tempo2,beats2)
audio_shifted = librosa.effects.pitch_shift(y=audio, sr=sr, n_steps=10)
tempo3, beats3 = librosa.beat.beat_track(y=audio_shifted, sr=sr, start_bpm=70)
print(tempo3,beats3)
#it worked!
By the way, this audio is actually the sound of blood flow through vessels, so there is indeed very little high-frequency sound. I was hoping beat_track would give me the patient's heart rate.
If I understood correctly,
oenv = librosa.onset.onset_strength(M=M, sr=sr)
should be (y=M, sr=sr)
Sorry, this should be S=M (you're now providing a spectrogram input, not a time-domain signal).
However, I tried using pitch_shift to directly raise the pitch of the original audio, and it worked! But I'm not sure how large a shift is appropriate. Is shifting the content to above 2000 Hz enough?
I think you'd be much better off limiting the frequency range of the spectral analysis as I described, rather than pitch-shifting your input signal. The latter might work in this case, but it's far more complicated than it needs to be, and likely introduces some artifacts (particularly around transients, which is what you're ultimately depending on for beat tracking) that could be avoided.
Yeah! It works! Now I understand almost everything. Thank you so much :D
Describe the bug
1. I have some quiet audio files with rhythmic content.
2. I try to run beat_track on them, but it returns 0. When I amplify the files in an audio editor and run beat_track again, it successfully returns beat information.
3. I try to amplify the audio directly with librosa and run beat_track on the amplified numpy.ndarray, but it still returns 0.
4. However, when I save the amplified numpy.ndarray to a file, load it back in, and run beat_track, it successfully returns beats again.
This confuses me; I don't know if I'm making a mistake or if something is wrong.
To Reproduce
Audio file: 2.zip
Example:
Expected behavior
beat_track does not work on the amplified audio, but works after saving and reloading it.