davies-w closed this issue 1 year ago
Ok, I figured it out.
(1) You can pass a 2D np array with the waveform at 22050 Hz, 16-bit samples, in place of the filename.
(2) The output appears to be a 2D array with a seconds column and a beat-count column (1, 2, 3, 4 or 1, 2, 1, 2). See the sketch below.
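For illustration, a minimal sketch of both points. It assumes process() accepts the array in place of a filename, as described above; librosa is only used to load and resample the mp3 and is not part of BeatNet.

import librosa
import numpy as np
from BeatNet.BeatNet import BeatNet

# Load and resample to 22050 Hz; librosa returns float32 in [-1, 1].
audio, sr = librosa.load("song.mp3", sr=22050, mono=True)
# Convert to 16-bit samples in a 2D array, matching the description in (1).
audio_int16 = (audio * 32767).astype(np.int16).reshape(-1, 1)

estimator = BeatNet(1, mode='offline', inference_model='DBN', plot=[], thread=False)
output = estimator.process(audio_int16)

# Per (2): each row is [time in seconds, beat count within the bar].
print(output[:8])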
Can you share the code to find beat/downbeat in mp3 file?
I'm new to this repo and don't fully understand the guide.
Thank you.
Hey @davies-w, we would love this as well!
try this
from BeatNet.BeatNet import BeatNet

samplerate = 44100  # sample rate of the audio you want sample indexes for

estimator = BeatNet(1, mode='offline', inference_model='DBN', plot=[], thread=False)
beatmap = estimator.process('path/to/song.mp3')  # Nx2 array: [time in seconds, beat count]
beatmap = beatmap[:, 0] * samplerate  # convert beat times from seconds to sample indexes
beatmap is then beat positions in samples (audio amplitude array indexes), if that is what you need. You might also want to look at madmom; it's much better at beat detection, IMO.
# Patch collections before importing madmom: newer Python versions removed these aliases.
import collections, collections.abc
collections.MutableSequence = collections.abc.MutableSequence
collections.MutableMapping = collections.abc.MutableMapping
import madmom

proc = madmom.features.beats.BeatDetectionProcessor(fps=100)
act = madmom.features.beats.RNNBeatProcessor()(madmom.audio.signal.Signal(audio_numpy_array, samplerate))
beatmap = proc(act) * samplerate  # beat positions in samples
or if you need to find hits, not beats, check this one
import madmom
import numpy

proc = madmom.features.beats.RNNBeatProcessor(post_processor=None)
predictions = proc(madmom.audio.signal.Signal(audio_numpy_array, samplerate))
mm_proc = madmom.features.beats.MultiModelSelectionProcessor(num_ref_predictions=None)
beatmap = mm_proc(predictions)
beatmap /= numpy.max(beatmap)  # normalize the activation to [0, 1]
I think this one returns the probability of each frame being a beat, so you can, for example, take all the values higher than 0.05. But I don't remember exactly. I used this one to generate osu beatmaps: https://github.com/stunlocked1/beat_manipulator/blob/main/beat_manipulator/osu.py
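As a rough sketch of that thresholding idea (assuming the activation is frame-based at madmom's default 100 fps; 0.05 is just an example cutoff):

fps = 100
frames = numpy.nonzero(beatmap > 0.05)[0]           # frames whose activation exceeds the threshold
hit_times = frames / fps                             # frame indexes -> seconds
hit_samples = (hit_times * samplerate).astype(int)   # seconds -> sample indexes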
@davies-w I am glad you figured out how to get it to work.
@stunlocked1 Thank you so much for your insight. However, please note that BeatNet and madmom serve different purposes. Madmom is mainly used for offline scenarios and doesn't support other modes such as real-time and streaming use cases. For offline applications, where real-time, online, or streaming processing is not required, BeatNet's offline mode can be used (your example, and mode 4 in the README). It employs a larger CRNN neural network than madmom's RNN, together with the same offline DBN inference as madmom.
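For anyone looking for those other modes, a sketch following the same constructor pattern as the offline example above (mode and model names as listed in the README; check it for the exact options):

from BeatNet.BeatNet import BeatNet

# Streaming from the microphone with causal particle-filtering (PF) inference.
estimator = BeatNet(1, mode='stream', inference_model='PF', plot=[], thread=False)
output = estimator.process()

# Real-time simulation on a file: still causal, but reads from disk instead of the mic.
estimator = BeatNet(1, mode='realtime', inference_model='PF', plot=[], thread=False)
output = estimator.process('path/to/song.mp3')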
For some reason, when I used BeatNet the beats were less precise, as if there was a random offset on each beat, compared to madmom, which places beats exactly at the start of the beat.
I've got bit-rot in my Colabs right now; I'll try to remember to update this with an example once I have it working again.
OK, I think this should work. Note that it is pulling from my fork of BeatNet, which I'd modified mostly just to deal with dependencies that were conflicting with other audio libraries. I have a song "born.mp3" in the top level of my Google Drive, and it copies it to the Colab space. This won't work unless you have a song with the same name in the same spot, but it's simple to modify, obviously.
https://colab.research.google.com/drive/1xWlGqFjXgi-fenVDbGEO-iCzxcsBs3xX
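Roughly, the copy step described above looks like this (a sketch only; the notebook itself is the source of truth, and the path is an assumption based on Colab's default Drive mount):

import shutil
from google.colab import drive

drive.mount('/content/drive')
shutil.copy('/content/drive/MyDrive/born.mp3', '/content/born.mp3')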
The output is an array of pairs: the first number is the timestamp, the second number is the beat count, 1, 2, 3, 4, 1, 2, 3, 4, etc. Note that sometimes it can change time and have a 3 count. Other times it can start on something other than 1 (the downbeat?). I'm not very musical, so I'm just parroting what others have told me.
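A small sketch of reading that output, assuming output is the Nx2 array returned by estimator.process() (column 0 = time in seconds, column 1 = beat count within the bar):

import numpy as np

beat_times = output[:, 0]                        # every detected beat, in seconds
downbeat_times = output[output[:, 1] == 1, 0]    # rows where the count resets to 1 (downbeats)
bpm = 60.0 / np.median(np.diff(beat_times))      # rough tempo estimate from beat spacing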
1) I don't see any examples of how to handle the case where I have a 2D numpy array of amplitudes. Everything requires me to write it to a file first, which seems suboptimal.
2) How do I "read" the output, i.e., what's in it? I watched the video, and while it's very cool, that's not something I can actually understand. I'm assuming there'd be something that says 0 secs -> x seconds, 120 bpm, 4/4 time signature, or something along those lines.