John-WL-utils / audio


Ideas #2

Open PladsElsker opened 2 years ago

PladsElsker commented 2 years ago

Running 5 seconds of audio data through the FFT algorithm is probably not enough to get a consistent bpm for every window.

We might need a window the size of half the audio length, or something like that.
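One way to see why a short window could be a problem: if the bpm is read straight off the DFT bins, with no zero-padding or peak interpolation (an assumption about the detector, not something confirmed here), the bins of a T-second window are 1/T Hz apart, which is 60/T bpm. The sketch below lists the tempos such a window can represent. The function names and the 60 to 200 bpm search range are made up for illustration.

```python
import math


def dft_bpm_step(window_seconds: float) -> float:
    """Spacing between adjacent DFT bins, expressed in bpm."""
    # Bin spacing is 1 / window_seconds Hz, and 1 Hz equals 60 bpm.
    return 60.0 / window_seconds


def representable_bpms(window_seconds: float, low: float = 60.0, high: float = 200.0):
    """Tempos in [low, high] that fall exactly on a DFT bin (hypothetical helper)."""
    step = dft_bpm_step(window_seconds)
    first_bin = math.ceil(low / step)
    last_bin = math.floor(high / step)
    return [k * step for k in range(first_bin, last_bin + 1)]


# With a 5 second window the bins are 12 bpm apart.
print(dft_bpm_step(5.0))
print(representable_bpms(5.0, low=100.0, high=140.0))
# With a 90 second window (half of a 3 minute song) they are under 1 bpm apart.
print(dft_bpm_step(90.0))
```

Under that assumption, a 125 bpm song analysed in 5-second windows would have to snap to a neighbouring bin, which might explain part of the window-to-window inconsistency.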

Then, we take the bpm computed for every window and keep only the value that was computed most often, along with any bpms within about 5% of it. Next, we can compute the head bangs. Finally, we can compute the bpm + offset that best fits the head bangs.
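The filtering step could look roughly like this. It is only a sketch: `keep_near_mode` is a hypothetical name, rounding to whole bpms before counting is an assumption (otherwise nearly equal floats would never count as "the same" bpm), and the 5% tolerance comes from the idea above.

```python
from collections import Counter


def keep_near_mode(bpms, tolerance=0.05, precision=0):
    """Return the most frequent bpm and the estimates within `tolerance` of it."""
    # Round before counting so that e.g. 119.8 and 120.1 vote for the same bpm.
    rounded = [round(b, precision) for b in bpms]
    mode_bpm, _count = Counter(rounded).most_common(1)[0]
    # Keep the original (unrounded) estimates that are close enough to the mode.
    kept = [b for b in bpms if abs(b - mode_bpm) <= tolerance * mode_bpm]
    return mode_bpm, kept


# Made-up per-window estimates, including half-tempo and double-tempo outliers.
window_bpms = [120.0, 120.0, 121.0, 60.0, 120.0, 125.0, 127.0, 240.0]
mode_bpm, kept = keep_near_mode(window_bpms)
print(mode_bpm, kept)
```

The head bang detection and the bpm + offset fit would then work only from the `kept` values.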

PladsElsker commented 2 years ago

Better idea for finding the right bpm. (Finding the right offset is going to be really hard if the detection was actually not broken, so I'm not talking about the offset for now. However, the DFT-based bpm finder is stable enough to start thinking about how to statistically analyse the bpm based on it.)

Let's say we have computed a list of bpms over the song, one per window. We need to run some sort of DBSCAN on that list to get clustered bpm data. Then, we can: