Feature: Key detection and improved beat detection

mpogue2 commented 1 year ago

There's an open source library (used by Mixxx) that can do beat tracking/detection AND key detection. It's here, and there are binaries that could be used to test the functionality before including it in SquareDesk.

https://vamp-plugins.org/plugin-doc/qm-vamp-plugins.html

It is supposedly better than SoundTouch's tempo detection, which is what we use now.

mpogue2 commented 1 year ago

Also note that the beat detection (which is not BPM detection, it's every single beat) could be used to do something very much like Rich Reel's tempo mapper (I don't know what he actually calls it), which he wrote in LabView. It does something similar to what Capstan does. Very cool, especially for older recordings that were on vinyl, where the tempo was moving around a lot.

mpogue2 commented 1 year ago

This should also allow us to make our default loops (patter) and section separation (singers) a LOT more accurate, hopefully beat accurate, or maybe even better, sample accurate (with a "nearest zero crossing" splice that has no audible glitching). For NON-default loops, we should be able to get "right on" a lot quicker, too, when in/out buttons are pressed.

mpogue2 commented 1 year ago

This thing works really really well. I tried it on a number of patter songs, and it did complete beat detection, and it sounds to me like it is 100% accurate.

To try this:

Download Sonic Visualiser
Download the Beat/Bar detector plugin, and install into /Library/Audio/Plug-ins/Vamp (gotta make that folder first by hand)
Start Sonic Visualiser
Open the song
Transform > Analysis by Plugin Name > Bar and Beat Tracker > Beats
Set Channels to use channel 1 (I don't know if this matters)
Set Tempo Hint to 125
Click OK

Then, play it back. It will click precisely on every beat.

Note that strong boom-CHUCK will detect the CHUCKS.

I can see using this to do a more precise automatic loop detector. This thing finds the measures too! So, I think we can splice directly from beat 4 near the end to beat 1 near the beginning. Should be nearly sample accurate, I would think!

Also, the BPM detector is probably way more accurate than miniBPM, given that all the beat times seem to be right on.

mpogue2 commented 1 year ago

Developer downloads are here: https://www.vamp-plugins.org/develop.html

mpogue2 commented 1 year ago

NOTE: I could also use the partially-implemented marker facility to allow for bar markers! This would also be what I wanted to implement for the Bluetooth-go-back-to-previous-section feature (used in VR to redo a sequence, but would also be good for going back 8 bars to the beginning of a section).

mpogue2 commented 1 year ago

To experiment with this code:

Go here and download the 1.8 SDK source code: https://code.soundsoftware.ac.uk/projects/qm-vamp-plugins/files
Note: it already includes the QM-DSP library code
In Makefile.osx, change "../vamp-plugin-sdk/libvamp-sdk.a" to "./lib/vamp-plugin-sdk/libvamp-sdk.a"
cd qm-vamp-plugins-1.8.0
make -f build/osx/Makefile.osx
cd lib/vamp-plugin-sdk/host
Stick a WAV file in that directory to process
run this: "./vamp-simple-host qm-vamp-plugins:qm-barbeattracker hawk.wav" to output the beats and the measures

I did this with Hawk Mountain, and I'm finding that it's detecting the chuck in the boom-chuck.

mpogue2 commented 1 year ago

mpogue@Mikes-MacBook-Pro host % ./vamp-simple-host -s qm-vamp-plugins:qm-barbeattracker hawk.wav

vamp-simple-host: Running...
Reading file: "hawk.wav", writing to standard output
Running plugin: "qm-barbeattracker"...
Using block size = 1024, step size = 512
Plugin accepts 1 -> 1 channel(s)
Sound file has 2 (will mix/augment if necessary)
Output is: "beats"
10752: 4
27136: 1
47616: 2
68608: 3
89600: 4
110592: 1
131584: 2
152576: 3
173568: 4
194048: 1
215040: 2
236544: 3
257536: 4
278528: 1 <-- TRY THIS ONE
299520: 2
...
12563456: 2
12584448: 3
12605440: 4
12626432: 1 <-- TRY THIS ONE
12647424: 2
12668416: 3
12689408: 4

I made the brain-dead simple choices above (knowing that it is finding the "chucks" not the "booms"), and the result is below. It is essentially seamless and sample accurate (to my ears at least). I think that we could interpolate the chucks to find the booms, and it might be right on?

hawk_loop.webm

mpogue2 commented 1 year ago

I'm gonna try:

start: (278528 + 299520)/2 = 289024
end: (12626432 + 12647424)/2 = 12636928

That moves from the chuck to the following boom. The result sounds really, really good to my ears:

hawk_loop2.webm
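The half-beat shift above is just the midpoint between two consecutive detected beats. A minimal sketch of that arithmetic, assuming the beat positions are already in a sorted vector of sample numbers (the function name `halfBeatAfter` is made up for illustration, it is not in SquareDesk):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: midpoint (in samples) between beat i and beat i+1.
// If the tracker has locked onto the "chucks", this lands roughly on the
// "boom" that sits between them.
std::int64_t halfBeatAfter(const std::vector<std::int64_t>& beats, std::size_t i) {
    assert(i + 1 < beats.size());  // need a following beat to interpolate
    return (beats[i] + beats[i + 1]) / 2;
}
```

With the Hawk Mountain numbers, the two beat pairs marked "TRY THIS ONE" give 289024 and 12636928, matching the hand calculation.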

SIDE NOTE: I used https://www.freeconvert.com/mp3-to-webm to convert from MP3 captured by Audio Hijack from Audacity (with sample accurate loops set as per the sample numbers above), to WEBM format, which GitHub will let me post (MP3 uploads are NOT supported by GitHub).

mpogue2 commented 1 year ago

The loop points that I used above were for beat 1.5 in measure 4 to beat 1.5 in measure 151. That's 151 - 4 = 147 measures, which is not a multiple of 8. In fact I made a mistake and my starting point was NOT on a 32-beat phrase boundary.

Let me try again.

start: (866816 + 887296)/2 = 877056 = beat 1.5 of measure 11
end: (12626432 + 12647424)/2 = 12636928 (same as before) = beat 1.5 of measure 151

This is still not an integer number of phrases, but it still sounds correct:

hawk_loop3.webm

So, at this point, I'm leaning to an "automatic snap to a beat 1.5" for START LOOP and END LOOP functions. This would be a checkbox preference, so that it could be disabled for some songs. Alternately, holding down OPTION could change the behavior back to the non-snapped version. Or, we could use OPTION to use Beat 1 instead of Beat 1.5 (this might be useful for songs that are NOT Boom-Chuck).

SUMMARY: This means that the user still has to participate in the loop process (it's not 100% automatic), but the result should be a nearly seamless join after just those 2 clicks. The Hawk Mountain loop points are essentially perfect (to my ears).

NOTE: In my browser, listening to the WEBM clips above requires that I click on the little speaker icon to UNMUTE. I don't know why GitHub defaults to audio muted...

Overall, this code is pretty impressive, at least for this patter tune.

mpogue2 commented 1 year ago

And, here's the R code I used to analyze the output of vamp-simple-host:

library(tidyverse)

rm(list=ls())
theme_set(theme_bw())

setwd("/Users/mpogue/_____BarBeatDetect/qm-vamp-plugins-1.8.0/lib/vamp-plugin-sdk/host")

a <- tibble(line = read_lines("hawk.beats.txt")) %>% 
  separate(line, c("sn", "beat"), convert = TRUE) %>% 
  mutate(measure = cumsum(ifelse(beat == 1,1,0)),
         timeInSong = sn/44100.0)

print(n=25,head(a,25))
tail(a)
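For the C++ side, a rough equivalent of the R parsing above might look like the sketch below. It assumes the "sample: beat" line format shown in the vamp-simple-host output earlier and a 44100 Hz sample rate; the `Beat` struct and `parseBeats` name are hypothetical, not existing SquareDesk code.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one detected beat.
struct Beat {
    std::int64_t sample;  // sample number reported by the tracker
    int beatInBar;        // beat number within the measure (1..4 in the output above)
    int measure;          // running count of "beat 1" lines seen so far
    double seconds;       // sample / sampleRate
};

// Parse lines like "27136: 1". Anything that does not start with
// "<integer>: <integer>" (log lines, "...") is skipped; trailing notes
// such as "<-- TRY THIS ONE" are ignored.
std::vector<Beat> parseBeats(const std::string& text, double sampleRate = 44100.0) {
    assert(sampleRate > 0.0);
    std::vector<Beat> out;
    std::istringstream in(text);
    std::string line;
    int measure = 0;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::int64_t sample = 0;
        char colon = 0;
        int beat = 0;
        if (!(fields >> sample >> colon >> beat) || colon != ':') continue;
        if (beat == 1) ++measure;  // same idea as cumsum(beat == 1) in the R code
        out.push_back({sample, beat, measure, static_cast<double>(sample) / sampleRate});
    }
    return out;
}
```

As in the R version, beats before the first "beat 1" end up in measure 0.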

mpogue2 commented 1 year ago

I wrote some more R code to do a "Rich-Reel-style tempo plot", where beats that occur at regular intervals result in a horizontal line (if we estimated the BPM correctly). Rich wrote a program that has a very nice shaded display of the amplitude of the song at each moment in time. With this kind of a plot, it's easy to see where the tempo is not constant. There is a commercial product called "Capstan" that is super expensive, that does this detection (and wow/flutter too), and then lets you correct it. I think with this code to generate a tempo map, we might be able to do essentially the same thing, correcting old recordings for wow/flutter and tempo changes.

On Hawk Mountain, using this library code to detect every beat, we get this:

tempoPlot

So, the tempo is almost exactly 126 BPM, and the tempo is not changing throughout the patter recording. (That's kinda what I expected, since this is a Riverboat song...)

The 4 points that are plotted for each X position are the 4 beats in each measure, as per the vamp beat detector. The Y value is the residual after subtracting out where the timeline should be at beat 1 of each measure, if the tempo is constant. The BPM estimate is in the title. This estimate is just 60 * (endBeats - startBeats)/(endTime - startTime), with times in seconds; it's not doing anything fancy like linear regression to determine the slope of each line.
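A sketch of the same idea in C++, under a simplifying assumption: it measures each beat against a per-beat constant-tempo grid (so a steady recording gives residuals near zero), rather than against beat 1 of each measure as the R plot does. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// End-to-end tempo estimate: beats elapsed divided by minutes elapsed.
// No regression, just first beat versus last beat.
double estimateBpm(const std::vector<double>& beatTimesSec) {
    assert(beatTimesSec.size() >= 2);
    const double span = beatTimesSec.back() - beatTimesSec.front();
    assert(span > 0.0);
    return 60.0 * static_cast<double>(beatTimesSec.size() - 1) / span;
}

// Residual (seconds) of each beat against a constant-tempo grid anchored
// at the first beat. A flat line near zero means the tempo is steady.
std::vector<double> tempoResiduals(const std::vector<double>& beatTimesSec, double bpm) {
    assert(bpm > 0.0);
    std::vector<double> r;
    r.reserve(beatTimesSec.size());
    for (std::size_t i = 0; i < beatTimesSec.size(); ++i) {
        const double expected = beatTimesSec.front() + 60.0 * static_cast<double>(i) / bpm;
        r.push_back(beatTimesSec[i] - expected);
    }
    return r;
}
```

A drifting residual would be the starting point for a tempo map; a steady recording like Hawk Mountain should stay close to zero throughout.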

mpogue2 commented 1 year ago

Capstan: https://www.celemony.com/en/capstan It's $199 for a 5-day rental license!

mpogue2 commented 1 year ago

I wonder if we used a low-pass filter on the music BEFORE we do the beat/measure detection, would it help to find the Booms instead of the Chucks? Snare drum might go away, leaving bass guitar and kick drums intact. Might be worth a try.

mpogue2 commented 1 year ago

Yeah, a Low Pass Filter of 1500Hz, 12dB per octave (using the AULowPass plugin in Audacity) worked really well.
Beat 1 as detected by vamp is now the true first beat of a measure.
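For reference, a 12 dB per octave low-pass is a second-order filter. The sketch below is a generic biquad using the well-known RBJ Audio EQ Cookbook low-pass coefficients. It is not the AULowPass implementation, and the function name and the Q default of 0.7071 are assumptions for illustration only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Generic second-order (12 dB/octave) low-pass over a mono buffer.
// Coefficients follow the RBJ Audio EQ Cookbook low-pass formulas.
std::vector<double> lowPass(const std::vector<double>& x, double cutoffHz,
                            double sampleRate = 44100.0, double q = 0.7071) {
    assert(cutoffHz > 0.0 && cutoffHz < sampleRate / 2.0 && q > 0.0);
    const double pi = 3.14159265358979323846;
    const double w0 = 2.0 * pi * cutoffHz / sampleRate;
    const double alpha = std::sin(w0) / (2.0 * q);
    const double c = std::cos(w0);
    const double a0 = 1.0 + alpha;  // used to normalize the other coefficients
    const double b0 = (1.0 - c) / 2.0 / a0;
    const double b1 = (1.0 - c) / a0;
    const double b2 = b0;
    const double a1 = -2.0 * c / a0;
    const double a2 = (1.0 - alpha) / a0;

    std::vector<double> y(x.size());
    double xm1 = 0.0, xm2 = 0.0, ym1 = 0.0, ym2 = 0.0;  // previous inputs/outputs
    for (std::size_t n = 0; n < x.size(); ++n) {
        y[n] = b0 * x[n] + b1 * xm1 + b2 * xm2 - a1 * ym1 - a2 * ym2;
        xm2 = xm1; xm1 = x[n];
        ym2 = ym1; ym1 = y[n];
    }
    return y;
}
```

The filtered copy would only be fed to the beat tracker; the loop points it yields are then applied to the original, unfiltered audio.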

./vamp-simple-host -s qm-vamp-plugins:qm-barbeattracker hawk_LPF1500.wav -o beats.lpf1500.txt

beats.lpf1500.txt:

5120: 3
16384: 4
36864: 1
57856: 2
78848: 3
99840: 4
120832: 1
141824: 2
163328: 3
183808: 4
204800: 1 <-- this one
225792: 2
246784: 3
267776: 4
...
11880960: 1
11901952: 2
11922944: 3
11943936: 4
11964928: 1 <-- this one
11985920: 2
12006912: 3
12027904: 4
...

Result: hawk_loop_LPF1500.webm

No half-beat boom-chuck compensation required! Result sounds perfect to me (NOTE: this is the low-pass filtered version, looped. In reality, we'd use the calculated loop points on the real, non-low-pass-filtered audio).

I would SO like all my loop points to be this clean!

mpogue2 commented 1 year ago

Note also: Since beat detection seems pretty reliable with this library so far, for singing calls we really just need to set the Intro point for beat 1 of the Opener. The Outro point should (in theory) be exactly 7 * 64 beats after that point. I own only one song (Ring of Fire) that violates this.

For patter, it should also be possible (in theory) to look ahead an exact number of 32-beat phrases to calculate the Outro loop point. Patter structure seems to me to be less rigid than singers structure, though.
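With a per-beat list in hand, "look ahead N beats" is just an index offset. A minimal sketch, assuming the beats are stored as a sorted vector of sample numbers and the Intro has already been snapped to one of them (both function names are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sample number of the beat that lies `beatsAhead` beats after beats[startIdx],
// or -1 if the beat list ends before that.
std::int64_t beatAhead(const std::vector<std::int64_t>& beats,
                       std::size_t startIdx, std::size_t beatsAhead) {
    assert(startIdx < beats.size());
    const std::size_t idx = startIdx + beatsAhead;
    return idx < beats.size() ? beats[idx] : -1;
}

// Singing-call convention from the discussion: Outro = Intro + 7 * 64 beats.
std::int64_t singerOutro(const std::vector<std::int64_t>& beats, std::size_t introIdx) {
    return beatAhead(beats, introIdx, 7 * 64);
}
```

For patter, the same `beatAhead` call with a multiple of 32 would give candidate Outro points, though the looser structure means the user probably still has to confirm them.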

Given this, I'm tempted to make an upload feature, where carefully chosen loop points could be uploaded to the cloud, and shared with others (matching them up by Label Number, for example). That way, it's ALMOST automatic (if you have an internet connection)... We could also make it so that you can get these precise shared loop points only if you yourself share the loop points that you've figured out. The music producers could do their entire catalog this way, and share it with all of their customers. If more than one person chooses the loop points, we could have a "vote", or just make them all available.

mpogue2 commented 1 year ago

(11964928 - 204800)/44100 = 266.6695 seconds
(266.6695 sec / 60.0 sec/min) * 126 BPM = 560 beats = 17.500 32-beat phrases

Yeah, Hawk Mountain has a 16-beat bridge (half a 32-beat phrase) right at 96 seconds. So, that explains why it's 17.5 phrases, and not 16.0 or 17.0 phrases between loop points.
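The phrase arithmetic above, written out as a small function. This is only a sketch: it assumes a constant tempo and a 44100 Hz sample rate, and the name `phrasesBetween` is made up.

```cpp
#include <cassert>
#include <cstdint>

// Number of phrases between two sample positions at a constant tempo.
double phrasesBetween(std::int64_t startSample, std::int64_t endSample, double bpm,
                      double sampleRate = 44100.0, int beatsPerPhrase = 32) {
    assert(endSample >= startSample && bpm > 0.0 && sampleRate > 0.0 && beatsPerPhrase > 0);
    const double seconds = static_cast<double>(endSample - startSample) / sampleRate;
    const double beats = seconds / 60.0 * bpm;  // minutes * beats per minute
    return beats / beatsPerPhrase;
}
```

A result that is not close to a whole number (17.5 here) is a hint that the song has a bridge or some other irregular section between the two points.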

This really suggests to me that the user has to be involved in selecting the Start and End loop points, in general. Maybe the other functions in the library can help to figure this out (Key Change or Segmenter), but I think it's easy enough for us to snap the In/Out points to beat 1 of a measure, and this will work really well 99.99% of the time.

mpogue2 commented 1 year ago

After testing out the code using Sonic Visualiser, the Key Detector doesn't seem to work that well on Hawk Mountain. The Segmenter doesn't work so well either. Neither gives me the segmenting into 32-beat phrases that I was hoping for.

So, I think the best approach is still to have the user click a button for Start Loop and End Loop.

For singing calls, we should be able to color the whole thing after clicking on just the Intro button, since we know exactly where all the beats are.

mpogue2 commented 1 year ago

qm-vamp-plugins.dylib on MacOSX is about 1.2 MB. I think that's the only thing that we need to interface to this thing. While there are pre-built binaries available for Mac, Windows, and Linux, I think we should just stick the latest 1.8 source code into our repo, and build it for each release.

mpogue2 commented 1 year ago

For UX, I'm thinking something like this:

Preferences:

[X] Snap Loop Points/Sections to Nearest {Beat|Measure} <-- pulldown menu to select granularity of snap
[X] Set Out Point Automatically for Singing Calls (Start + 7*64 beats)

Buttons: If Patter, Start Loop and End Loop buttons work as normal (both enabled). If Singer, IN button enabled. OUT button is disabled, if "Set Out Point Automatically..." is checked (hover text is "Automatic Out Point Setting is Enabled", which makes the second checkbox condition discoverable).

Internals:

Beats and measures are calculated once for each patter and singer, at LOAD time
Other types are NOT calculated, unless they are listed in Music Types preferences under "Patter" or "Singing"
Results are cached in the DB as sample numbers in text format
Instead of absolute sample numbers, only the first number will be an absolute sample number. Each subsequent number will be that beat's sample number minus the previous one (delta encoding). This drops a typical string (e.g. Hawk Mountain) from 5317 bytes down to 3898 bytes.
Furthermore, the resulting delta string can be bzip2 compressed down to 187 bytes. This records all of the beat and measure positions in the entire song in 187 bytes (e.g. for Hawk Mountain)

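Example: the raw beat list (sample number and beat number per line, as in beats.lpf1500.txt above) would first be converted to something like this (one row per measure, holding the deltas for that measure):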

1       0 5120,11264             
2       1 20480,20992,20992,20992
3       2 20992,20992,21504,20480
4       3 20992,20992,20992,20992
5       4 20992,20992,20992,20992
6       5 20992,20992,20992,20992

And then to a single string (3898 bytes), like this: [1] "5120,11264:20480,20992,20992,20992:20992,20992,21504,20480:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,21504:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,21504,20480:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,21504:20992,20480,20992,20992:20992,20992,21504,20992:20992,20480,21504,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,21504:20480,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,21504,20992,20992:20992,20480,21504,20992:20480,20992,20992,21504:20992,20992,20480,20992:21504,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,21504,20480:20992,21504,20480,20992:20992,20992,20992,20992:20992,20992,20992,21504:20480,20992,20992,20992:21504,20480,20992,20992:20992,20992,20992,21504:20992,21504,20992,20992:20992,20992,20480,20992:20992,21504,20992,20480:20992,21504,20992,20992:20480,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:21504,21504,20992,20480:20480,21504,21504,20480:20992,20992,21504,20992:20992,20480,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992
,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,21504,20992,20480:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,21504:21504,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20480,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,21504,20480:20992,20992,21504,20480:20992,21504,20992,20480:21504,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,21504,20992,20480:21504,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,21504,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:20992,21504,20992,20480:20992,20992,20992,20992:20992,20992,20992,20992:20992,20992,20992,20992:21504,20992,20992,20480:20992,20992,21504,20992:21504,20992,20480,20992:20992,20992,20992,20992:20992,20992,20992,20992:23552,23552,24064,22528:22528,22528,22528,22528:22528,22528,22528,22528:22528,22528,22528,22528" And then bzip2'ed down to a single binary 
string (187 bytes), like this:

 [1] 42 5a 68 39 31 41 59 26 53 59 df c2 93 4a 00 04 f8 18 00 00 04 7f 70 30 01 58 00 40
 [29] 34 d3 41 00 d3 4d 02 12 94 f5 27 a9 e9 73 b1 49 11 14 4d 13 41 11 34 06 95 a4 0d 28
 [57] 1a 54 a5 a1 a7 51 20 1a 11 0a 4a 51 c5 4a 94 52 00 6a 81 69 d6 87 45 01 a6 a9 7c 43
 [85] 99 c0 1f ca b9 16 5e 64 45 03 88 f7 e7 df 82 87 fa 88 8f 0e 7a 3d 42 91 89 89 62 95
[113] 16 11 59 09 50 0a 48 09 11 0a 5a 19 21 29 66 12 8e a3 cf 88 f4 44 0e 1e 2a fb b0 ae
[141] 87 a2 87 9f 4e a3 e9 dc 27 a8 b0 78 88 a0 71 11 40 c7 1b 5a 68 4d 2a 50 a1 43 49 a0
[169] 43 11 13 55 a6 c6 3e b7 e1 77 24 53 85 09 0d fc 29 34 a0

That binary string can then be cached in the DB, along with the last modified date of the song itself. If the lastModifiedDate of the song ever changes (e.g. user edits the audio file itself), the cache is invalidated, and the calculation is done again at the next LOAD time. If this happens, the loop points might now be wrong, but that's the user's problem to deal with.
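The delta encoding described above can be sketched in C++ as follows. This is an illustration of the string format only (deltas separated by "," within a measure and ":" between measures); the compression step is left out, and the names `BeatList`, `encodeDeltas` and `decodeDeltas` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// (sample number, beat-in-bar) pairs, in time order.
using BeatList = std::vector<std::pair<std::int64_t, int>>;

// First number is absolute, every later number is the difference from the
// previous beat. A new measure (beat 1) is introduced by ':' instead of ','.
std::string encodeDeltas(const BeatList& beats) {
    std::string out;
    std::int64_t prev = 0;
    for (std::size_t i = 0; i < beats.size(); ++i) {
        if (i > 0) out += (beats[i].second == 1) ? ':' : ',';
        out += std::to_string(beats[i].first - prev);
        prev = beats[i].first;
    }
    return out;
}

// Inverse for the sample numbers: a running sum over the deltas.
// (Beat-in-bar numbers are not reconstructed in this sketch.)
std::vector<std::int64_t> decodeDeltas(const std::string& s) {
    std::vector<std::int64_t> samples;
    std::int64_t total = 0, delta = 0;
    bool haveDigits = false;
    for (char ch : s) {
        if (ch >= '0' && ch <= '9') {
            delta = delta * 10 + (ch - '0');
            haveDigits = true;
        } else if (haveDigits) {  // ',' or ':' ends the current number
            total += delta;
            samples.push_back(total);
            delta = 0;
            haveDigits = false;
        }
    }
    if (haveDigits) samples.push_back(total + delta);
    return samples;
}
```

Because most deltas are one of a handful of values (20992 dominates for Hawk Mountain), the encoded string is highly repetitive, which is why a general-purpose compressor shrinks it so much.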

mpogue2 commented 1 year ago

Latest R code that explores this idea (for reference):

library(tidyverse)

rm(list=ls())
theme_set(theme_bw())

setwd("/Users/mpogue/_____BarBeatDetect/qm-vamp-plugins-1.8.0/lib/vamp-plugin-sdk/host")

# filename <- "hawk.beats.txt"
filename <- "beats.lpf1500.txt"

a <- tibble(line = read_lines(filename)) %>% 
  separate(line, c("sn", "beat"), convert = TRUE) %>% 
  mutate(measure = cumsum(ifelse(beat == 1,1,0)),
         timeInSong = sn/44100.0,
         hitnum = row_number(),
         m1 = (measure - 3) %% 8,
         phrase32 = cumsum(ifelse(m1 == 0 & lag(m1) == 7, 1, 0)))

print(n=25,head(a,25))
tail(a)

throwAway = 10 # we have to back off from the start to get into the song
b <- tail(a,-throwAway)
hit1 = b$hitnum[1]
time1 = b$timeInSong[1]
measure1 = b$measure[1]

q = 50 # we have to back off from the end to get into the song
hit2 = head(tail(b$hitnum,q),1)
time2 = head(tail(b$timeInSong,q),1)

bpm = 60.0 * (hit2 - hit1)/(time2 - time1)

d <- b %>% 
  head(-q) %>% 
  mutate(residual = timeInSong - (60.0 * (measure - measure1) * 4 / bpm))
head(d,15)

# plot the residuals
#  If it's a horizontal line, then we have calculated the BPM correctly
#  and there are no tempo changes in the middle of the song
p1 <-
  ggplot(d, aes(x = measure, y = residual)) +
  geom_point() +
  ggtitle(paste0("Rich Reel-style beat plot, assuming BPM = ", bpm))

png("tempoPlot.png", width = 640, height = 480)
  print(p1)
dev.off()

# --------------
head(a)
e <- a %>% group_by(measure) %>% summarize(s = paste(sn, collapse=","))
f <- paste(e$s, collapse=":")
str_length(f) # 5317

e2 <- a %>% mutate(sn2 = sn - lag(sn, default = 0)) %>% group_by(measure) %>% summarize(s = paste(sn2, collapse=","))
head(e2)
tail(e2)
f2 <- paste(e2$s, collapse=":")
str_length(f2) # 3898

g2 <- memCompress(f2, type="bzip2") # 187 bytes
h2 <- memDecompress(g2,asChar = TRUE)
str_length(h2)

mpogue2 commented 1 year ago

Initial commit of Part 1 of proof-of-concept: 0ce7b8a8786f73d9b60934087040d51816403fed

This test code uses an external program from the vamp distribution. Disabled by default. If enabled, we convert all the audio to mono, LPF to 1500Hz, write out the audio as WAV, and call vamp-simple-host to do beat/measure detection. Text results are sucked back in and compressed to a short string. It then decompresses to make sure that compression and decompression worked. Future parts will snap In/Out buttons to beats/measures, and beat/measure detection results will be cached in Sqlite.

NOTE: The Vamp 1.8 code is not checked into the repo yet. When it is, we need to add it as a subproject, get it to build (uses just "make", so build is pretty simple), and copy the resulting executable (from qm-vamp-plugins-1.8.0/lib/host) and associated dylibs (from qm-vamp-plugins-1.8.0/lib). Then, make sure that the vamp-simple-host uses the LOCAL copy of the dylib, so that our SquareDesk executable is entirely self-contained. Then we can reference the vamp-simple-host executable with a RELATIVE pathname, and Bob's your uncle.

Code already has fixed pathnames for WAV and RESULTS files (for debugging), and Temp File pathnames for the final executable. Temp Files are auto-deleted, while fixed pathnames for debugging are not.

I again tested the results of THIS code (as opposed to the manual vamp-simple-host testing I did earlier), and the results still sound great. I just picked some measure 1 numbers that looked about right, and made a loop in Audacity, and played that loop. It sounded seamless to me. So, I'm pretty sure that:

calling out to the vamp-simple-host code works
all the compression stuff works (compresses Hawk Mountain beats down to 232 bytes). I decided to Base64 encode the qCompress-ed beat info to a QString, to make it easier to view in the SQLite DB.
decompression works too
temp files are deleted automatically
Intermediate WAV file sounds correct (1500Hz LPF'ed using KFR)
the right beat locations are calculated by vamp

mpogue2 commented 1 year ago

Might want to switch the QVector to a std::vector, so that I can use lower_bound(), which QVector does not appear to have.

https://alexsm.com/cpp-closest-lower-bound/

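#include <algorithm>  // std::lower_bound
#include <cmath>      // std::fabs
#include <vector>

// Returns the index of the element of sorted_array closest to x
// (0 if the array is empty).
long search_closest(const std::vector<double>& sorted_array, double x) {

    // first element that is >= x
    auto iter_geq = std::lower_bound(
        sorted_array.begin(), 
        sorted_array.end(), 
        x
    );

    if (iter_geq == sorted_array.begin()) {
        return 0;
    }

    // x is larger than every element: the last one is closest
    // (dereferencing end() would be undefined behavior)
    if (iter_geq == sorted_array.end()) {
        return iter_geq - sorted_array.begin() - 1;
    }

    double a = *(iter_geq - 1);
    double b = *(iter_geq);

    if (std::fabs(x - a) < std::fabs(x - b)) {
        return iter_geq - sorted_array.begin() - 1;
    }

    return iter_geq - sorted_array.begin();

}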

And, change this function to returning the actual value, rather than the index (which I don't need).

mpogue2 commented 1 year ago

Perhaps a better UX:

Only calculate the Beats/Bars when the IN/Start Loop or OUT/End Loop buttons are pressed, and only if we haven't already calculated and cached the result (in memory) from a previous click of one of those two buttons
In the Music menu, near the Loop menu item, add "Snap to:" with submenus: "None", "Nearest Beat", "Nearest Measure" (mutually exclusive checkables, defaults to "Nearest Measure" if never set, choice is persistent)
Same place, add a checkable item: "Automatically set OUT point from IN point for singers". If that item is Checked AND a singer is loaded, change the name of the IN button to IN/(OUT). Provide hover text to tell user what's going on (IN + 7 * 64 beats). Defaults to "Checked" if never set (choice is persistent). (NOTE: this would then require that the nearest function return an index, not a value! So, the above code might be OK as-is...)
When one of the buttons is clicked, immediately record the play head's location. THEN, do beat detection (if not done already, this might take a second or two), and then snap to the nearest {Beat,Measure} (if snapping is enabled).
Do NOT store anything in the DB. In and Out points are saved to the DB as usual.
Make sure that these points are sample accurate! Since we store times and not sample numbers, we need to make sure that we have <22.6µs resolution on our saved floating point values. float only has 7 decimal digits of precision, while double has 15. We calculate intro/outro using doubles, and they are stored into the DB as "float" (which in SQLite terms is a double). However, note that the in/out spin boxes are based on QDateTimeEdit, which is based on QTime, which only has millisecond precision. FUTURE?: allow user to specify samples vs MM:SS.SSS display, so that sample accurate loop points can be set.
Hovertext on IN and OUT buttons would be a nice-to-have, especially if the hover text is context-sensitive (that is, sensitive to the current snapping modes)
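The precision concern in the list above can be checked with a small round-trip experiment: convert a sample number to seconds, store it as float or double, and convert back. This is a sketch assuming 44100 Hz audio; the function names are made up.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Convert a sample number to seconds, store it at the given precision,
// then convert back to the nearest sample number.
template <typename Real>
std::int64_t roundTrip(std::int64_t sample, double sampleRate = 44100.0) {
    assert(sampleRate > 0.0);
    const Real stored = static_cast<Real>(static_cast<double>(sample) / sampleRate);
    return std::llround(static_cast<double>(stored) * sampleRate);
}

// Count the samples in [first, last) that do not survive the round trip.
template <typename Real>
int countMismatches(std::int64_t first, std::int64_t last) {
    int bad = 0;
    for (std::int64_t s = first; s < last; ++s) {
        if (roundTrip<Real>(s) != s) ++bad;
    }
    return bad;
}
```

Around 270 seconds into a song, adjacent float values are about 30 µs apart, which is coarser than one sample (about 22.7 µs), so some sample positions cannot be recovered from a float. With double, the round trip is exact over the whole length of a song.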

OPTIONAL: to reduce glitches at loop time, ensure that the OUT point is the sample just AFTER the nearest upward-going zero crossing (assuming that it crosses zero between 2 samples), and that the IN point is the sample just AFTER the nearest upward-going zero crossing (same assumption), AND that the loop section includes the IN point and does NOT include the OUT point (so the last sample played before the loop will be the one just BEFORE the nearest upward-going zero crossing).
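A minimal sketch of the zero-crossing search, assuming a mono float buffer and treating "upward-going" as x[i-1] < 0 <= x[i]. The function name is hypothetical, and ties between an equally distant left and right crossing go to the left one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the sample just AFTER the upward zero crossing nearest to `target`,
// i.e. the nearest i with x[i-1] < 0 <= x[i].
// Returns `target` unchanged if the buffer has no upward crossing.
std::size_t nearestUpwardZeroCrossing(const std::vector<float>& x, std::size_t target) {
    assert(target < x.size());
    for (std::size_t d = 0; d < x.size(); ++d) {
        // candidate d samples to the left of target
        if (target >= d + 1 && x[target - d - 1] < 0.0f && x[target - d] >= 0.0f) {
            return target - d;
        }
        // candidate d samples to the right of target
        if (target + d < x.size() && target + d >= 1 &&
            x[target + d - 1] < 0.0f && x[target + d] >= 0.0f) {
            return target + d;
        }
    }
    return target;
}
```

Applying this to both the IN and OUT points, and playing the half-open range [IN, OUT), means the last sample before the jump is just below zero and the first sample after it is just at or above zero.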

mpogue2 commented 1 year ago

Some thoughts:

start() in the AudioDecoder should invalidate the cache, which is held only by the AudioDecoder
add a call "newTime_sec snapToClosest(time_sec, none|beat|measure)", which is called by IN/OUT buttons
this way we don't have to manually trigger a beat map, and we also don't have to pass it all the way up to mainWindow either
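One possible shape for the proposed snapToClosest() call, returning the snapped time rather than an index. This is a sketch only: the `Granularity` enum and the idea of passing the beat and measure maps in as sorted vectors of times (in seconds) are assumptions, not the actual AudioDecoder interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

enum class Granularity { None, Beat, Measure };

// Returns timeSec snapped to the nearest entry of the chosen map, or
// timeSec itself if snapping is off or the map is empty.
double snapToClosest(double timeSec, Granularity g,
                     const std::vector<double>& beatMap,
                     const std::vector<double>& measureMap) {
    if (g == Granularity::None) return timeSec;
    const std::vector<double>& grid = (g == Granularity::Beat) ? beatMap : measureMap;
    if (grid.empty()) return timeSec;
    assert(std::is_sorted(grid.begin(), grid.end()));

    // first grid time >= timeSec
    auto geq = std::lower_bound(grid.begin(), grid.end(), timeSec);
    if (geq == grid.begin()) return grid.front();
    if (geq == grid.end()) return grid.back();

    const double below = *(geq - 1);
    return (std::fabs(timeSec - below) < std::fabs(timeSec - *geq)) ? below : *geq;
}
```

If the automatic Outro feature for singers is added later, an index-returning variant would be more convenient, since the Outro is defined as a beat count from the Intro.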

mpogue2 commented 1 year ago

OK, I've got it basically working, with a fixed granularity (MEASURE). Things I've noticed so far:

THIS WORKS SO GOOD. I've tried it on Hawk Mountain, Hot and Spicy, and Royal Hawkeye, and it was RIGHT ON EVERY TIME. This will save me SO much time. Two clicks, and the loop for a Patter is done!
Tried it with a singer: Somewhere Over the Rainbow, and it was right on, BUT because the SeekBar only updates once per second, I didn't get the red text "Opener" until 1 second after the start of the Opener section. For that reason, we might want to artificially advance the Intro and Outro points for singers by 1 second.
There is a noticeable pause in the UX (e.g. the SeekBar stops moving) when I click IN/OUT for the first time, because it's a synchronous call, and it takes a couple seconds. Subsequent clicks on the button don't cause a UX pause (because the results are in the one-element cache). I'm not sure how to fix this, maybe I do have to make it asynchronous, but then the update of the Intro/Outro fields will be delayed a couple of seconds.
Alternately, I could do this at Load time asynchronously, because most people are NOT playing with the Intro/Outro buttons right after a song is loaded. This seems like it might be the best user experience to me. There is still a CPU penalty at Load Time (if and only if Snapping is enabled), but the UX is essentially not paused, and so everything should seem pretty normal -- most users won't notice.

mpogue2 commented 1 year ago

Part 2 commit: 5dacea576e605fed23779d847591895e3d3ac57c

This commit adds the snapping code. It's disabled right now, controlled by a #define called TEST_GRANULARITY. But, it's plumbed all the way down. There is still a UX pause the first time a button is clicked, but I'll work on that next. Works really good so far! Still requires Vamp be externally compiled, and the path to it is hard-coded right now. But, this basically demonstrates that the concept actually works for real.

TODO:

UX for snapping {NONE, BEAT, MEASURE}
eliminate the UX pause
bring the Vamp code into our repo, so our repo is again self-contained (OK to always compile it for X86, since this will run fine as an external process on either Apple Silicon or X86)

OPTIONAL: set Outro automatically for singing calls (this involves copying some of the code from on_pushButtonSetOutroTime_clicked to on_pushButtonSetIntroTime_clicked), or factoring it out...

mpogue2 commented 1 year ago

Commit: 6d3fe9089007adfca9b0e6ef1d8e28bcbeb97822

UX is in. The Music menu now has "Snap Loop Points" with three submenu choices: "Disabled|Nearest Beat|Nearest Measure". This choice is persistent. I tested it with several more songs, and it's working perfectly so far.

Commit: 8a812b90b58bc70e495013cafe60f434a1f85719

.pro file now copies in the vamp executable and library from a hard-coded path. If it's not found in the .app file at IN/OUT button time, and snapping is enabled, an error dialog will appear, and snapping will not be done (be sure to turn it off in the Music > Snap Loop Points menu in this case). Instructions for manual Vamp build are in AudioDecoder.cpp.

NOTE: The default in the .pro file is to NOT copy in the executable and .dylib files, because right now it's a manual build (with some minor code modifications needed first). If you have done a VAMP build, and you want the copy to happen, uncomment those 8 lines in the .pro file. I'll uncomment them later when VAMP is added to the repo.

mpogue2 commented 1 year ago

Commit: 2325f2c24b523d0a3c05b8a34d84d00ae53c5bb4

Now defaults to Snap Loop Points DISABLED. In case you built SquareDesk yourself, but you did NOT build Vamp, this is the right default for you. When you DO decide to build Vamp, as per instructions in AudioDecoder.cpp, and you enable the COPY section of test123.pro, then you can manually set Snap Loop Points to Nearest Measure, and it will be remembered. This is a one-time thing you need to do, once you've successfully included Vamp in the SquareDesk.app file.

Also added "QM Vamp" to the About box, to give credit where credit is due. Beat and Bar detection works great so far!

mpogue2 commented 1 year ago

Commit: 31c24e21d518c2f9cce7906d0e08755dadba807f

Now the WAV and RESULTS files that are used for beat detection are true temp files, and they are deleted automatically after use. I had this code set up for debugging, so it would have tried to stick those temp files into /Users/mpogue (which doesn't exist on most people's machines... :-) .

mpogue2 commented 1 year ago

If we ever want to compile vamp for ARM, this page might be helpful: https://stackoverflow.com/questions/19439407/how-to-compile-vamp-plugins-to-ios-armv7

mpogue2 commented 1 year ago

TODO:

eliminate the UX pause on first IN/OUT button press. I think the best way to do this is to do beat/bar detection at load time (my original thinking), BUT, do it asynchronously, so that it doesn't pause the UX like it does today. It's very unlikely that anybody would think to load a song and then instantly switch to the Cuesheets page and click on a button BEFORE the beatMap and measureMap are done. Although I can live with this pause right now (the benefits of beat detection far outweigh the cost), the pause is annoying.
slurp the Vamp source code (v1.8.0) into our repo, similar to taglib. I suppose there's enough info here for people to do this on their own, but it's probably somewhat painful.

LATER:

should I implement the singing call optimization (OUTRO auto-set to INTRO + 7 * 64 beats) now or later? It just saves about 30 seconds of work, so I'm inclined to leave this feature for later. I think that the UX might be a little tricky too. I'll have to think about that more.
fancy zero crossing splices (leave this for a later feature, too)
Allow spin boxes for intro/outro to be switched to long int's (sample numbers, rather than mm:ss.sss). The current spin boxes are NOT sample-accurate.
Do a beat plot, like Capstan and Rich Reel's app. For extra credit, use that result as a tempo map to do tempo correction.
Beat detection for Windows and Linux [Dan?]

mpogue2 commented 1 year ago

Commit: af2c3604e67f344448efcd58214aba0e2a5bbd52

I moved the VAMP part of beat detection to be asynchronous. The only synchronous part now is the writing of the mono WAV file. It's also clever about running Beat detection only if Snap is ON (either Nearest Beat or Nearest Measure). It's even smart enough to initiate beat detection when you switch modes AFTER a song has been loaded. This adds about 0.5 sec to the load time, which makes it slightly slower. But, the bulk of the time (VAMP) is about 1.6 seconds, and is asynchronous. There should be no added load time if beat detection is disabled.

OPTIMIZATION: Perhaps I can make the writing of the mono WAV file also asynchronous (on-demand) sometime in the future. That would save about 500ms of visible UX delay when Beat Detection is ON. Right now, the async part is done in a QProcess. I think I'd need to move to an async QThread that writes the WAV file and then invokes the async QProcess.

mpogue2 / SquareDesk

Feature: Key detection and improved beat detection #806