bmcfee / ismir2017_chords

ISMIR 2017: structured training for large vocab chord recognition
BSD 2-Clause "Simplified" License

The augmentation engine shifts are specified as [-6, 6]; they should be [-6, 5] #5

Open cloudscapes opened 4 years ago

cloudscapes commented 4 years ago

Hi there!

In the 00 - Augmentation.ipynb file, you create the augmentation engine with the shifts n_semitones=[-1, 1, -2, 2, -3, 3, -4, 4, -5, 5, -6, 6]. Say our audio file (the song) is in the key of C major. Following the shifts, we get B (-1), Db (+1), Bb (-2), D (+2), A (-3), Eb (+3), Ab (-4), E (+4), G (-5), F (+5), Gb (-6), and Gb (+6). As we can see, Gb appears twice, once for -6 and once for +6, because the two shifts are a full octave (12 semitones) apart. The shifts should be n_semitones = [-1, 1, -2, 2, -3, 3, -4, 4, -5, 5, -6]. In the Crema package (https://github.com/bmcfee/crema/blob/master/training/chords/00-setup.py), the shifts are correctly specified as -6 to +5 semitones. Is it correct to assume that the number of semitone shifts here should indeed be n_semitones = [-1, 1, -2, 2, -3, 3, -4, 4, -5, 5, -6]?

So every file should produce 11 unique deformations and, with the original file added, we should have 12 files and not 13?
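The duplicate can be checked by reducing each shift modulo 12. Below is a minimal sketch in plain Python. It does not use the notebook's augmentation library, and the names NOTE_NAMES, transposed_roots, and duplicate_roots are hypothetical helpers introduced only for illustration.

```python
from collections import Counter

# Pitch-class names spelled with flats, matching the spelling used above.
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]


def transposed_roots(shifts, root=0):
    """Map each semitone shift to the resulting pitch class (0 = C)."""
    return [(root + s) % 12 for s in shifts]


def duplicate_roots(shifts, root=0):
    """Return the names of pitch classes reached more than once,
    counting the unshifted original (shift 0) as well."""
    counts = Counter(transposed_roots([0] + list(shifts), root))
    return [NOTE_NAMES[pc] for pc, n in sorted(counts.items()) if n > 1]


paper_shifts = [-1, 1, -2, 2, -3, 3, -4, 4, -5, 5, -6, 6]
fixed_shifts = [-1, 1, -2, 2, -3, 3, -4, 4, -5, 5, -6]

print(duplicate_roots(paper_shifts))  # ['Gb']
print(duplicate_roots(fixed_shifts))  # []
print(len(set(transposed_roots([0] + fixed_shifts))))  # 12
```

With the -6 to +5 range, the original plus 11 deformations cover each of the 12 transpositions exactly once.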

best regards, H

bmcfee commented 4 years ago

Is it correct to assume that the number of semitone shifts here should indeed be n_semitones = [-1, 1, -2, 2, -3, 3, -4, 4, -5, 5, -6]?

Yes, that's correct. In the version for the paper, we had ±6, and we later corrected it to -6/+5. However, we didn't observe any significant change in performance, so any bias from the extra tritone representation didn't appear to do much damage.
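To put a rough number on the bias mentioned here, the sketch below computes what share of one file's augmented copies lands on each transposition. It assumes every deformation contributes exactly one equally weighted copy, which is an assumption about the training setup and not something stated in this thread. The function name transposition_shares is hypothetical.

```python
from collections import Counter
from fractions import Fraction


def transposition_shares(shifts):
    """Share of one file's copies (original included) that lands on
    each transposition, with shifts reduced modulo 12."""
    all_shifts = [0] + list(shifts)
    counts = Counter(s % 12 for s in all_shifts)
    return {pc: Fraction(n, len(all_shifts)) for pc, n in counts.items()}


paper = transposition_shares([-1, 1, -2, 2, -3, 3, -4, 4, -5, 5, -6, 6])
fixed = transposition_shares([-1, 1, -2, 2, -3, 3, -4, 4, -5, 5, -6])

# Key 6 is the tritone transposition; key 0 is the untransposed original.
print(paper[6], paper[0])  # 2/13 1/13
print(fixed[6], fixed[0])  # 1/12 1/12
```

Under that assumption, the tritone transposition makes up 2/13 of the copies with ±6 and 1/12 with -6/+5, while every other transposition makes up 1/13 or 1/12 respectively.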

cloudscapes commented 4 years ago

Thank you, Brian! I appreciate your quick response!

so any bias from the extra tritone representation didn't appear to do much damage

That doesn't surprise me, though!

I'm currently testing the whole codebase. My findings might be useful to other people and save them some time, so I will leave some of these issues open here.

Best regards, H