DATX02-20-04 / code

Increasing spectrogram size #59

Open christoffer-arvidsson opened 4 years ago

christoffer-arvidsson commented 4 years ago

I've theorized that images 128 pixels in height are not enough to accurately represent all frequencies, leading to a dissonant sound. If we tuned the mel spectrogram processing to create images of size 128x256 or even 128x512, we would have a lot more accuracy in which frequencies are actually present in the tone. It might also thin out the horizontal lines, leading to less muddy tones.
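To put rough numbers on the idea, here is a minimal sketch that estimates how far apart adjacent mel band centres are (in Hz) near a given pitch for different band counts. It is not the project's code. It assumes the larger image dimension corresponds to the number of mel bands, the common HTK-style mel formula, and a 22050 Hz sample rate (so `fmax=11025.0`). The helper name `mel_center_spacing` is made up for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mel using the HTK-style formula."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_spacing(n_mels, f_hz, fmin=0.0, fmax=11025.0):
    """Approximate spacing in Hz between adjacent mel band centres near f_hz.

    fmax=11025.0 assumes a 22050 Hz sample rate (an assumption, not taken
    from the project).
    """
    # n_mels triangular filters are defined by n_mels + 2 points that are
    # equally spaced on the mel axis, giving n_mels + 1 equal steps.
    mel_step = (hz_to_mel(fmax) - hz_to_mel(fmin)) / (n_mels + 1)
    mel = hz_to_mel(f_hz)
    # Convert one mel step, centred on f_hz, back to a width in Hz.
    return mel_to_hz(mel + mel_step / 2) - mel_to_hz(mel - mel_step / 2)


if __name__ == "__main__":
    # Compare the band counts discussed above around A4 (440 Hz).
    for n_mels in (128, 256, 512):
        spacing = mel_center_spacing(n_mels, 440.0)
        print(f"{n_mels} mel bands: ~{spacing:.2f} Hz between band centres near 440 Hz")
```

Under these assumptions, each doubling of the band count roughly halves the spacing. Extra mel bands can only help if the underlying FFT is fine enough, though, since the FFT bin spacing is the sample rate divided by the FFT size.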

Of course, this means larger models to train, but it'd be an interesting experiment.

christoffer-arvidsson commented 4 years ago

Currently experimenting with spec sizes of (128, 256). We'll see how it goes.