tsurumeso / vocal-remover

Vocal Remover using Deep Neural Networks
MIT License
1.47k stars, 215 forks

How to train on mono dataset? Is that possible? #78

Closed: MaxGodTier closed this issue 3 years ago

MaxGodTier commented 3 years ago

I've read in a previous issue that training requires a stereo dataset, but when I convert my dataset from mono to stereo it takes twice as much disk space as before, and the preprocessed data can take 5 times that or more. Considering that the L and R tracks are perfectly identical, is that procedure truly necessary? If stereo is a must and cannot be helped, can't the training procedure pretend that L is R? (i.e. audioL = 'file.wav', audioR = audioL)

My dataset is 200 GB of 22 kHz mono WAV files. Converting to stereo takes 400 GB, and converting to 44 kHz takes 800 GB (I'm AI upscaling). At 30% preprocessing my 2 TB fast SSD ran out of space completely. Next time I'll skip the 44 kHz conversion and try to train directly at 22 kHz, but I'm afraid I still won't have enough space. I can use a 4 TB mechanical drive, but I suspect training will be slower (still better than nothing).

Do you think training on mono, or some other optimization, is possible at all? That could save a HUGE amount of disk space.

tsurumeso commented 3 years ago

Sorry, training on a mono dataset is not supported.
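
For readers with the same problem, the "pretend that L is R" idea from the question can be sketched as an in-memory duplication done when the audio is loaded, so the stereo copies never have to be written to disk. This is only an illustration, not part of vocal-remover: the helper name `mono_to_stereo` is hypothetical, and it assumes the loader returns either a 1-D array for mono files or a channels-first `(channels, samples)` array. Wiring it into the training code would mean modifying the project's own loading code.

```python
import numpy as np


def mono_to_stereo(wave: np.ndarray) -> np.ndarray:
    """Return a (2, n_samples) array; mono input is duplicated into L and R.

    Hypothetical helper: assumes channels-first layout, i.e. shape
    (n_samples,) or (1, n_samples) for mono and (2, n_samples) for stereo.
    """
    wave = np.asarray(wave)
    if wave.ndim == 1:
        # Plain mono signal: stack it twice so that L == R.
        return np.stack([wave, wave], axis=0)
    if wave.ndim == 2 and wave.shape[0] == 1:
        # Mono stored with an explicit channel axis.
        return np.concatenate([wave, wave], axis=0)
    if wave.ndim == 2 and wave.shape[0] == 2:
        # Already stereo: return unchanged.
        return wave
    raise ValueError(f"unsupported audio shape: {wave.shape}")


# Usage: a short fake mono signal becomes a two-channel array.
mono = np.linspace(-1.0, 1.0, num=8, dtype=np.float32)
stereo = mono_to_stereo(mono)
print(stereo.shape)  # (2, 8)
```

This keeps the source files mono on disk. On its own it would not shrink any preprocessed cache that stores both channels, so the space saving applies only to the raw dataset.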