Support 32bit float output

tsurumeso / vocal-remover

Vocal Remover using Deep Neural Networks

MIT License


Support 32bit float output #34

Closed pburgmer closed 3 years ago

pburgmer commented 4 years ago

For further audio editing, it would be best not to reduce the bit depth. Is it possible to write the output as 32-bit float?
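To illustrate why bit depth matters for later editing, here is a minimal sketch comparing the round-trip error of a very quiet sample stored as 16-bit PCM versus 32-bit float. The sample value and the 32767 scaling factor are assumptions chosen for illustration (scaling conventions vary between tools); nothing here is taken from the repository.

```python
import struct


def to_pcm16_and_back(x: float) -> float:
    """Quantize a sample in [-1.0, 1.0] to 16-bit PCM and convert it back."""
    # Assumed convention: full scale maps to 32767.
    q = max(-32768, min(32767, round(x * 32767)))
    return q / 32767


def to_float32_and_back(x: float) -> float:
    """Round-trip a sample through IEEE 754 single precision."""
    return struct.unpack('<f', struct.pack('<f', x))[0]


# Hypothetical low-level sample, roughly -98 dBFS.
quiet_sample = 0.000012345

pcm16_error = abs(quiet_sample - to_pcm16_and_back(quiet_sample))
float32_error = abs(quiet_sample - to_float32_and_back(quiet_sample))

print(f"16-bit PCM error:   {pcm16_error:.3e}")
print(f"32-bit float error: {float32_error:.3e}")
```

In this sketch the quiet sample falls below half a 16-bit step and is rounded to silence, while the float round trip keeps it almost unchanged. That is the kind of detail that gets lost if the file is later amplified or processed further.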

aufr33 commented 4 years ago

sf.write('{}_Instruments.wav'.format(basename), wav.T, sr, subtype='FLOAT')

and

sf.write('{}_Vocals.wav'.format(basename), wav.T, sr, subtype='FLOAT')

pburgmer commented 4 years ago

Thanks. It would be great to have a command-line option for that, so the source code doesn't have to be edited. Still, it helps a lot to know how to do it.
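A command-line option along these lines could be sketched as follows. The `--subtype` flag and the `write_stem` helper are hypothetical and are not part of the repository's `inference.py`; the sketch only assumes that the output is written with `soundfile.write`, as in the snippet above, and that `PCM_16` is the usual default subtype for WAV files.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a hypothetical --subtype option."""
    parser = argparse.ArgumentParser(description="Sketch: choose the output bit depth")
    parser.add_argument('--input', required=True)
    # Hypothetical flag, not present in the original script.
    parser.add_argument('--subtype', default='PCM_16',
                        choices=['PCM_16', 'PCM_24', 'FLOAT'],
                        help='subtype passed to soundfile.write for WAV output')
    return parser


def write_stem(path, wav, sr, subtype):
    """Write a (channels, samples) array as a WAV file with the given subtype."""
    # Third-party dependency, imported lazily so that argument parsing
    # can be tried out without it installed.
    import soundfile as sf
    sf.write(path, wav.T, sr, subtype=subtype)


# Parsing an explicit argument list, as the shell would pass it.
args = build_parser().parse_args(['--input', 'SONG.wav', '--subtype', 'FLOAT'])
print(args.subtype)
```

With something like this in place, the two `sf.write` calls would receive `args.subtype` instead of a hard-coded value, and omitting the flag would keep 16-bit output.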

Daniel-Ventura81 commented 4 years ago

How about preserving the input sample rate too? I altered the code to output 96 kHz and fed it a hi-res 96 kHz source file, but the result was absolutely unusable. I'm just looking for a way to prevent downsampling and keep the source sample rate in the output, with 32-bit float.

aufr33 commented 4 years ago

@Daniel-Ventura81

python inference.py --input "SONG.wav" --gpu 0 -m models/baseline.pth --sr 96000
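As a rough way to see what changing `--sr` does from the model's point of view, the sketch below computes which STFT bin a tone lands in at two sample rates. It assumes the model works on spectrograms with a fixed FFT size and expects 44100 Hz audio; the `n_fft` value of 2048 and both helper functions are illustrative assumptions, not values read from the repository.

```python
def stft_bin(freq_hz: float, sr: int, n_fft: int = 2048) -> float:
    """Return the (fractional) STFT bin index where a tone of freq_hz lands."""
    return freq_hz * n_fft / sr


def apparent_freq(freq_hz: float, actual_sr: int, assumed_sr: int = 44100) -> float:
    """Frequency that occupies the same bin when the audio is at assumed_sr."""
    return freq_hz * assumed_sr / actual_sr


tone = 1000.0  # Hz, arbitrary test tone

bin_at_44k = stft_bin(tone, 44100)
bin_at_96k = stft_bin(tone, 96000)
seen_as = apparent_freq(tone, 96000)

print(f"bin at 44.1 kHz: {bin_at_44k:.2f}")
print(f"bin at 96 kHz:   {bin_at_96k:.2f}")
print(f"a 1000 Hz tone at 96 kHz occupies the bin of a {seen_as} Hz tone at 44.1 kHz")
```

Under these assumptions, every frequency in a 96 kHz file appears to the model scaled by 44100/96000, which is the same shift a time-stretched recording would show.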

Update: With a sample rate other than 44100 Hz, the audio is processed as if it had been stretched or compressed in time. This significantly degrades quality, so I recommend a different approach: run the separation at 44100 Hz, then mix the resulting acapella out of phase with the original 96 kHz file.
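One possible reading of the out-of-phase mix is sketched below: bring the 44.1 kHz acapella up to the original's sample rate, then subtract it from the original so that the vocals cancel. Both helpers and all sample values are hypothetical. The linear-interpolation resampler is only a stand-in; in practice a band-limited resampler (for example `librosa.resample` or `scipy.signal.resample_poly`) and sample-accurate alignment would be needed for good cancellation.

```python
def resample_linear(samples, src_sr, dst_sr):
    """Crude linear-interpolation resampler (stand-in for a proper one)."""
    n_out = int(len(samples) * dst_sr / src_sr)
    out = []
    for i in range(n_out):
        pos = i * src_sr / dst_sr          # position in the source signal
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def mix_out_of_phase(original, vocals):
    """Add the polarity-inverted vocals to the original: original - vocals."""
    n = min(len(original), len(vocals))
    return [original[i] - vocals[i] for i in range(n)]


# Hypothetical data: a short acapella from the 44.1 kHz run and the
# matching stretch of the original 96 kHz file.
vocals_44k = [0.0, 0.2, 0.4, 0.2]
original_96k = [0.1, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2]

vocals_96k = resample_linear(vocals_44k, 44100, 96000)
instrumental_96k = mix_out_of_phase(original_96k, vocals_96k)
print(len(vocals_96k), len(instrumental_96k))
```

The design idea is that the model only ever sees 44.1 kHz audio, while the instrumental keeps the full bandwidth of the 96 kHz source wherever the vocals did not occupy it.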

v-nhandt21 commented 1 year ago

@aufr33, just to make sure I understand: do you mean that we should do one of the following?

Input 44100 -> librosa load 44100 -> model -> out 44100 -> save at 96k

Input 44100 -> librosa load 96k -> model -> out 96k -> save at 96k