Migrate to librosa and librubberband for formant shifting. This would make the male and, most importantly, female voices much more accurate. It would also allow more vocal flexibility in the future: once we have the groundwork for a low-level streaming solution, we could easily implement phaser effects and vocoders with librosa rather than relying on sox.
I've played around, and it seems to be possible to use librubberband from Python: read data from a wave file (called "bad.wav"), apply a pitch shift effect with formant shifting, and write the result to "good.wav".
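import soundfile as sf
import pyrubberband as pyrb

# Load the samples and the sample rate from the input file
y, sr = sf.read('/home/char/bad.wav')
# Shift the pitch up by 2.4 semitones
shifted = pyrb.pitch_shift(y, sr, n_steps=2.4)
# Write the shifted audio out at the same sample rate
sf.write('good.wav', shifted, sr)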
The code above does this for a wave file. To make it real-time, however, we will need to do something like the following:
Read from the microphone input and stream it into a buffer (a numpy array?)
Process the buffer whenever new data is added to it, applying the pitch shift effect
Write the processed buffer out to a Pulse null output, as before with sox, only this time writing the numpy array instead of piping directly from sox
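The three steps above could be sketched roughly as follows. This is only a minimal outline under some assumptions, not a working voice changer: `stream_effect`, `make_pitch_shifter`, `read_block` and `write_block` are hypothetical names, and the microphone capture and Pulse output are left as callables to be supplied by the caller. Only `pyrb.pitch_shift` is the same call as in the wave-file example.

```python
import queue
import threading


def stream_effect(read_block, effect, write_block, max_queue=8):
    """Pump blocks from read_block through effect into write_block.

    read_block() must return the next block of samples, or None when the
    input has ended. All three callables are supplied by the caller.
    """
    pending = queue.Queue(maxsize=max_queue)

    def capture():
        # Step 1: keep reading input blocks into the buffer.
        while True:
            block = read_block()
            pending.put(block)
            if block is None:
                return

    threading.Thread(target=capture, daemon=True).start()

    while True:
        block = pending.get()
        if block is None:
            break
        # Step 2: process each block as soon as it arrives.
        # Step 3: hand the processed block to the output.
        write_block(effect(block))


def make_pitch_shifter(sr, n_steps):
    """Return an effect that pitch-shifts one numpy block (hypothetical helper)."""
    # Imported lazily so the loop above has no audio dependencies.
    import pyrubberband as pyrb

    return lambda block: pyrb.pitch_shift(block, sr, n_steps=n_steps)


if __name__ == "__main__":
    # Stand-in input and effect, just to show the data flow.
    blocks = iter([[1, 2], [3, 4], None])
    out = []
    stream_effect(lambda: next(blocks), lambda b: [x * 2 for x in b], out.append)
    print(out)
```

In a real setup, `read_block` would wrap the microphone stream and `write_block` would write to the Pulse null output, with `make_pitch_shifter(sr, 2.4)` as the effect. One caveat: as far as I can tell, pyrubberband runs the rubberband command-line tool on a temporary file for every call, so shifting each block independently is likely to add latency and audible seams at block boundaries. A proper low-level solution would probably need to talk to librubberband's streaming interface directly.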