Closed Picus303 closed 1 year ago
The audio data returned by scipy.io.wavfile
must be converted to float and normalized to [-1,+1].
Thanks, it worked. That is important information, and I couldn't find it in the README. Maybe you should add it. Here is the corrected code:
from stftpitchshift import StftPitchShift
from scipy.io import wavfile
import numpy as np
samplerate, data = wavfile.read('voice.wav')
data = data.astype(np.float16)
max_val = np.max(np.abs(data))
data = data/max_val
pitchshifter = StftPitchShift(1024, 256, samplerate)
new_data = pitchshifter.shiftpitch(data, 1.2)
new_data = (new_data*max_val).astype(np.int16)
wavfile.write("edited.wav", samplerate, new_data)
Your snippet looks incorrect. I would prefer this one instead:
from stftpitchshift import StftPitchShift
from scipy.io import wavfile
import numpy as np
samplerate, data = wavfile.read('voice.wav')
# convert original integer data type to a normalized float data type
# (unless it's already a normalized float)
dtype = data.dtype
scale = np.iinfo(dtype).max ** -1
data = data.astype(np.float32) # use at least float32
data = data * scale
pitchshifter = StftPitchShift(1024, 256, samplerate)
data = pitchshifter.shiftpitch(data, 1)
# convert result to the desired integer data type
# (or keep it as is)
dtype = np.int16
scale = np.iinfo(dtype).max
data = data.clip(-1, +1) # clip to [-1, +1] so the integer conversion cannot overflow
data = (data * scale).astype(dtype)
wavfile.write('edited.wav', samplerate, data)
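The comments in the snippet above mention two cases that the code itself does not handle: input that is already a normalized float, and keeping the result as a float. A minimal sketch of how both conversions could be wrapped in helpers is shown here. The function names `to_normalized_float` and `to_int_pcm` are hypothetical and not part of stftpitchshift or SciPy; the sketch also assumes that 8-bit WAV data arrives as unsigned `uint8` centered on 128, which is how `scipy.io.wavfile.read` is documented to return it.

```python
import numpy as np

def to_normalized_float(data):
    """Convert samples to float32 in [-1, +1] (hypothetical helper)."""
    if np.issubdtype(data.dtype, np.floating):
        # Assumed to be normalized already; only unify the precision.
        return data.astype(np.float32)
    if data.dtype == np.uint8:
        # 8-bit PCM is unsigned and centered on 128.
        return (data.astype(np.float32) - 128.0) / 128.0
    # Signed integer PCM: divide by the magnitude of the most negative value
    # so that the result can never exceed the [-1, +1] range.
    return data.astype(np.float32) / float(-np.iinfo(data.dtype).min)

def to_int_pcm(data, dtype=np.int16):
    """Convert normalized float samples to a signed integer type (hypothetical helper)."""
    info = np.iinfo(dtype)
    # Clip first so that out-of-range samples saturate instead of wrapping around.
    clipped = np.clip(data, -1.0, 1.0).astype(np.float64)
    return np.round(clipped * info.max).astype(dtype)
```

With these helpers, the snippet above would reduce to `data = to_normalized_float(data)` before the `shiftpitch` call and `data = to_int_pcm(data)` before `wavfile.write`. Note that the round trip is not bit-exact: a sample can differ from the original by one integer step because the two directions use slightly different scale factors.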
I tried the simplest implementation I could think of, but I still get a broken result. Is there something I should know that isn't in the README? Here is my code:
Based on what I understood, this code is not even supposed to modify the audio.