Closed jacobtang closed 1 week ago
No, just leave it as it is. The feed_audio method takes care of the buffer size.
You just need to ensure that the input chunks are raw 16-bit PCM data at a 16000 Hz sample rate:
    import numpy as np
    from scipy.signal import resample


    def decode_and_resample(audio_data, original_sample_rate, target_sample_rate):
        # Decode 16-bit PCM data to a numpy array
        audio_np = np.frombuffer(audio_data, dtype=np.int16)

        # Calculate the number of samples after resampling
        num_original_samples = len(audio_np)
        num_target_samples = int(num_original_samples * target_sample_rate /
                                 original_sample_rate)

        # Resample the audio (the result is a float array)
        resampled_audio = resample(audio_np, num_target_samples)

        # Clip to the int16 range so overshoot does not wrap around on conversion
        resampled_audio = np.clip(resampled_audio, -32768, 32767)
        return resampled_audio.astype(np.int16).tobytes()


    # chunk and sample_rate come from your audio source
    resampled_chunk = decode_and_resample(chunk, sample_rate, 16000)
    recorder.feed_audio(resampled_chunk)
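To illustrate what "takes care of the buffer size" can mean, here is a minimal sketch of the general technique: accumulate incoming bytes and cut them into fixed-size frames. This is not the recorder's actual implementation. `FrameBuffer` is a hypothetical name, and treating the 512 buffer size as a count of 16-bit samples is an assumption.

```python
class FrameBuffer:
    """Hypothetical helper: turns arbitrary-size chunks into fixed-size frames."""

    def __init__(self, frame_samples=512, sample_width=2):
        # Assumption: the frame size is counted in 16-bit (2-byte) samples
        self.frame_bytes = frame_samples * sample_width
        self._pending = bytearray()

    def feed(self, chunk):
        """Append a chunk and return every complete frame now available."""
        self._pending += chunk
        frames = []
        while len(self._pending) >= self.frame_bytes:
            frames.append(bytes(self._pending[:self.frame_bytes]))
            del self._pending[:self.frame_bytes]
        # Leftover bytes stay buffered until the next call
        return frames
```

With a scheme like this, chunks of 640 or 768 samples work without changing the frame size: the remainder of each chunk simply carries over into the next frame.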
Thanks! Can feed_audio accept 16 kHz stereo data? The raw data is 48 kHz stereo, so after I call decode_and_resample the audio is 16 kHz stereo. In my realtime server, I cannot get a result from recorder.text() when calling it in a loop from a thread; maybe the data I feed is not correct.
It should be mono, 16000 Hz, 16-bit PCM.
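Interleaved stereo passed straight into decode_and_resample would be treated as a single channel, so it needs to be downmixed to mono first. A minimal sketch, assuming 16-bit PCM with interleaved left/right samples; `stereo_to_mono` is a hypothetical helper, not part of the recorder's API:

```python
import numpy as np


def stereo_to_mono(audio_data):
    """Hypothetical helper: average interleaved L/R int16 samples into mono."""
    samples = np.frombuffer(audio_data, dtype=np.int16)
    if len(samples) % 2 != 0:
        raise ValueError("stereo data must contain an even number of samples")

    # Each row is one frame: [left, right]; widen to int32 before averaging
    frames = samples.reshape(-1, 2).astype(np.int32)
    mono = np.round(frames.mean(axis=1))
    return mono.astype(np.int16).tobytes()
```

The mono bytes can then be passed to decode_and_resample with the original rate (48000 in this case) and 16000 as the target before calling feed_audio.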
The audio_recorder sets BUFFER_SIZE = 512 and self.buffer_size = BUFFER_SIZE, and defines feed_audio(self, chunk). When I call feed_audio, the input chunk size from our realtime server is 640 or 768 (16 kHz, mono). Should I change the buffer_size (512) in audio_recorder? Thanks!