jiaaro / pydub

Manipulate audio with a simple and easy high level interface
http://pydub.com
MIT License

Raw audio sound occasionally lacks one number #520

Open NEGU93 opened 4 years ago

NEGU93 commented 4 years ago

I am using this library to open my UrbanSound8K WAV files. I want every clip to be exactly 4 seconds long, so I overlay each one on a 4-second silent segment. This is my code:

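import numpy as np
import pydub

def _load_audio(path, duration=4000):
    # Overlay the clip on `duration` ms of silence so every result has the same length.
    silence = pydub.AudioSegment.silent(duration=duration)
    audio = silence.overlay(pydub.AudioSegment.from_wav(path))
    # Resample to 22050 Hz and keep only the first channel.
    samples = audio.set_frame_rate(22050).split_to_mono()[0].get_array_of_samples()
    raw = np.array(samples).astype(np.float32)
    raw /= np.iinfo(samples.typecode).max  # scale by the integer type's maximum
    if len(raw) == 88199:
        # Workaround: pad the occasionally missing sample (kept as float32).
        print("I'm here")
        raw = np.append(raw, np.float32(0.0))
    return raw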

However, sometimes I get raw audio of length 88199 instead of 88200. This should not happen, since I create exactly 4 seconds of silence (4 s × 22050 Hz = 88200 samples). For now I work around it with that horrible if len(raw) == 88199: check, but something tells me this should not be necessary.

jiaaro commented 4 years ago

This is due to rounding error in the frame rate conversion. Try specifying the frame_rate when you create the silent segment.
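A minimal sketch of that suggestion, not a confirmed fix for every file: the silence is created directly at the target rate through the `frame_rate` argument of `AudioSegment.silent`, and the clip is resampled before the overlay, on the assumption that the overlay then needs no further rate conversion. The names `expected_num_samples` and `load_audio_fixed` are hypothetical, and the final trim/pad step is an extra safeguard that is not part of the reply above.

```python
def expected_num_samples(duration_ms, frame_rate):
    """Number of samples per channel for a clip of the given length (hypothetical helper)."""
    return duration_ms * frame_rate // 1000


def load_audio_fixed(path, duration_ms=4000, frame_rate=22050):
    """Load a WAV file as a float32 array of exactly duration_ms at frame_rate (sketch)."""
    # Imported here so the helper above can be used without these packages installed.
    import numpy as np
    import pydub

    # Resample the clip first, so both segments already share the target rate.
    clip = pydub.AudioSegment.from_wav(path).set_frame_rate(frame_rate)
    # Create the silence at the target rate instead of the library default.
    silence = pydub.AudioSegment.silent(duration=duration_ms, frame_rate=frame_rate)
    audio = silence.overlay(clip)

    # Keep only the first channel, as in the original snippet.
    samples = audio.split_to_mono()[0].get_array_of_samples()
    raw = np.array(samples).astype(np.float32)
    raw /= np.iinfo(samples.typecode).max

    # Safeguard (an assumption of this sketch): force the exact target length.
    target = expected_num_samples(duration_ms, frame_rate)
    raw = raw[:target]
    if len(raw) < target:
        raw = np.pad(raw, (0, target - len(raw)))
    return raw


# Usage, assuming a file named "example.wav" exists:
# raw = load_audio_fixed("example.wav")
```

With the silence already at 22050 Hz, the only resampling left is the one applied to the clip itself, and the overlay result takes its length from the silent segment.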