AudioArrayClip processing sample rate wrongly

arturstopa commented 6 months ago

When I pass a WAV file represented as a numpy.ndarray to moviepy.audio.AudioClip.AudioArrayClip, the sound comes out distorted and twice as long. When I save the same audio to a file using scipy and then load it with moviepy.audio.io.AudioFileClip.AudioFileClip, the audio is fine. Playing back the original audio is also fine.

In the following snippet, tts_out.wav and test-audio-indirect-i16.wav are 5-second clips without any distortion or artifacts. test-audio-direct-i16.wav is a 10-second, highly distorted clip. When passing fps = 2*TTS_OUTPUT_SAMPLERATE, the audio has the correct length and, while the words are recognizable, it is still highly distorted.

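import scipy.io.wavfile
from moviepy.audio.AudioClip import AudioArrayClip
from moviepy.audio.io.AudioFileClip import AudioFileClip

# tts, TEST_TEXT and TTS_OUTPUT_SAMPLERATE come from the text-to-speech setup (not shown)
wav = tts.synthesize(TEST_TEXT).reshape(-1,1)
scipy.io.wavfile.write("tts_out.wav", TTS_OUTPUT_SAMPLERATE, wav)

audio = AudioArrayClip(wav, fps = TTS_OUTPUT_SAMPLERATE)
audio.write_audiofile("test-audio-direct-i16.wav")

audio = AudioFileClip("tts_out.wav", fps = TTS_OUTPUT_SAMPLERATE)
audio.write_audiofile("test-audio-indirect-i16.wav")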
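
To compare the three files objectively, it may help to read their headers rather than rely on listening. The following is a minimal sketch using only the standard-library wave module; describe_wav is a hypothetical helper name, not part of MoviePy or scipy, and the wave module only handles integer PCM files, so it assumes the files were written as 16-bit PCM.

```python
import wave


def describe_wav(path):
    """Return basic header facts for an integer-PCM WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()   # samples per second, per channel
        frames = wf.getnframes()   # number of sample frames in the file
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate": rate,
            "frames": frames,
            # duration is simply frame count divided by sample rate
            "duration_s": frames / rate,
        }
```

Calling describe_wav on tts_out.wav, test-audio-direct-i16.wav and test-audio-indirect-i16.wav would show whether the 10-second file differs in sample rate, frame count, or channel count from the 5-second ones.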

Specifications

Python Version: 3.11.7
MoviePy Version: 1.0.3
Platform Name: Windows Server 2019
Platform Version: -

SohamTilekar commented 6 months ago

This is not a bug; your code is wrong. Close the issue. Problematic code: you provide TTS_OUTPUT_SAMPLERATE when creating the AudioArrayClip instead of TTS_INPUT_SAMPLERATE.

audio = AudioArrayClip(wav, fps = TTS_OUTPUT_SAMPLERATE)
audio.write_audiofile("test-audio-direct-i16.wav")

arturstopa commented 4 months ago

Where do I get TTS_INPUT_SAMPLERATE from? tts.synthesize() generates audio with a sample rate equal to TTS_OUTPUT_SAMPLERATE; there are no other sample rates that I'm aware of. Maybe I should have pointed out that TTS stands for Text To Speech. Also note that loading the same audio clip at the same sample rate with the AudioFileClip class works correctly.
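
One thing that might be worth ruling out, although nothing in this thread confirms it, is the sample format rather than the sample rate. The "i16" in the file names suggests the array is int16, and as far as I can tell MoviePy 1.x treats AudioArrayClip samples as floats in roughly [-1, 1] and clips them before quantizing on write. Under that assumption, a sketch of the conversion could look like this (int16_to_float is a hypothetical helper, not a MoviePy function):

```python
import numpy as np


def int16_to_float(samples: np.ndarray) -> np.ndarray:
    """Scale int16 PCM samples to float32 in the range [-1.0, 1.0)."""
    if samples.dtype != np.int16:
        raise TypeError(f"expected int16 samples, got {samples.dtype}")
    # int16 spans -32768..32767, so dividing by 32768 maps it into [-1, 1)
    return samples.astype(np.float32) / 32768.0


# Small mono example shaped (N, 1), like the reshaped array in the snippet
pcm = np.array([[0], [16384], [-32768], [32767]], dtype=np.int16)
as_float = int16_to_float(pcm)
```

Passing the converted array to AudioArrayClip with the same fps would show whether the distortion comes from the value range; it would not by itself explain the doubled duration.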

Zulko / moviepy

AudioArrayClip processing sample rate wrongly #2086
