rendchevi / nix-tts

🐀 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation
MIT License

Replace the IPython audio player with file writer in example #6

Open fquirin opened 2 years ago

fquirin commented 2 years ago

I just tested nix-tts on Raspberry Pi 4, pretty impressive :+1: Realtime factor is 0.5 btw (2x faster than realtime), but I had some trouble writing the audio buffer into a file because the example depends on IPython which is a) not in the requirements and b) probably meant for Jupyter or Huggingface (?) not some local test.

I tried to replace it with wave (because it is lightweight) like this:

import wave
...
wf = wave.open('test.wav', 'wb')
wf.setnchannels(1)
wf.setsampwidth(2)
wf.setframerate(22050)
wf.setnframes(len(xw[0,0]))
wf.writeframesraw(xw[0,0].tobytes())

It does work somehow, but there is obviously something wrong with the encoding, since I'm getting mostly noise from that code.
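A likely cause of the noise, sketched under the assumption that `xw` is a float32 NumPy array with values in [-1, 1) (the thread does not state the dtype explicitly): `tobytes()` emits 4 bytes per float32 sample, while `setsampwidth(2)` tells `wave` to treat every 2 bytes as one sample. The small array below is a stand-in for the model output, not real nix-tts data.

```python
import numpy as np

# Stand-in for the model output: assumed float32 samples in [-1, 1).
xw = np.array([0.0, 0.5, -0.5, 0.25], dtype=np.float32)

raw = xw.tobytes()
# float32 takes 4 bytes per sample, but the WAV header above promises 2.
bytes_per_sample = len(raw) // xw.size

# What a WAV reader sees: twice as many 16-bit values, unrelated to the signal.
misread = np.frombuffer(raw, dtype="<i2")

# Scaling and casting first yields exactly one int16 value per sample.
pcm = (xw * 2**15).astype(np.int16)

print(bytes_per_sample, misread.size, pcm.tolist())
```

The raw float bytes are reinterpreted rather than converted, so the file contains valid 16-bit samples that simply have nothing to do with the waveform.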

Since I couldn't solve the issue I gave up and used: scipy.io.wavfile.write("test.wav", 22050, xw[0,0]) in the end. It works but you have to install a bazillion more dependencies which takes forever on RPi4.

So can you recommend any working alternative to scipy (which is not librosa ^^)?

janzhen commented 1 year ago

I have tested the code below and it seems to be working.

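import wave
import numpy as np
...
with wave.open('output.wav', 'wb') as wav_file:
    wav_file.setnchannels(1)    # mono
    wav_file.setsampwidth(2)    # 2 bytes per sample = 16-bit PCM
    wav_file.setframerate(22050)
    # scale the float samples to the int16 range before writing the bytes
    wav_file.writeframes((2 ** 15 * xw).astype(np.int16).tobytes())

fquirin commented 1 year ago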

Works like a charm 🙂 👍

What does this do exactly (2 ** 15 * xw)?

janzhen commented 1 year ago

xw is a float array normalized to [-1, 1), and 2 ** 15 * xw scales it to the int16 range.
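The scaling described above can be wrapped in a small helper that needs only `wave` and NumPy. This is a minimal sketch, not code from the nix-tts repository: `float_to_pcm16` and `write_wav` are hypothetical names, and the clipping step is an added precaution in case a sample reaches exactly 1.0, which would otherwise overflow int16.

```python
import wave

import numpy as np


def float_to_pcm16(samples):
    """Convert float samples in [-1, 1] to 16-bit PCM (hypothetical helper)."""
    samples = np.asarray(samples, dtype=np.float32).reshape(-1)
    # 2**15 = 32768; clip so that +1.0 maps to 32767 instead of wrapping around.
    scaled = np.clip(samples * 2**15, -32768, 32767)
    return scaled.astype("<i2")  # little-endian int16, the layout WAV expects


def write_wav(path, samples, sample_rate=22050):
    """Write mono 16-bit PCM using only the standard-library wave module."""
    pcm = float_to_pcm16(samples)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())
```

With the variables from this thread, the call would be `write_wav("output.wav", xw)`; the 22050 Hz default matches the sample rate used in the snippets above.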

Florian Quirin @.***> wrote on Thursday, March 9, 2023 at 7:37 PM:

Works like a charm 🙂 👍

What does this do exactly (2 ** 15 * xw)?


-- Sincerely, Zhen Zhijian