2noise / ChatTTS

A generative speech model for daily dialogue.
https://2noise.com

support stream mode #360

Closed Ox0400 closed 2 weeks ago

Ox0400 commented 2 weeks ago

Usage:

import numpy as np
import torch
import torchaudio

# `chat` is an already-initialized ChatTTS.Chat instance
stream = True
wavs_gen = chat.infer('text here', stream=stream)

if stream:
    # accumulate streamed chunks into one waveform
    wavs = [np.array([[]])]
    for gen in wavs_gen:
        print('new chunk gen', gen)
        wavs[0] = np.hstack([wavs[0], np.array(gen[0])])
else:
    print('check result', wavs_gen)
    wavs = wavs_gen

torchaudio.save("output1.wav", torch.from_numpy(wavs[0]), 24000)
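A possible alternative to accumulating every chunk in memory is to write each chunk to the WAV file as it arrives, which matters for the long-sentence inference discussed below. This is a hypothetical sketch, not ChatTTS code: `write_stream` and `chunks` are illustrative names, and it assumes each chunk is a float32 array in [-1, 1] shaped `(1, n_samples)` like the generator output above.

```python
import wave
import numpy as np

def write_stream(path, chunks, sample_rate=24000):
    """Incrementally write streamed float chunks as 16-bit PCM WAV."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)           # 16-bit samples
        wf.setframerate(sample_rate)
        for chunk in chunks:
            # convert float32 in [-1, 1] to int16 PCM and append
            pcm = (np.clip(chunk[0], -1.0, 1.0) * 32767).astype(np.int16)
            wf.writeframes(pcm.tobytes())
```

The `wave` module patches up the RIFF size fields on close, so the total length does not need to be known in advance.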
Ox0400 commented 2 weeks ago

@fumiama Thanks for fixing my broken code. I'm trying to add the tests to examples/cmd.

fumiama commented 2 weeks ago

You're welcome. You can change the webui and cmd directly to the stream mode in order to support long-sentence infer.

Ox0400 commented 2 weeks ago

> You're welcome. You can change the webui and cmd directly to the stream mode in order to support long-sentence infer.

Okay. For CLI mode, see PR: https://github.com/2noise/ChatTTS/pull/366

Web mode is more complex: only the first chunk needs the WAV header bytes, while every later chunk is raw samples. Streaming plain PCM would be simpler.
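The point above can be sketched as follows. This is a hypothetical illustration, not code from the PR: when serving `audio/wav` over HTTP, only the first yielded chunk carries the 44-byte WAV header, and since the total length is unknown up front, the RIFF size fields are set to a maximal placeholder. The names `wav_header` and `stream_response` are assumptions.

```python
import struct
import numpy as np

def wav_header(sample_rate=24000, bits=16, channels=1):
    """Build a 44-byte WAV header with placeholder (unknown) data size."""
    byte_rate = sample_rate * channels * bits // 8
    block_align = channels * bits // 8
    data_size = 0xFFFFFFFF - 36  # stream length unknown; use max placeholder
    return (b"RIFF" + struct.pack("<I", data_size + 36) + b"WAVE"
            + b"fmt " + struct.pack("<IHHIIHH", 16, 1, channels,
                                    sample_rate, byte_rate, block_align, bits)
            + b"data" + struct.pack("<I", data_size))

def stream_response(chunks):
    """Yield bytes for a chunked audio/wav HTTP response."""
    yield wav_header()  # header goes out with the first chunk only
    for chunk in chunks:
        # later chunks are raw 16-bit PCM, no header
        pcm = (np.clip(chunk[0], -1.0, 1.0) * 32767).astype(np.int16)
        yield pcm.tobytes()
```

Raw PCM avoids the header special case entirely, at the cost of the client needing to know the sample rate and format out of band.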