2noise / ChatTTS

A generative speech model for daily dialogue.
https://2noise.com

How to save output to a wav or mp3? #41

Open SoftologyPro opened 1 month ago

SoftologyPro commented 1 month ago

For your example, how do I save the result to a wav or mp3 file?

import ChatTTS
from IPython.display import Audio

chat = ChatTTS.Chat()
chat.load_models()

texts = ["This is a test of the ChatTTS script.  Peter Piper picked a peck of pickled peppers.  Red leather.  Yellow leather.  Red leather.  Yellow leather.  Red leather.  Yellow leather.",]

wavs = chat.infer(texts, use_decoder=True)
Audio(wavs[0], rate=24_000, autoplay=True)

On Windows, I had to install the following package versions so the script ran without errors. This may help other Windows users.

python -m pip install --upgrade pip
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts wheel==0.38.4
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts omegaconf==2.3.0
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts tqdm==4.66.4
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts einops==0.8.0
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts vector_quantize_pytorch==1.14.24
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts transformers==4.41.1
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts vocos==0.1.0
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts soundfile==0.12.1
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts ipython==8.24.0
pip uninstall -y torch
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts torch==2.3.0+cu118 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip uninstall -y charset-normalizer
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts charset-normalizer==3.3.0

wa008 commented 1 month ago

Try the code in this PR: https://github.com/2noise/ChatTTS/pull/35/files

SoftologyPro commented 1 month ago

Thanks, that works.

import ChatTTS
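import scipy.io.wavfile  # import the submodule explicitly; a bare "import scipy" may not load it on older SciPy versions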
from IPython.display import Audio

chat = ChatTTS.Chat()
chat.load_models()

texts = ["This is a test of the ChatTTS script.  Peter Piper picked a peck of pickled peppers.  Red leather.  Yellow leather.  Red leather.  Yellow leather.  Red leather.  Yellow leather.",]

wavs = chat.infer(texts, use_decoder=True)
Audio(wavs[0], rate=24_000, autoplay=True)
scipy.io.wavfile.write(filename="output.wav", rate=24_000, data=wavs[0].T)
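
A note on the file format: when given a floating-point array, scipy.io.wavfile.write stores the samples as floating-point WAV, which some players and editors may not open. If that turns out to be a problem, one option is to convert to 16-bit PCM first. The following is a minimal sketch using only the standard library. The helper name save_pcm16_wav is hypothetical (not part of ChatTTS or SciPy), and it assumes mono samples in the range -1.0 to 1.0 at 24 kHz, as in the examples in this thread. A synthetic tone stands in for the model output so the sketch runs on its own.

```python
import math
import os
import struct
import tempfile
import wave


def save_pcm16_wav(path, samples, rate=24_000):
    """Write mono float samples (assumed to lie in [-1.0, 1.0]) as 16-bit PCM WAV.

    Hypothetical helper for illustration; not part of ChatTTS or SciPy.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip out-of-range values
        frames += struct.pack("<h", int(round(s * 32767)))  # little-endian int16
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wf.setframerate(rate)
        wf.writeframes(bytes(frames))


# Demo: one second of a 440 Hz tone in place of real model output.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 24_000) for n in range(24_000)]
demo_path = os.path.join(tempfile.gettempdir(), "chattts_demo.wav")
save_pcm16_wav(demo_path, tone)
```

With the variables from the script above, the call would look like save_pcm16_wav("output.wav", wavs[0].flatten()), assuming wavs[0] is a NumPy array holding a single channel.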

MadmanXML commented 1 month ago

This will work as well.

import ChatTTS
from IPython.display import Audio

chat = ChatTTS.Chat()
chat.load_models()

texts = ["This is a test of the ChatTTS script.  ",]

wavs = chat.infer(texts, use_decoder=True)
# IPython's Audio encodes the array as WAV; its .data attribute holds the encoded bytes
audio = Audio(wavs[0], rate=24_000, autoplay=True)
with open("speech.wav", "wb") as f:
    f.write(audio.data)
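
The replies so far cover WAV only. For the MP3 half of the question, one possible route is to write the WAV file first and then convert it with the ffmpeg command-line tool. The sketch below assumes ffmpeg is installed, is on PATH, and was built with the libmp3lame encoder. The helper names build_ffmpeg_cmd and wav_to_mp3 are hypothetical and are not part of ChatTTS.

```python
import shutil
import subprocess


def build_ffmpeg_cmd(wav_path, mp3_path, quality=2):
    """Return the ffmpeg argument list for a WAV to MP3 conversion."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", wav_path,            # input file
        "-codec:a", "libmp3lame",  # MP3 encoder (assumes the build includes it)
        "-q:a", str(quality),      # VBR quality level; lower means higher quality
        mp3_path,
    ]


def wav_to_mp3(wav_path, mp3_path, quality=2):
    """Convert wav_path to mp3_path; raise if ffmpeg is not on PATH."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_ffmpeg_cmd(wav_path, mp3_path, quality), check=True)


# Example (assumes output.wav was written by one of the scripts above):
# wav_to_mp3("output.wav", "output.mp3")
```

Keeping the command construction separate from the subprocess call makes it easy to print and inspect the exact command before running it.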