2noise / ChatTTS

A generative speech model for daily dialogue.
https://2noise.com

MPS inference result is incorrect (it's just noise), but CPU inference works fine. #444

Closed: sizeofbeer closed this issue 1 week ago

sizeofbeer commented 1 week ago

Model name: MacBook Pro; Chip: Apple M1 Pro; Total cores: 8 (6 performance and 2 efficiency); Memory: 16 GB

This problem appears after commit "8235a46711ef738387ec17604a7e73f674930719".

fumiama commented 1 week ago

Can you paste your code? I'm also using mps but cannot reproduce your problem.

sizeofbeer commented 1 week ago

I did not modify the code; I only ran examples/cmd/run.py.

fumiama commented 1 week ago

Can you paste it?

sizeofbeer commented 1 week ago

import os, sys

if sys.platform == "darwin":
    os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

now_dir = os.getcwd()
sys.path.append(now_dir)

import wave
import argparse

import ChatTTS

from tools.audio import unsafe_float_to_int16
from tools.logger import get_logger

logger = get_logger("Command")

def save_wav_file(wav, index):
    wav_filename = f"output_audio_{index}.wav"
    with wave.open(wav_filename, "wb") as wf:
        wf.setnchannels(1)  # Mono channel
        wf.setsampwidth(2)  # Sample width in bytes
        wf.setframerate(24000)  # Sample rate in Hz
        wf.writeframes(unsafe_float_to_int16(wav))
    logger.info(f"Audio saved to {wav_filename}")

def main(texts: list[str]):
    logger.info("Text input: %s", str(texts))

    chat = ChatTTS.Chat(get_logger("ChatTTS"))
    logger.info("Initializing ChatTTS...")
    if chat.load():
        logger.info("Models loaded successfully.")
    else:
        logger.error("Models load failed.")
        sys.exit(1)

    wavs = chat.infer(texts, use_decoder=True)
    logger.info("Inference completed. Audio generation successful.")
    # Save each generated wav file to a local file
    for index, wav in enumerate(wavs):
        save_wav_file(wav, index)

if __name__ == "__main__":
    logger.info("Starting the TTS application...")
    parser = argparse.ArgumentParser(
        description="ChatTTS Command", usage="--stream hello, my name is bob."
    )
    parser.add_argument(
        "text", help="Original text", default="YOUR TEXT HERE", nargs="*"
    )
    args = parser.parse_args()
    main(args.text)
    logger.info("TTS application finished.")

sizeofbeer commented 1 week ago

python ./examples/cmd/run.py "四川美食确实以辣闻名,但也有不辣的选择。比如甜水面、赖汤圆、蛋烘糕、叶儿粑等,这些小吃口味温和,甜而不腻,也很受欢迎。"

fumiama commented 1 week ago

Here's my output of the latest commit.

(Screenshot, 2024-06-25, 6:21:15 PM)

Here's the result.

output_audio_0.wav.zip

Seems all right.
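
A rough way to compare the CPU and MPS outputs without listening to them is to measure the zero-crossing rate of each file. This is only a sketch: it assumes the files are 16-bit mono WAVs as written by save_wav_file above, and it assumes the bad output resembles broadband noise, which the thread does not confirm. The helper names read_mono_int16 and zero_crossing_rate are made up for this example and are not part of ChatTTS.

```python
import struct
import wave


def read_mono_int16(path):
    """Read a 16-bit mono WAV file and return its samples as a list of ints."""
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError(f"{path}: expected 16-bit mono audio")
        frames = wf.readframes(wf.getnframes())
    # WAV PCM data is little-endian signed 16-bit.
    return list(struct.unpack(f"<{len(frames) // 2}h", frames))


def zero_crossing_rate(samples):
    """Return the fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return crossings / (len(samples) - 1)
```

Calling zero_crossing_rate(read_mono_int16("output_audio_0.wav")) on both a CPU run and an MPS run gives two numbers to compare. Random noise tends toward a rate near 0.5, while speech is usually much lower, so a large gap between the two runs would be consistent with the report. Treat this as a heuristic, not a definitive test.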

sizeofbeer commented 1 week ago

I fixed the issue on my PC by adding the following code:

        del_all(logits_warpers)
        del_all(logits_processors)

Code location: ChatTTS/core.py, line 571

fumiama commented 1 week ago

Thanks. I know the real issue now. It's about the lru_cache.
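
For readers unfamiliar with this class of bug, here is a minimal sketch of how functools.lru_cache can leak state between calls when the cached function returns a mutable object. The thread does not show which ChatTTS function was cached, so this is a generic illustration and not ChatTTS code; get_processors and run_inference are hypothetical names.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_processors(temperature):
    """Hypothetical factory that builds a list of stateful processors."""
    # The list is built once per distinct argument and then reused
    # from the cache on every later call with the same argument.
    return [{"temperature": temperature, "seen_tokens": []}]


def run_inference(temperature, tokens):
    """Hypothetical inference step that mutates the processor state."""
    processors = get_processors(temperature)
    processors[0]["seen_tokens"].extend(tokens)
    return len(processors[0]["seen_tokens"])


first = run_inference(0.3, [1, 2, 3])   # 3 tokens recorded
second = run_inference(0.3, [4, 5])     # 5: state from the first call leaked in

# Dropping the cached objects gives the next call a fresh state.
get_processors.cache_clear()
third = run_inference(0.3, [6])         # 1: fresh list after clearing
```

The second call sees tokens from the first because both calls receive the same cached list. Clearing the cache, or explicitly releasing the objects after each run, is one way to prevent stale state from carrying over. Whether that matches the actual ChatTTS fix is not stated in the thread.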

fumiama commented 1 week ago

You can try the latest commit.